Professional Documents
Culture Documents
1 Intro Up To RL - TD PDF
1 Intro Up To RL - TD PDF
Reinforcement Learning
Rich Sutton
Reinforcement Learning and Artificial Intelligence Laboratory
Department of Computing Science
University of Alberta, Canada
R&L
A I
Part 1: Why?
The coming of artificial intelligence
• When people finally come to understand the principles of
intelligence—what it is and how it works—well enough to
design and create beings as intelligent as ourselves
• It will change the way we work and play, our sense of self, life,
and death, the goals we set for ourselves and for our societies
Logarithmic+Plot Logarithmic+Plot
16
The computational
mega-trend
Moore’s law
• In computer Go
we thought human ideas were key, but it turned out (MCTS 2006–)
that big, sample-based search was key
4 dan
Monte-Carlo Search Zen
3 dan
Strength of sample-based
2 dan
Master 1 dan
search programs
Zen
Beginner 1 kyu
MoGo
2 kyu
3 kyu MoGo
4 kyu
CrazyStone
5 kyu Strength of conventional
6 kyu
search programs Indigo Traditional Search
7 kyu
Handtalk
8 kyu Indigo
9 kyu
10 kyu
Projecting the trend suggests a
11 kyu
computer will be world champion
12 kyu
by 2025
13 kyu
14 kyu
15 kyu
1996 1997 1998 1999 2000 2001 2002 2003 2004 2005 2006 2007 2008 2009 2010 2011 2014
Year
/ 45
Two diverging approaches to AI
• Strong automation (computation & data rule)
• Generality
• Approximation