Introduction

Lecture slides for Chapter 1 of Deep Learning


www.deeplearningbook.org
Ian Goodfellow
2016-09-26
Representations Matter
[Figure 1.1: Example of different representations: suppose we want to separate two categories of data by drawing a line between them. The left panel plots the data in Cartesian coordinates (x, y), where the task is impossible; the right panel plots the same data in polar coordinates (r, θ), where a vertical line solves it.]
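Figure 1.1's point can be checked in a few lines of code. The sketch below is my own illustration, assuming numpy and scikit-learn (the slides prescribe no library): the same linear classifier is fit to two concentric rings of points, once in Cartesian and once in polar coordinates.

```python
# Minimal sketch of Figure 1.1: the same linear model succeeds or fails
# depending on the representation. numpy/scikit-learn are assumptions here.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500
theta = rng.uniform(0, 2 * np.pi, 2 * n)
radius = np.concatenate([rng.normal(1.0, 0.1, n),   # inner ring: class 0
                         rng.normal(2.0, 0.1, n)])  # outer ring: class 1
labels = np.concatenate([np.zeros(n), np.ones(n)])

# Cartesian representation: no straight line separates the two rings.
cartesian = np.column_stack([radius * np.cos(theta), radius * np.sin(theta)])
# Polar representation: the radius coordinate alone separates them.
polar = np.column_stack([np.hypot(cartesian[:, 0], cartesian[:, 1]),
                         np.arctan2(cartesian[:, 1], cartesian[:, 0])])

print(LogisticRegression().fit(cartesian, labels).score(cartesian, labels))  # near 0.5 (chance)
print(LogisticRegression().fit(polar, labels).score(polar, labels))          # near 1.0
```

Changing the representation, not the model, is what makes the task easy.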
Depth: Repeated Composition
[Figure 1.2: Illustration of a deep learning model. It is difficult for a computer to understand the meaning of raw sensory input data, such as an image represented as a collection of pixel values. The model builds up the answer in stages: visible layer (input pixels) → 1st hidden layer (edges) → 2nd hidden layer (corners and contours) → 3rd hidden layer (object parts) → output (object identity: car, person, or animal).]
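The "repeated composition" in Figure 1.2 is literally function composition. The sketch below is illustrative only: the layer sizes are arbitrary assumptions and the weights are random, so the comments mirror the figure's labels without the layers actually learning edges or parts.

```python
# Depth as repeated composition: output = f4(f3(f2(f1(input)))).
# Illustrative sketch with random weights; the sizes are assumptions.
import numpy as np

rng = np.random.default_rng(0)

def layer(in_dim, out_dim):
    """An affine map followed by a ReLU: one simple stage of the composition."""
    W = rng.normal(0.0, in_dim ** -0.5, (out_dim, in_dim))
    return lambda h: np.maximum(0.0, W @ h)

pixels = rng.uniform(0.0, 1.0, 784)  # visible layer (input pixels)
h1 = layer(784, 256)(pixels)         # 1st hidden layer ("edges" in the figure)
h2 = layer(256, 128)(h1)             # 2nd hidden layer ("corners and contours")
h3 = layer(128, 64)(h2)              # 3rd hidden layer ("object parts")
scores = layer(64, 3)(h3)            # output (car / person / animal scores)
print(scores.shape)                  # (3,)
```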
Computational Graphs
[Figure 1.3: Illustration of computational graphs mapping an input to an output, where each node performs an operation; depth is the length of the longest path from input to output. Both graphs compute logistic regression, σ(w1·x1 + w2·x2): with {×, +, σ} as the element set the model has depth three, while with {logistic regression} as the element set it has depth one.]
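Figure 1.3's observation that depth depends on the chosen element set can be made concrete. In the sketch below (my own illustration; the variable names are assumptions), the same value σ(w1·x1 + w2·x2) is computed once as a depth-three graph of {×, +, σ} nodes and once as a depth-one graph whose single element is logistic regression.

```python
# The same computation, sigmoid(w1*x1 + w2*x2), at two graph granularities.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w1, w2, x1, x2 = 0.5, -1.0, 2.0, 3.0

# Element set {*, +, sigmoid}: the longest input-to-output path
# passes through three operation nodes, so the depth is three.
p1 = w1 * x1                 # multiplication node
p2 = w2 * x2                 # multiplication node
s = p1 + p2                  # addition node
depth_three = sigmoid(s)     # sigmoid node

# Element set {logistic regression}: a single node, so the depth is one.
def logistic_regression(w, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

depth_one = logistic_regression([w1, w2], [x1, x2])
assert math.isclose(depth_three, depth_one)
print(depth_three)
```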
Machine Learning and AI

[Figure 1.4: A Venn diagram showing how deep learning is a kind of representation learning, which is in turn a kind of machine learning, which is one approach to AI. Examples in each region: MLPs (deep learning), shallow autoencoders (representation learning), logistic regression (machine learning), knowledge bases (AI).]

Learning Multiple Components


[Figure 1.5: Flowcharts showing how the different parts of an AI system relate to each other within different AI disciplines. Rule-based systems: input → hand-designed program → output. Classic machine learning: input → hand-designed features → mapping from features → output. Representation learning: input → features → mapping from features → output. Deep learning: input → simple features → additional layers of more abstract features → mapping from features → output.]
Organization of the Book
[Figure 1.6: The high-level organization of the book. An arrow from one chapter to another indicates that the former is prerequisite material for the latter.]

1. Introduction

Part I: Applied Math and Machine Learning Basics
2. Linear Algebra
3. Probability and Information Theory
4. Numerical Computation
5. Machine Learning Basics

Part II: Deep Networks: Modern Practices
6. Deep Feedforward Networks
7. Regularization
8. Optimization
9. CNNs
10. RNNs
11. Practical Methodology
12. Applications

Part III: Deep Learning Research
13. Linear Factor Models
14. Autoencoders
15. Representation Learning
16. Structured Probabilistic Models
17. Monte Carlo Methods
18. Partition Function
19. Inference
20. Deep Generative Models
Historical Waves
[Figure 1.7: Two of the three historical waves of artificial neural networks research, measured by the frequency of the phrases "cybernetics" and "connectionism + neural networks" in print from 1940 to 2000 (y-axis: frequency of word or phrase).]
Historical Trends: Growing Datasets
[Figure 1.8: Dataset size (number of examples, log scale) versus year, from Iris and other early datasets through MNIST and CIFAR-10 up to ImageNet, Sports-1M, WMT, and the Canadian Hansard. Dataset sizes have increased greatly over time. In the early 1900s, statisticians studied datasets of hundreds or thousands of manually compiled measurements (Garson, 1900; Gosset, 1908; Anderson, 1935; Fisher, 1936). In the 1950s through 1980s, the pioneers of biologically inspired machine learning often worked with small, synthetic datasets.]

The MNIST Dataset

[Figure 1.9: Example inputs from the MNIST dataset. The "NIST" stands for National Institute of Standards and Technology, the agency that originally collected the data; the "M" stands for "modified".]
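To look at these inputs directly, one convenient loader is the one bundled with Keras; using it is an assumption of the sketch below, not something the slides prescribe.

```python
# Minimal sketch for inspecting MNIST, assuming TensorFlow/Keras is installed.
from tensorflow.keras.datasets import mnist

(x_train, y_train), (x_test, y_test) = mnist.load_data()
print(x_train.shape, x_test.shape)  # (60000, 28, 28) (10000, 28, 28)
print(y_train[:10])                 # the first ten digit labels
```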
Connections per Neuron
[Figure 1.10: Connections per neuron (log scale) versus year, with biological reference points: fruit fly near 10^2, mouse near 10^3, cat and human near 10^4. Initially, the number of connections between neurons in artificial neural networks was limited by hardware capabilities; today it is mostly a design consideration, and some artificial neural networks have nearly as many connections per neuron as a cat.]
Number of Neurons
[Figure 1.11: Number of neurons (logarithmic scale) versus year, from the Perceptron (Rosenblatt, 1958, 1962) onward, with biological reference points ranging from sponge and roundworm up through leech, ant, bee, frog, octopus, and human; the x-axis extends to a projected 2056. Since the introduction of hidden units, artificial neural networks have doubled in size roughly every 2.4 years. Biological neural network sizes are from Wikipedia (2015).]


Solving Object Recognition
[Figure 1.12: ILSVRC classification error rate by year, 2010-2015 (y-axis from 0.00 to 0.30). Since deep networks reached the scale necessary to compete in the ImageNet Large Scale Visual Recognition Challenge, they have consistently won the competition every year and yielded lower and lower error rates each time. Data from Russakovsky et al. (2014).]
