
Chapter 1

Introduction
Representations Matter

Figure 1.1: Example of different representations: suppose we want to separate two categories of data by drawing a line between them in a scatterplot. In the left panel the data are represented with Cartesian coordinates (x, y) and the task is impossible; in the right panel the same data are represented with polar coordinates (r, θ) and the task becomes simple to solve with a vertical line.
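A minimal sketch of the idea behind Figure 1.1 (hypothetical data, assuming NumPy is available): two classes that form concentric rings cannot be separated by a line in Cartesian coordinates, but after converting each point to polar coordinates a single threshold on the radius separates them.

import numpy as np

rng = np.random.default_rng(0)

# Two classes arranged as concentric rings (not linearly separable in x, y).
n = 200
theta = rng.uniform(0.0, 2.0 * np.pi, size=2 * n)
radius = np.concatenate([rng.normal(1.0, 0.1, n),   # class 0: inner ring
                         rng.normal(3.0, 0.1, n)])  # class 1: outer ring
x, y = radius * np.cos(theta), radius * np.sin(theta)
labels = np.concatenate([np.zeros(n), np.ones(n)])

# Change of representation: Cartesian (x, y) -> polar radius r.
r = np.sqrt(x ** 2 + y ** 2)

# In the polar representation, a single vertical line r = 2 separates the classes.
predictions = (r > 2.0).astype(float)
print("accuracy with a threshold on r:", (predictions == labels).mean())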
Depth: Repeated Composition
Figure 1.2: Illustration of a deep learning model. It is difficult for a computer to understand the meaning of raw sensory input data, such as an image represented as a collection of pixel values. The visible layer holds the input pixels; the 1st hidden layer extracts edges, the 2nd hidden layer extracts corners and contours, the 3rd hidden layer extracts object parts, and the output layer reports the object identity (car, person, or animal).
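A minimal sketch of the layered structure in Figure 1.2 (assuming NumPy; layer sizes and weights are made up and untrained, so the comments only indicate the role each layer plays in the figure): each hidden layer is a simple function of the previous layer's output, so the network builds progressively more abstract features out of the raw pixels.

import numpy as np

rng = np.random.default_rng(0)

def layer(inputs, n_out):
    # One fully connected layer with a ReLU nonlinearity (random, untrained weights).
    w = rng.normal(0.0, 0.1, size=(inputs.shape[-1], n_out))
    b = np.zeros(n_out)
    return np.maximum(0.0, inputs @ w + b)

pixels = rng.uniform(0.0, 1.0, size=784)            # visible layer: input pixels (e.g. a 28x28 image)
h1 = layer(pixels, 256)                              # 1st hidden layer (edges in Fig. 1.2)
h2 = layer(h1, 128)                                  # 2nd hidden layer (corners and contours)
h3 = layer(h2, 64)                                   # 3rd hidden layer (object parts)
logits = h3 @ rng.normal(0.0, 0.1, size=(64, 3))     # output: scores for car / person / animal
print("class scores:", logits)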
Computational Graphs

Figure 1.3: Illustration of computational graphs mapping an input to an output, where each node performs an operation. Depth is the length of the longest path from input to output, but depends on what counts as a single computational step. Both graphs compute the output of a logistic regression model, σ(w⊤x). If the elements are the multiplications w1·x1 and w2·x2, an addition, and the logistic sigmoid, the model has depth three; if logistic regression is treated as a single element, the model has depth one.
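A minimal sketch of the computation in Figure 1.3 (assuming NumPy; the weight and input values are made up): the same logistic regression output σ(w⊤x) can be computed as a depth-three graph of multiplications, an addition, and a sigmoid, or as a single "logistic regression" node of depth one.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([0.5, -1.2])   # weights w1, w2
x = np.array([2.0, 0.7])    # inputs x1, x2

# Depth-three view: multiply, add, then apply the logistic sigmoid.
p1 = w[0] * x[0]            # node: w1 * x1
p2 = w[1] * x[1]            # node: w2 * x2
s = p1 + p2                 # node: +
out_deep = sigmoid(s)       # node: logistic sigmoid

# Depth-one view: a single "logistic regression" node computing sigma(w^T x).
out_shallow = sigmoid(w @ x)

print(out_deep, out_shallow)   # identical values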
Machine Learning and AI

Figure 1.4: A Venn diagram showing how deep learning (example: MLPs) is a kind of representation learning (example: shallow autoencoders), which is in turn a kind of machine learning (example: logistic regression), which is one of many approaches to AI (example: knowledge bases).

Learning Multiple Components

Figure 1.5: Flowcharts showing how the different parts of an AI system relate to each other within different AI disciplines. Rule-based systems map the input to the output through a hand-designed program. Classic machine learning maps hand-designed features to the output. Representation learning learns the features themselves and then maps them to the output. Deep learning learns simple features, additional layers of more abstract features built on top of them, and the mapping from features to the output.
Organization of the Book

1. Introduction

Part I: Applied Math and Machine Learning Basics
2. Linear Algebra
3. Probability and Information Theory
4. Numerical Computation
5. Machine Learning Basics

Part II: Deep Networks: Modern Practices
6. Deep Feedforward Networks
7. Regularization
8. Optimization
9. CNNs
10. RNNs
11. Practical Methodology
12. Applications

Part III: Deep Learning Research
13. Linear Factor Models
14. Autoencoders
15. Representation Learning
16. Structured Probabilistic Models
17. Monte Carlo Methods
18. Partition Function
19. Inference
20. Deep Generative Models

Figure 1.6: The high-level organization of the book. An arrow from one chapter to another indicates that the former chapter is prerequisite material for understanding the latter.
Historical Waves
Figure 1.7: The figure shows two of the three historical waves of artificial neural networks research, as measured by the frequency of the phrases "cybernetics" and "connectionism + neural networks" over time (y-axis: frequency of word or phrase; x-axis: year, 1940-2000).
Historical Trends: Growing Datasets
Figure 1.8: Dataset sizes (number of examples, logarithmic scale) have increased greatly over time. In the early 1900s, statisticians studied datasets using hundreds or thousands of manually compiled measurements (Garson, 1900; Gosset, 1908; Anderson, 1935; Fisher, 1936). In the 1950s through 1980s, the pioneers of biologically inspired machine learning often worked with small, synthetic datasets. The datasets plotted range from Criminals, Iris, T vs. G vs. F, and Rotated T vs. C through MNIST, CIFAR-10, Public SVHN, ImageNet, ImageNet10k, and ILSVRC 2014 up to Sports-1M, WMT, and the Canadian Hansard.

The MNIST Dataset

Figure 1.9: Example inputs from the MNIST dataset. The "NIST" stands for National Institute of Standards and Technology, the agency that originally collected this data; the "M" stands for "modified," since the data have been preprocessed for easier use with machine learning algorithms.
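A minimal sketch of inspecting MNIST, assuming TensorFlow is installed (any MNIST loader would do); it simply confirms the shape and labels of the training images shown in Figure 1.9.

# Assumes TensorFlow is installed; tf.keras ships a copy of MNIST.
from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
print(train_images.shape)   # (60000, 28, 28): 60,000 grayscale 28x28 digit images
print(train_labels[:10])    # digit classes 0-9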
Connections per Neuron
Figure 1.10: Connections per neuron (logarithmic scale) over time, with biological reference levels for the fruit fly, mouse, cat, and human. Initially, the number of connections between neurons in artificial neural networks was limited by hardware capabilities. Today, the number of connections between neurons is mostly a design consideration; some artificial neural networks have nearly as many connections per neuron as a cat.
Number of Neurons
Figure 1.11: Number of neurons (logarithmic scale) over time, with biological reference levels ranging from the sponge and roundworm through the leech, ant, bee, frog, and octopus up to the human. Since the introduction of hidden units, artificial neural networks have doubled in size roughly every 2.4 years. Biological neural network sizes from Wikipedia (2015). Network 1 in the plot is the Perceptron (Rosenblatt, 1958, 1962).
Solving Object Recognition
Figure 1.12: ILSVRC classification error rate by year (2010-2015). Since deep networks reached the scale necessary to compete in the ImageNet Large Scale Visual Recognition Challenge, they have consistently won the competition every year and yielded lower and lower error rates each time. Data from Russakovsky et al.
