
Chapter 1

Introduction
Representations Matter

Figure 1.1: Example of different representations: suppose we want to separate two categories of data by drawing a line between them in a scatterplot. In the left panel the data are represented with Cartesian coordinates (x, y) and the task is impossible; in the right panel the same data are represented with polar coordinates (r, θ) and the task becomes simple to solve with a vertical line.
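A minimal sketch of the idea behind Figure 1.1 (hypothetical data, assuming NumPy is available): two classes that form concentric rings cannot be separated by a line in Cartesian coordinates, but after converting each point to polar coordinates a single threshold on the radius separates them.

import numpy as np

rng = np.random.default_rng(0)

# Two classes arranged as concentric rings (not linearly separable in x, y).
n = 200
theta = rng.uniform(0.0, 2.0 * np.pi, size=2 * n)
radius = np.concatenate([rng.normal(1.0, 0.1, n),   # class 0: inner ring
                         rng.normal(3.0, 0.1, n)])  # class 1: outer ring
x, y = radius * np.cos(theta), radius * np.sin(theta)
labels = np.concatenate([np.zeros(n), np.ones(n)])

# Change of representation: Cartesian (x, y) -> polar radius r.
r = np.sqrt(x ** 2 + y ** 2)

# In the polar representation, a single vertical line r = 2 separates the classes.
predictions = (r > 2.0).astype(float)
print("accuracy with a threshold on r:", (predictions == labels).mean())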
Depth: Repeated Composition
Figure 1.2: Illustration of a deep learning model. It is difficult for a computer to understand the meaning of raw sensory input data, such as an image represented as a collection of pixel values. The visible layer holds the input pixels; the 1st hidden layer extracts edges, the 2nd hidden layer extracts corners and contours, the 3rd hidden layer extracts object parts, and the output layer reports the object identity (car, person, or animal).
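A minimal sketch of the layered structure in Figure 1.2 (assuming NumPy; layer sizes and weights are made up and untrained, so the comments only indicate the role each layer plays in the figure): each hidden layer is a simple function of the previous layer's output, so the network builds progressively more abstract features out of the raw pixels.

import numpy as np

rng = np.random.default_rng(0)

def layer(inputs, n_out):
    # One fully connected layer with a ReLU nonlinearity (random, untrained weights).
    w = rng.normal(0.0, 0.1, size=(inputs.shape[-1], n_out))
    b = np.zeros(n_out)
    return np.maximum(0.0, inputs @ w + b)

pixels = rng.uniform(0.0, 1.0, size=784)            # visible layer: input pixels (e.g. a 28x28 image)
h1 = layer(pixels, 256)                              # 1st hidden layer (edges in Fig. 1.2)
h2 = layer(h1, 128)                                  # 2nd hidden layer (corners and contours)
h3 = layer(h2, 64)                                   # 3rd hidden layer (object parts)
logits = h3 @ rng.normal(0.0, 0.1, size=(64, 3))     # output: scores for car / person / animal
print("class scores:", logits)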
Computational Graphs

Figure 1.3: Illustration of computational graphs mapping an input to an output, where each node performs an operation. Depth is the length of the longest path from input to output, but depends on what counts as a single computational step. Both graphs compute the output of a logistic regression model, σ(w⊤x). If the elements are the multiplications w1·x1 and w2·x2, an addition, and the logistic sigmoid, the model has depth three; if logistic regression is treated as a single element, the model has depth one.
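A minimal sketch of the computation in Figure 1.3 (assuming NumPy; the weight and input values are made up): the same logistic regression output σ(w⊤x) can be computed as a depth-three graph of multiplications, an addition, and a sigmoid, or as a single "logistic regression" node of depth one.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.array([0.5, -1.2])   # weights w1, w2
x = np.array([2.0, 0.7])    # inputs x1, x2

# Depth-three view: multiply, add, then apply the logistic sigmoid.
p1 = w[0] * x[0]            # node: w1 * x1
p2 = w[1] * x[1]            # node: w2 * x2
s = p1 + p2                 # node: +
out_deep = sigmoid(s)       # node: logistic sigmoid

# Depth-one view: a single "logistic regression" node computing sigma(w^T x).
out_shallow = sigmoid(w @ x)

print(out_deep, out_shallow)   # identical values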
Machine Learning and AI

Figure 1.4: A Venn diagram showing how deep learning (example: MLPs) is a kind of representation learning (example: shallow autoencoders), which is in turn a kind of machine learning (example: logistic regression), which is one of many approaches to AI (example: knowledge bases).

Learning Multiple Components

Figure 1.5: Flowcharts showing how the different parts of an AI system relate to each other within different AI disciplines. Rule-based systems map the input to the output through a hand-designed program. Classic machine learning maps hand-designed features to the output. Representation learning learns the features themselves and then maps them to the output. Deep learning learns simple features, additional layers of more abstract features built on top of them, and the mapping from features to the output.
Organization of the Book

1. Introduction

Part I: Applied Math and Machine Learning Basics
2. Linear Algebra
3. Probability and Information Theory
4. Numerical Computation
5. Machine Learning Basics

Part II: Deep Networks: Modern Practices
6. Deep Feedforward Networks
7. Regularization
8. Optimization
9. CNNs
10. RNNs
11. Practical Methodology
12. Applications

Part III: Deep Learning Research
13. Linear Factor Models
14. Autoencoders
15. Representation Learning
16. Structured Probabilistic Models
17. Monte Carlo Methods
18. Partition Function
19. Inference
20. Deep Generative Models

Figure 1.6: The high-level organization of the book. An arrow from one chapter to another indicates that the former chapter is prerequisite material for understanding the latter.
Historical Waves
Figure 1.7: The figure shows two of the three historical waves of artificial neural networks research, as measured by the frequency of the phrases "cybernetics" and "connectionism + neural networks" over time (y-axis: frequency of word or phrase; x-axis: year, 1940-2000).
Historical Trends: Growing Datasets
Figure 1.8: Dataset sizes (number of examples, logarithmic scale) have increased greatly over time. In the early 1900s, statisticians studied datasets using hundreds or thousands of manually compiled measurements (Garson, 1900; Gosset, 1908; Anderson, 1935; Fisher, 1936). In the 1950s through 1980s, the pioneers of biologically inspired machine learning often worked with small, synthetic datasets. The datasets plotted range from Criminals, Iris, T vs. G vs. F, and Rotated T vs. C through MNIST, CIFAR-10, Public SVHN, ImageNet, ImageNet10k, and ILSVRC 2014 up to Sports-1M, WMT, and the Canadian Hansard.

The MNIST Dataset

Figure 1.9: Example inputs from the MNIST dataset. The "NIST" stands for National Institute of Standards and Technology, the agency that originally collected this data; the "M" stands for "modified," since the data have been preprocessed for easier use with machine learning algorithms.
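A minimal sketch of inspecting MNIST, assuming TensorFlow is installed (any MNIST loader would do); it simply confirms the shape and labels of the training images shown in Figure 1.9.

# Assumes TensorFlow is installed; tf.keras ships a copy of MNIST.
from tensorflow.keras.datasets import mnist

(train_images, train_labels), (test_images, test_labels) = mnist.load_data()
print(train_images.shape)   # (60000, 28, 28): 60,000 grayscale 28x28 digit images
print(train_labels[:10])    # digit classes 0-9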
Connections per Neuron
Figure 1.10: Connections per neuron (logarithmic scale) over time, with biological reference levels for the fruit fly, mouse, cat, and human. Initially, the number of connections between neurons in artificial neural networks was limited by hardware capabilities. Today, the number of connections between neurons is mostly a design consideration; some artificial neural networks have nearly as many connections per neuron as a cat.
Number of Neurons
Figure 1.11: Number of neurons (logarithmic scale) over time, with biological reference levels ranging from the sponge and roundworm through the leech, ant, bee, frog, and octopus up to the human. Since the introduction of hidden units, artificial neural networks have doubled in size roughly every 2.4 years. Biological neural network sizes from Wikipedia (2015). Network 1 in the plot is the Perceptron (Rosenblatt, 1958, 1962).
Solving Object Recognition
Figure 1.12: ILSVRC classification error rate by year (2010-2015). Since deep networks reached the scale necessary to compete in the ImageNet Large Scale Visual Recognition Challenge, they have consistently won the competition every year and yielded lower and lower error rates each time. Data from Russakovsky et al.
