
What data scientists should know about deep learning

Andrew Ng
Applications of Deep Learning

Images Speech Behavior

Computer vision:
Find coffee mug

Early, poor computer vision results.

Neurons in the brain
Deep Learning: Neural network

Output

Computer vision

What is a neural network?

Data (image): x1 -> x2 -> x3 -> x4 -> x5 -> Yes/No (Mug or not?)
Weights: W1, W2, W3, W4

$x_1 \in \mathbb{R}^5,\; x_2 \in \mathbb{R}^5$
$x_2 = (W_1 \times x_1)_+$
$x_3 = (W_2 \times x_2)_+$
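The layer equations above can be sketched in a few lines of plain Python. This is a minimal illustration, not the network from the talk: the helper names (`relu`, `matvec`, `forward`), the toy weights, and the rule that a final score above zero is read as "Yes" are all assumptions made for the example.

```python
def relu(values):
    """Positive part of each entry, written (.)+ on the slide."""
    return [max(0.0, v) for v in values]


def matvec(weights, x):
    """Multiply a weight matrix (a list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def forward(x1, layers):
    """Apply x_{k+1} = (W_k x_k)+ for every layer except the last,
    then read the last layer's single output as a Yes/No score.

    Assumption: a score above zero is read as "Yes"; the slide does not
    say how the final answer is produced.
    """
    x = x1
    for weights in layers[:-1]:
        x = relu(matvec(weights, x))
    score = matvec(layers[-1], x)[0]
    return score > 0, score


# Toy example with made-up weights (hypothetical, for illustration only).
W1 = [[1.0, 0.0], [0.0, 1.0]]  # hidden layer: 2 inputs -> 2 units
W2 = [[1.0, 1.0]]              # output layer: 2 units -> 1 score
is_mug, score = forward([1.0, -2.0], [W1, W2])
print(is_mug, score)  # prints: True 1.0
```

The negative input is zeroed by the positive-part step, so only the first hidden unit contributes to the final score.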
Supervised learning (learning from labeled data)

X (Image) -> Y (Yes/No: Is it a coffee mug?)

Data: images labeled Yes or No

Why is Deep Learning taking off?

Engine: Large neural networks

Fuel: Labeled data (x, y pairs)
It’s (almost) all about scale.

Rocket engines: Deep Learning driven by scale

Year    Hardware             Connections
2007    CPU                  1 million
2008    GPU                  10 million
2011    Cloud (many CPUs)    1 billion
2015    HPC (many GPUs)      100 billion

Can a computer understand these pictures?

A yellow bus driving down a road with green trees and green grass in the background.

Living room with white couch and blue carpeting. The room in the apartment gets some afternoon sun.

Supervised learning (learning from labeled data)

X (Image) -> Y (Caption)

Caption: A yellow bus driving down a road with green trees and green grass in the background.

Learning to Caption

Data (image) -> Caption: A yellow bus driving down….

Learning to answer questions

X (Image, Question) -> Y (Answer)

Question: What is the color of the bus? (公共汽车是什么颜色的?)
Answer: The bus is red. (公共汽车是红色的)

Learning to Answer Questions

Data (image, question): What is the color of the bus?
Answer: The bus is red ....

Applications of Deep Learning

Images Speech Behavior

Speech recognition

Data (audio) -> Audio features -> Phonemes -> Language model -> Transcript

Baidu Deep Speech: The rocket engine

Data (audio) -> Transcript

Baidu Deep Speech: The rocket fuel (data)

Hours of data by dataset:

Dataset        Hours of data
WSJ            80
Switchboard    300
Fisher         2000
Deep Speech    >100,000 (includes synthesized data)
Speech recognition performance

Error

Most people don’t understand the difference
between 95% accuracy and 99% accuracy.

99% is game changing.

With 99% accuracy, we could redesign your cellphone using a speech interface.
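One way to see the gap between 95% and 99% is to count expected mistakes. The sketch below assumes accuracy is measured per item (for example, per word), which the slide does not specify; `expected_errors` is a hypothetical helper written for this example.

```python
def expected_errors(accuracy_percent, n_items):
    """Expected number of mistakes among n_items at a given accuracy."""
    return n_items * (100 - accuracy_percent) / 100


errors_95 = expected_errors(95, 100)
errors_99 = expected_errors(99, 100)
print(errors_95, errors_99, errors_95 / errors_99)  # prints: 5.0 1.0 5.0
```

Under that assumption, moving from 95% to 99% cuts the number of mistakes by a factor of five, not by four percent.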

Speech will transform the Internet of Things

Car interfaces
Home appliances (e.g., TV, microwave, music player, ….)
Wearables

Applications of Deep Learning

Images Speech Behavior

Deep Learning and big data

Web search/Advertising
Datacenter management
Computer security

It’s (almost) all about scale.

Why deep learning

[Chart: Performance vs. Amount of data, with curves for Deep learning and Older learning algorithms]

How do data science techniques scale with amount of data?


The problem of scale: Mobile devices

Image -> 72 keypoints -> Face segmentation

                 Binary size                Speed
Desktop model    153 MB                     1.25 fps
Mobile model     800 KB (190x reduction)    25 fps (20x speedup)
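The reduction and speedup figures can be checked with a short calculation. This sketch assumes decimal units (1 MB = 1000 KB), which the slide does not state; the variable names are chosen for this example.

```python
KB_PER_MB = 1000  # assumption: decimal units; using 1024 gives about 196x

desktop_size_kb = 153 * KB_PER_MB
mobile_size_kb = 800
desktop_fps = 1.25
mobile_fps = 25

size_reduction = desktop_size_kb / mobile_size_kb
speedup = mobile_fps / desktop_fps
print(round(size_reduction, 2), speedup)  # prints: 191.25 20.0
```

Both results agree with the slide's rounded figures of roughly 190x smaller and 20x faster.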

Faceyou (available on the Apple App Store)

The future of Deep Learning

Images Speech Behavior


The AI rocket

We have superpowers

AI Data Science