
Get Compact Representation using Deep Networks

Method and Application


Zhengbo Li
Shanghai Jiao Tong University
lzb940214@sjtu.edu.cn

November 19, 2015


Overview

Motivation
Method
Performance
Application
Future work


Motivation: why do we need compact representations?

A useful compact representation of the original data requires less
computational and storage resources.
It is also interesting to see what these compact representations are
(essentially the same as asking what the hidden gates learn).


Dataset

A low-resolution version of MNIST is used.
The 28 by 28 pictures are converted to 14 by 14 pictures, so each input is a
196-dimensional vector.
The reduction is due to limited computational resources and time.
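
A minimal sketch of one way to produce such inputs, assuming 2 by 2 average
pooling (the slides do not say which downsampling method was used):

```python
import numpy as np

def downsample(images):
    # images: array of shape (n, 28, 28) holding the original MNIST pictures.
    # Average each 2x2 block to get 14x14 pictures, then flatten each picture
    # into a 196-dimensional input vector.
    n = images.shape[0]
    blocks = images.reshape(n, 14, 2, 14, 2)
    small = blocks.mean(axis=(2, 4))
    return small.reshape(n, 196)
```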


Method: Autoencoder

Dilemma:
Shallow autoencoders (a single or a few hidden layers):
Advantage: easy to find a good local minimum
Disadvantage: not complex enough to learn good representations
Deep autoencoders (more hidden layers):
Advantage: complex enough that good representations are possible
Disadvantage: very likely to get stuck in poor local minima


Method: Combine the Advantages


Example: obtain a 4-dimensional representation of the 196-dimensional
handwritten digits, i.e., use 4 real numbers to represent a picture.
Step 1: Use the 196-dimensional original input to train a 100-dimensional
representation.
Step 2: Use the 100-dimensional representation to train a 50-dimensional
representation (see the sketch below).

Figure 1: Step 1 (left), Step 2 (right)
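
A minimal sketch of what one such step could look like, assuming a plain
sigmoid autoencoder trained with batch gradient descent on squared error
(the slides do not state the activations or optimizer actually used):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.1, epochs=50):
    # Train a single-hidden-layer autoencoder on the rows of X with plain
    # batch gradient descent; hyperparameters here are illustrative only.
    n, d = X.shape
    rng = np.random.default_rng(0)
    W1 = rng.normal(0, 0.01, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.01, (n_hidden, d)); b2 = np.zeros(d)
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)           # encode: d -> n_hidden
        Y = sigmoid(H @ W2 + b2)           # decode: n_hidden -> d
        dY = (Y - X) * Y * (1 - Y)         # squared-error gradient at the output
        dH = (dY @ W2.T) * H * (1 - H)     # backpropagate to the hidden layer
        W2 -= lr * H.T @ dY / n; b2 -= lr * dY.mean(0)
        W1 -= lr * X.T @ dH / n; b1 -= lr * dH.mean(0)
    return W1, b1, W2, b2

# Step 1: 196 -> 100 -> 196 on the raw inputs
# W1a, b1a, W2a, b2a = train_autoencoder(X196, 100)
# Step 2: 100 -> 50 -> 100 on the resulting codes
# H100 = sigmoid(X196 @ W1a + b1a)
# W1b, b1b, W2b, b2b = train_autoencoder(H100, 50)
```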



Method: Combine the Advantages, cont.


Step 3: Combine these two networks. Use the red and blue weights obtained
above as the initial weights and continue training; this gives a
50-dimensional representation of the original 196-dimensional input.

Figure 2: Step 3
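
A sketch of how the combined network might be fine-tuned, reusing sigmoid and
train_autoencoder from the previous sketch. The layer ordering
[196, 100, 50, 100, 196] matches Step 3; the training details remain assumptions:

```python
def finetune(X, Ws, bs, lr=0.1, epochs=50):
    # Continue training a stacked autoencoder whose layers are initialised
    # from the shallow autoencoders trained above.
    n = X.shape[0]
    for _ in range(epochs):
        acts = [X]
        for W, b in zip(Ws, bs):               # forward pass through all layers
            acts.append(sigmoid(acts[-1] @ W + b))
        delta = (acts[-1] - X) * acts[-1] * (1 - acts[-1])
        for i in reversed(range(len(Ws))):     # backpropagate and update
            grad_W = acts[i].T @ delta / n
            grad_b = delta.mean(0)
            if i > 0:
                delta = (delta @ Ws[i].T) * acts[i] * (1 - acts[i])
            Ws[i] -= lr * grad_W
            bs[i] -= lr * grad_b
    return Ws, bs

# Step 3: stack the weights from Steps 1 and 2 and keep training
# Ws = [W1a, W1b, W2b, W2a]
# bs = [b1a, b1b, b2b, b2a]
# Ws, bs = finetune(X196, Ws, bs)
```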


Method: Combine the Advantages, cont.


Step 4: Use the 50-dimensional representation to train a 20-dimensional
representation.
Step 5: Combine the networks. Use the red, blue and green weights obtained
above as the initial weights and continue training; this gives a
20-dimensional representation of the original 196-dimensional input.

Figure 3: Step 4 (left), Step 5 (right)



Method: Combine the Advantages, cont.

Keep inserting hidden layers in the middle to get ever more compact
representations.
Final network structure: [196, 100, 50, 20, 10, 4, 10, 20, 50, 100, 196]
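
Putting the steps together, a sketch of the whole insert-and-fine-tune loop
that grows the network toward [196, 100, 50, 20, 10, 4, 10, 20, 50, 100, 196].
The schedule of layer sizes is taken from the slide; the helper functions and
everything else are the assumptions sketched earlier:

```python
def build_deep_autoencoder(X, sizes=(100, 50, 20, 10, 4)):
    enc, dec = [], []                 # (W, b) pairs: encoder half, decoder half
    codes = X
    for h in sizes:
        # train a shallow autoencoder on the current codes (Steps 1, 2, 4, ...)
        W1, b1, W2, b2 = train_autoencoder(codes, h)
        enc.append((W1, b1))
        dec.insert(0, (W2, b2))       # the decoder half mirrors the encoder
        # combine everything trained so far and fine-tune (Steps 3, 5, ...)
        Ws = [W for W, _ in enc + dec]
        bs = [b for _, b in enc + dec]
        Ws, bs = finetune(X, Ws, bs)
        pairs = list(zip(Ws, bs))
        enc, dec = pairs[:len(enc)], pairs[len(enc):]
        # recompute the current codes with the updated encoder half
        codes = X
        for W, b in enc:
            codes = sigmoid(codes @ W + b)
    return enc, dec
```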


Performance
To evaluate performance, cost = the sum of squared differences between input
and output, averaged over all inputs.

Method                                                                  Cost
Top 4 principal components (SVD)                                     12.1284
Single hidden layer autoencoder with 4 hidden gates                   6.1094
Autoencoder with the same architecture, all layers trained together  10.0036
Our method                                                            2.2951

Table 1: Cost comparison for different methods
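
A sketch of how the cost in Table 1 could be computed, together with the
top-4 principal components baseline; enc and dec are the encoder and decoder
halves returned by the earlier sketches, and mean-centring for the SVD is an
assumption the slides do not confirm:

```python
def reconstruction_cost(X, enc, dec):
    # Run each input through the whole network and average, over all inputs,
    # the sum of squared differences between input and reconstruction.
    Y = X
    for W, b in enc + dec:
        Y = sigmoid(Y @ W + b)
    return np.mean(np.sum((Y - X) ** 2, axis=1))

def svd_cost(X, k=4):
    # Baseline: reconstruct each input from its top-k principal components.
    mu = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mu, full_matrices=False)
    Xk = (X - mu) @ Vt[:k].T @ Vt[:k] + mu
    return np.mean(np.sum((Xk - X) ** 2, axis=1))
```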


Application: Generating samples


Dimension reduction has many applications; they are omitted here.
Pick a random 4-dimensional vector and decode it through the trained network.
With high probability the output corresponds to a handwritten digit.

Figure 4: Generated numbers
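
A sketch of the generation step, using only the decoder half from the sketches
above. The distribution the random 4-dimensional vector is drawn from is an
assumption; the slides only say that a random vector is picked:

```python
def generate_digit(dec, rng=None):
    # Draw a random 4-dimensional code and run it through the decoder half,
    # producing a 14x14 picture. Uniform(0, 1) is assumed because the codes
    # lie in (0, 1) when the encoder uses sigmoid units, as in these sketches.
    rng = rng or np.random.default_rng()
    y = rng.uniform(0.0, 1.0, size=4)
    for W, b in dec:
        y = sigmoid(y @ W + b)
    return y.reshape(14, 14)
```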



Future work

Try other datasets.
See what these 4 hidden gates learn (why the 4-dimensional representation
achieves such a low cost).
Understand why deep networks so easily get stuck in poor local minima.


Thank you for listening.

