CNN For Computer Vision

EE639: Computer Vision
Dr. S Murala, EED, IIT Ropar
Computer Vision with Deep Learning
By
Dr. Subrahmanyam Murala
Associate Professor
CVPR Lab
Department of Electrical Engg.
IIT Ropar
Rupnagar, Punjab
Source: Fei-Fei Li, CS231n: Convolutional Neural Networks for Visual Recognition, Lectures.
Goodfellow et al., Deep Learning, MIT Press
Classification
q K- Nearest Neighbor
q Neural Network
q Deep Learning (CNN)
K- Nearest Neighbor
q Most basic instance-based method
q Data represented in a vector space
q Supervised learning
INSTANCE-BASED METHOD
q Approximating real valued or discrete-valued target functions

q Learning in this algorithm consists of storing the presented training data
q When a new query instance is encountered, a set of similar related
instances is retrieved from memory and used to classify the new query
instance
K- Nearest Neighbor
q In k-NN classification, the output is a class membership.
q An object is classified by a majority vote of its neighbors, with the object being
assigned to the class most common among its k nearest neighbors (k is a
positive integer, typically small).
q If k = 1, then the object is simply assigned to the class of that single nearest
neighbor.
1-NN 5-NN
Neural Network
Perceptron: A perceptron takes several binary inputs, x1, x2,… and produces a
single binary output.
Neural Network
00 1
01 1
10 1
11 0
NAND
Neural Network
Sigmoid neurons
q Network need to learn weights and biases so that the output from the network
correctly classifies the data.
q To see how learning might work, suppose we make a small change in some
weight (or bias) in the network.
q What we'd like is for this small change in weight to cause only a small
corresponding change in the output from the network.
q As we'll see in a moment, this property will make learning possible.
Neural Network
Sigmoid neurons
q When our network contains perceptrons, a small change in the weights or
bias of any single perceptron in the network can sometimes cause the
output of that perceptron to completely flip, say from 0 to 1.
q We can overcome this problem by introducing a new type of artificial neuron

called a sigmoid neuron.
Neural Network
Tanh function
Tanh activation function take value between [-1, 1]. It’s defined
as follows.
e x - e- x
T ( x) = tanh( x) = x - x
e +e
Its derivative is
tanh ' ( x) = 1 - tanh 2 ( x)
• It’s zero centred function.
• It’s saturated function so rescaling of data is required.
• Because of exponential terms computationally very
expensive.
Neural Network
Neural Network
Neural Network
Learning with gradient descent
Neural Network
Learning with gradient descent
Neural Network
Learning with stochastic gradient descent (SGD)
Neural Network
The backpropagation algorithm
1. Input : Set the corresponding activation for the input layer.

2. Feedforward: For each compute and .
3. Output error : Compute the vector
4. Backpropagate the error: For each compute
5. Output: The gradient of the cost function is given by

Neural Network
Training
Neural Network
The backpropagation algorithm
Neural Network
Learning
Deep Learning: Convolutional Neural Networks

Drawbacks of Neural Networks
q The number of trainable parameters becomes extremely large.

q Little or no invariance to shifting, scaling, and other forms of distortion


Shift left





q the topology of the input data is completely ignored

q work with raw data.
Feature 1
Feature 2
Source: Fei-Fei Li & Justin Johnson & Serena Yeung


q Feature extraction and training
CNN
ReLU: Rectified Linear Unit
Range: [ 0 to 1] Range: [ 0 to infinity)

Leaky ReLU: Leaky Rectified Linear Unit
ReLU Leaky ReLU
Range: [ - infinity to infinity)

Deep Learning Terminology

• Epoch: one forward pass and one backward pass of all training examples.
• (Mini) Batch size: Number of training examples in one forward/backward pass. The
higher the batch size is, the more memory space you’ll need.
• Number of iterations: Number of passes, each pass using [batch size] number of
examples. To be clear, one pass = one forward pass + one backward pass
• Tensor: Tensor is an n-dimensional array. Tensors can keep track of a computational
graph and gradients, but they’re also useful as a generic tool for scientific computing.
• Learning rate: It determines how fast we want to update the weights during
optimization, if learning rate is too small, gradient descent can be slow to find the
minimum and if it’s too large gradient descent may not converge(it can overshoot the
minima).
• Data Normalization
• Dropout: Dropout is a regularization technique for reducing overfitting in neural
networks. At each training step we randomly drop out (set to zero) set of nodes, thus
we create a different model for each training case, all of these models share weights.
It’s a form of model averaging.








AlexNet Source: Link

AlexNet Source: Link

VGG-16
VGG-16
ResNet 50, 101, 152
Source: Link
GoogleNet/InceptionNet
MobileNet
Depthwise Separable Convolution

Auto-Encoders
1. Supervised Learning
2. Unsupervised Learning
3. Semi-Supervised Learning
4. Classification
5. Regression
Auto-Encoders
Source: Sudeshna
• Traditionally, autoencoders were used for dimensionality

reduction or feature learning.
• Recently, theoretical connections between autoencoders and
latent variable models have brought autoencoders to the
forefront of generative modeling
• Autoencoders may be thought of as being a special case of
feedforward networks, and may be trained with all of the same
techniques, typically minibatchgradient descent following
gradients computed by back-propagation
UNet for Image Segmentation
Source: Olaf Ronneberger et al.

Source: Binglin, Shashank & Bhargav

GANs
qGenerative: Learn a generative model
qAdversarial: Trained in an adversarial setting
qNetworks: Use Deep Neural Networks


CNN For Computer Vision

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

CNN For Computer Vision

Uploaded by

Copyright:

Available Formats

EE639: Computer Vision

Dr. S Murala, EED, IIT Ropar

Computer Vision with Deep Learning

q Approximating real valued or discrete-valued target functions

q We can overcome this problem by introducing a new type of artificial neuron

1. Input : Set the corresponding activation for the input layer.

5. Output: The gradient of the cost function is given by

Deep Learning: Convolutional Neural Networks

Deep Learning: Convolutional Neural Networks

q Little or no invariance to shifting, scaling, and other forms of distortion

Deep Learning: Convolutional Neural Networks

q Little or no invariance to shifting, scaling, and other forms of distortion

Deep Learning: Convolutional Neural Networks

q Little or no invariance to shifting, scaling, and other forms of distortion

Deep Learning: Convolutional Neural Networks

q Little or no invariance to shifting, scaling, and other forms of distortion

Deep Learning: Convolutional Neural Networks

q the topology of the input data is completely ignored

Source: Fei-Fei Li & Justin Johnson & Serena Yeung

Deep Learning: Convolutional Neural Networks

q Feature extraction and training

ReLU: Rectified Linear Unit

Range: [ 0 to 1] Range: [ 0 to infinity)

Leaky ReLU: Leaky Rectified Linear Unit

ReLU Leaky ReLU

Range: [ - infinity to infinity)

Deep Learning Terminology

Deep Learning: Convolutional Neural Networks

Deep Learning: Convolutional Neural Networks

Deep Learning: Convolutional Neural Networks

Deep Learning: Convolutional Neural Networks

Deep Learning: Convolutional Neural Networks

Deep Learning: Convolutional Neural Networks

Deep Learning: Convolutional Neural Networks

Deep Learning: Convolutional Neural Networks

AlexNet Source: Link

AlexNet Source: Link

ResNet 50, 101, 152

Depthwise Separable Convolution

• Traditionally, autoencoders were used for dimensionality

UNet for Image Segmentation

Source: Olaf Ronneberger et al.

Source: Binglin, Shashank & Bhargav

qGenerative: Learn a generative model

qAdversarial: Trained in an adversarial setting

qNetworks: Use Deep Neural Networks

You might also like