EE639: Computer Vision

Dr. S Murala, EED, IIT Ropar

Computer Vision with Deep Learning

By
Dr. Subrahmanyam Murala
Associate Professor
CVPR Lab
Department of Electrical Engg.
IIT Ropar
Rupnagar, Punjab

Source: Fei-Fei Li, CS231n: Convolutional Neural Networks for Visual Recognition, Lectures.
Goodfellow et al., Deep Learning, MIT Press

Classification

• K-Nearest Neighbor
• Neural Network
• Deep Learning (CNN)

K-Nearest Neighbor
• The most basic instance-based method
• Data are represented as points in a vector space
• Supervised learning

INSTANCE-BASED METHOD

• Approximates real-valued or discrete-valued target functions
• Learning in this algorithm consists of storing the presented training data
• When a new query instance is encountered, a set of similar
instances is retrieved from memory and used to classify the new query
instance

K-Nearest Neighbor
• In k-NN classification, the output is a class membership.
• An object is classified by a majority vote of its neighbors, with the object being
assigned to the class most common among its k nearest neighbors (k is a
positive integer, typically small).
• If k = 1, then the object is simply assigned to the class of that single nearest
neighbor.

[Figure: 1-NN and 5-NN classification examples]
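
The majority-vote rule above can be sketched in a few lines. This is a minimal illustration, not the course's reference implementation: the function name `knn_classify`, the Euclidean distance, and the toy points and labels are all assumptions chosen so that 1-NN and 5-NN give different answers for the same query.

```python
from collections import Counter
import math

def knn_classify(query, points, labels, k=3):
    """Return the majority label among the k training points nearest to query."""
    # "Learning" is just storing the data; all the work happens at query time.
    order = sorted(range(len(points)), key=lambda i: math.dist(query, points[i]))
    # Majority vote among the k nearest neighbours (Euclidean distance assumed).
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D training set (made-up values for illustration).
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.3), (3.0, 3.0), (3.2, 2.9), (2.0, 2.0)]
labels = ["A", "A", "A", "B", "B", "B"]

query = (1.9, 1.9)
print("1-NN:", knn_classify(query, points, labels, k=1))
print("5-NN:", knn_classify(query, points, labels, k=5))
```

With k = 1 the single closest point decides the class; with k = 5 the vote is taken over a wider neighbourhood, so the answer can change.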

Neural Network
Perceptron: A perceptron takes several binary inputs, x1, x2,… and produces a
single binary output.
NAND truth table

x1  x2  NAND
0   0   1
0   1   1
1   0   1
1   1   0
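
A perceptron can reproduce the NAND table. The sketch below assumes the usual threshold rule (output 1 when the weighted sum plus bias is positive) and one common choice of parameters, weights (-2, -2) and bias 3; these values are an assumption for illustration and are not taken from the slide.

```python
def perceptron(inputs, weights, bias):
    """Binary-output perceptron: 1 if w . x + b > 0, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def nand(x1, x2):
    # Assumed parameters: weights (-2, -2), bias 3.
    return perceptron((x1, x2), weights=(-2, -2), bias=3)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, nand(x1, x2))
```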

Neural Network
Sigmoid neurons

• The network needs to learn weights and biases so that its output
correctly classifies the data.
• To see how learning might work, suppose we make a small change in some
weight (or bias) in the network.
• What we'd like is for this small change in weight to cause only a small
corresponding change in the output from the network.
• This property is what makes learning possible.
• When our network contains perceptrons, a small change in the weights or
bias of any single perceptron in the network can sometimes cause the
output of that perceptron to completely flip, say from 0 to 1.
• We can overcome this problem by introducing a new type of artificial neuron
called a sigmoid neuron.
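
The difference can be checked numerically. The sketch below assumes the standard logistic sigmoid, sigma(z) = 1 / (1 + exp(-z)), and a single neuron with one input; the input, bias, and the two weight values are made up for illustration.

```python
import math

def step(z):
    """Perceptron output: hard threshold at zero."""
    return 1 if z > 0 else 0

def sigmoid(z):
    """Sigmoid neuron output (standard logistic function, assumed here)."""
    return 1.0 / (1.0 + math.exp(-z))

x, b = 1.0, 0.0                   # illustrative input and bias
w_before, w_after = -0.01, 0.01   # a small change in one weight

z_before = w_before * x + b
z_after = w_after * x + b

perceptron_change = abs(step(z_after) - step(z_before))
sigmoid_change = abs(sigmoid(z_after) - sigmoid(z_before))

print("perceptron output change:", perceptron_change)
print("sigmoid output change:   ", sigmoid_change)
```

The perceptron output jumps by a full unit, while the sigmoid output moves by well under 0.01 for the same weight change.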

Neural Network
Tanh function
The tanh activation function takes values in the range (-1, 1). It is defined
as follows.

$$T(x) = \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}$$

Its derivative is

$$\tanh'(x) = 1 - \tanh^{2}(x)$$

• It is a zero-centred function.
• It saturates for large |x|, so rescaling of the data is required.
• Because of the exponential terms, it is computationally
expensive.
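
As a quick check of the definition and its derivative, the sketch below evaluates tanh from the exponential form, compares the analytic derivative with a central finite difference, and shows the saturation at a large input. The helper names and test points are illustrative assumptions.

```python
import math

def tanh(x):
    """tanh from its exponential definition."""
    return (math.exp(x) - math.exp(-x)) / (math.exp(x) + math.exp(-x))

def tanh_prime(x):
    """Analytic derivative: 1 - tanh^2(x)."""
    return 1.0 - tanh(x) ** 2

def numeric_derivative(f, x, h=1e-6):
    """Central finite-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

print(tanh(0.5), math.tanh(0.5))                        # definition vs. library
print(tanh_prime(0.5), numeric_derivative(tanh, 0.5))   # analytic vs. numeric
print(tanh_prime(5.0))                                  # near zero: saturation
```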
Neural Network
Learning with gradient descent

Neural Network
Learning with stochastic gradient descent (SGD)

Neural Network
The backpropagation algorithm

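1. Input $x$: Set the corresponding activation $a^{1}$ for the input layer.
2. Feedforward: For each $l = 2, 3, \ldots, L$ compute $z^{l} = w^{l} a^{l-1} + b^{l}$ and $a^{l} = \sigma(z^{l})$.
3. Output error $\delta^{L}$: Compute the vector $\delta^{L} = \nabla_{a} C \odot \sigma'(z^{L})$.
4. Backpropagate the error: For each $l = L-1, L-2, \ldots, 2$ compute $\delta^{l} = ((w^{l+1})^{T} \delta^{l+1}) \odot \sigma'(z^{l})$.
5. Output: The gradient of the cost function is given by $\partial C / \partial w^{l}_{jk} = a^{l-1}_{k} \delta^{l}_{j}$ and $\partial C / \partial b^{l}_{j} = \delta^{l}_{j}$.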
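
The five steps can be sketched in plain Python for a single training example. Several things here are assumptions rather than details from the slides: sigmoid activations, the quadratic cost C = 0.5 * ||a^L - y||^2, the layer sizes [2, 3, 1], and the random initial parameters. A finite-difference check on one weight is included to confirm that the computed gradient is plausible.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def matvec(w, a):
    return [sum(wjk * ak for wjk, ak in zip(row, a)) for row in w]

def backprop(weights, biases, x, y):
    """Gradients of the (assumed) quadratic cost for one example (x, y)."""
    # Steps 1-2: set the input activation, then feed forward, storing z and a.
    activations, zs = [x], []
    for w, b in zip(weights, biases):
        z = [s + bj for s, bj in zip(matvec(w, activations[-1]), b)]
        zs.append(z)
        activations.append([sigmoid(zj) for zj in z])
    # Step 3: output error, (a - y) * sigma'(z) for the quadratic cost.
    delta = [(a - t) * sigmoid_prime(z)
             for a, t, z in zip(activations[-1], y, zs[-1])]
    grad_w = [None] * len(weights)
    grad_b = [None] * len(biases)
    for l in range(len(weights) - 1, -1, -1):
        # Step 5: dC/db_j = delta_j and dC/dw_jk = a_k (previous layer) * delta_j.
        grad_b[l] = delta
        grad_w[l] = [[dj * ak for ak in activations[l]] for dj in delta]
        if l > 0:
            # Step 4: push the error back through the transposed weights.
            w = weights[l]
            delta = [sum(w[j][k] * delta[j] for j in range(len(delta)))
                     * sigmoid_prime(zs[l - 1][k]) for k in range(len(w[0]))]
    return grad_w, grad_b

def cost(weights, biases, x, y):
    a = x
    for w, b in zip(weights, biases):
        a = [sigmoid(s + bj) for s, bj in zip(matvec(w, a), b)]
    return 0.5 * sum((aj - t) ** 2 for aj, t in zip(a, y))

random.seed(0)
sizes = [2, 3, 1]  # illustrative layer sizes
weights = [[[random.gauss(0, 1) for _ in range(n_in)] for _ in range(n_out)]
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [[random.gauss(0, 1) for _ in range(n_out)] for n_out in sizes[1:]]
x, y = [0.5, -0.2], [1.0]

grad_w, grad_b = backprop(weights, biases, x, y)

# Finite-difference check on a single weight.
eps = 1e-6
weights[0][1][0] += eps
c_plus = cost(weights, biases, x, y)
weights[0][1][0] -= 2 * eps
c_minus = cost(weights, biases, x, y)
weights[0][1][0] += eps
numeric = (c_plus - c_minus) / (2 * eps)
print(grad_w[0][1][0], numeric)
```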


Neural Network
Training
Neural Network
Learning

Deep Learning: Convolutional Neural Networks


Drawbacks of Neural Networks
• The number of trainable parameters becomes extremely large.
• Little or no invariance to shifting, scaling, and other forms of distortion


[Figure: example input shifted left]

• The topology of the input data is completely ignored
• Works with raw data

[Figure: panels labelled Feature 1 and Feature 2]

Source: Fei-Fei Li & Justin Johnson & Serena Yeung



• Feature extraction and training

CNN
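
To put a number on the first drawback listed above (parameter count), the sketch below compares one fully connected layer with one convolutional layer. The input size (224 x 224 x 3), the 1000 hidden units, and the 64 filters of size 3 x 3 are illustrative assumptions, not figures from the slides.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def conv_params(kernel_size, channels_in, channels_out):
    """Weights plus biases of a conv layer with square kernels (weights shared across positions)."""
    return kernel_size * kernel_size * channels_in * channels_out + channels_out

# Assumed example: a 224 x 224 RGB image fed to 1000 hidden units,
# versus 64 filters of size 3 x 3 applied to the same image.
fc = dense_params(224 * 224 * 3, 1000)
conv = conv_params(3, 3, 64)

print("fully connected:", fc)
print("convolutional:  ", conv)
```

Because the convolution kernel is shared across all spatial positions, its parameter count does not grow with the image size.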

ReLU: Rectified Linear Unit

Range: [0, 1]        Range: [0, ∞) (ReLU)



Leaky ReLU: Leaky Rectified Linear Unit

ReLU Leaky ReLU

Range: (-∞, ∞)
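
A minimal sketch of the two activations and their ranges. The negative-side slope `alpha = 0.01` is a commonly used default and an assumption here; the slides do not state a value.

```python
def relu(x):
    """ReLU: max(0, x), so the output range is [0, infinity)."""
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: a small slope alpha (assumed 0.01) for negative inputs."""
    return x if x > 0 else alpha * x

for x in (-2.0, 0.0, 3.0):
    print(x, relu(x), leaky_relu(x))
```

Unlike ReLU, Leaky ReLU passes a scaled copy of negative inputs, so its gradient is never exactly zero on the negative side.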



Deep Learning Terminology


• Epoch: one forward pass and one backward pass over all training examples.
• (Mini-)batch size: the number of training examples in one forward/backward pass. The
larger the batch size, the more memory is needed.
• Number of iterations: the number of passes, each pass using [batch size]
examples. To be clear, one pass = one forward pass + one backward pass.
• Tensor: an n-dimensional array. Tensors can keep track of a computational
graph and gradients, but they are also useful as a generic tool for scientific computing.
• Learning rate: determines how fast the weights are updated during
optimization. If the learning rate is too small, gradient descent can be slow to find the
minimum; if it is too large, gradient descent may not converge (it can overshoot the
minimum).
• Data Normalization
• Dropout: a regularization technique for reducing overfitting in neural
networks. At each training step a random set of nodes is dropped out (set to zero), so
a different model is effectively created for each training case, and all of these models
share weights. It is a form of model averaging.
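
The relationship between epochs, batch size, iterations, and learning rate can be seen in a small mini-batch gradient descent loop. This is a sketch under stated assumptions: the model is a one-variable linear fit, the data are synthetic and noise-free (y = 2x + 1), and the hyperparameter values are chosen only for illustration.

```python
import math
import random

random.seed(0)
# Synthetic, noise-free data: y = 2x + 1 (illustrative values).
data = [(i / 100, 2.0 * (i / 100) + 1.0) for i in range(100)]

batch_size = 20        # examples per forward/backward pass
epochs = 200           # full passes over the training set
learning_rate = 0.5    # step size of each weight update
iterations_per_epoch = math.ceil(len(data) / batch_size)

w, b = 0.0, 0.0
iterations = 0
for epoch in range(epochs):
    random.shuffle(data)
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # Gradient of the mean squared error (with a factor 1/2) over the batch.
        grad_w = sum((w * x + b - y) * x for x, y in batch) / len(batch)
        grad_b = sum((w * x + b - y) for x, y in batch) / len(batch)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        iterations += 1  # one iteration = one forward + one backward pass

print("iterations per epoch:", iterations_per_epoch)
print("total iterations:", iterations)
print("w, b:", w, b)
```

Here 100 examples with a batch size of 20 give 5 iterations per epoch, so 200 epochs amount to 1000 weight updates.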

Deep Learning: Convolutional Neural Networks



AlexNet Source: Link



VGG-16

ResNet 50, 101, 152

Source: Link

GoogleNet/InceptionNet

MobileNet

Depthwise Separable Convolution



Auto-Encoders

1. Supervised Learning
2. Unsupervised Learning
3. Semi-Supervised Learning
4. Classification
5. Regression

Source: Sudeshna

• Traditionally, autoencoders were used for dimensionality
reduction or feature learning.
• Recently, theoretical connections between autoencoders and
latent variable models have brought autoencoders to the
forefront of generative modeling.
• Autoencoders may be thought of as a special case of
feedforward networks, and may be trained with all of the same
techniques, typically minibatch gradient descent following
gradients computed by back-propagation.
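
As a minimal illustration of the last point, the sketch below trains a tiny linear autoencoder (2-D input, 1-D code, 2-D reconstruction) with minibatch gradient descent and hand-written back-propagated gradients. The linear layers without biases, the toy data on the line x2 = 2 * x1, the initial weights, and the hyperparameters are all assumptions made for illustration.

```python
import random

random.seed(0)
# Toy data: 2-D points lying on the line x2 = 2 * x1 (made-up values).
data = [(t / 20.0, 2.0 * t / 20.0) for t in range(-20, 21)]

w_enc = [0.1, 0.1]  # encoder: 2-D input -> 1-D code
w_dec = [0.1, 0.1]  # decoder: 1-D code -> 2-D reconstruction

def reconstruct(x):
    code = w_enc[0] * x[0] + w_enc[1] * x[1]
    return code, [w_dec[0] * code, w_dec[1] * code]

def mean_loss():
    """Mean of 0.5 * squared reconstruction error over the data set."""
    total = 0.0
    for x in data:
        _, x_hat = reconstruct(x)
        total += 0.5 * sum((xh - xi) ** 2 for xh, xi in zip(x_hat, x))
    return total / len(data)

initial_loss = mean_loss()
learning_rate, batch_size = 0.05, 8
for epoch in range(200):
    random.shuffle(data)
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        g_enc, g_dec = [0.0, 0.0], [0.0, 0.0]
        for x in batch:
            code, x_hat = reconstruct(x)
            err = [xh - xi for xh, xi in zip(x_hat, x)]      # dLoss/dx_hat
            d_code = err[0] * w_dec[0] + err[1] * w_dec[1]   # back-propagated to the code
            for i in range(2):
                g_dec[i] += err[i] * code / len(batch)
                g_enc[i] += d_code * x[i] / len(batch)
        for i in range(2):
            w_dec[i] -= learning_rate * g_dec[i]
            w_enc[i] -= learning_rate * g_enc[i]
final_loss = mean_loss()

print("reconstruction loss before:", initial_loss)
print("reconstruction loss after: ", final_loss)
```

The target of the network is its own input, so no labels are needed; the 1-D code acts as the reduced representation.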

UNet for Image Segmentation

Source: Olaf Ronneberger et al.



Source: Binglin, Shashank & Bhargav



GANs

• Generative: Learn a generative model

• Adversarial: Trained in an adversarial setting

• Networks: Use Deep Neural Networks
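
For reference, the adversarial setting is usually written as a two-player minimax game between a generator $G$ and a discriminator $D$. The objective below is the standard formulation from Goodfellow et al. (2014); that the slides follow exactly this formulation is an assumption, and the symbols $p_{\text{data}}$ (data distribution) and $p_{z}$ (noise prior) are introduced here for illustration.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator is trained to tell real samples from generated ones, while the generator is trained to make that distinction as hard as possible.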

