• A computer system learns from historical data, which represents past
experience in an application domain.
• The goal is to learn a target function that can be used to predict the response
variable (Regression/Classification).
● Ability to mimic humans and take over certain monotonous tasks that require
some intelligence.
– like recognizing handwritten characters
Supervised vs Unsupervised Learning
Supervised Learning: Discover patterns in the data that relate the data attributes
to a target (label) attribute. The data is labeled.
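As a minimal sketch of what learning from labeled data looks like, the snippet below uses a 1-nearest-neighbour rule in plain Python: a new item receives the label of the closest labeled training sample. The toy measurements, the labels and the function name `predict_1nn` are illustrative assumptions, not part of the original slides.

```python
# Hypothetical labeled training data: (height_cm, weight_kg) -> label
training_data = [
    ((25.0, 4.0), "cat"),
    ((23.0, 3.5), "cat"),
    ((60.0, 30.0), "dog"),
    ((55.0, 25.0), "dog"),
]

def squared_distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict_1nn(sample, data):
    """Return the label of the training sample closest to `sample`."""
    _, nearest_label = min(data, key=lambda item: squared_distance(item[0], sample))
    return nearest_label

print(predict_1nn((24.0, 4.2), training_data))
print(predict_1nn((58.0, 28.0), training_data))
```

The labels are what make this supervised: without the "cat"/"dog" column the same points could only be grouped, not named.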
Computer vision is closely linked with artificial intelligence, as the computer must
interpret what it sees, and then perform appropriate analysis or act accordingly.
Tools & Technologies
Computer Vision
Tools:
Technologies
● Java
● Python
● C++
Most computer vision and machine learning developers prefer Python
because it is easy to learn and has a rich set of libraries.
Machine Learning & Deep Learning
Top Libraries:
● Keras
● Tensorflow
● Theano
● Scikit Learn
● Pandas
● OpenCV
● Numpy
Brain Differences
Now wait: before you start thinking that you can just create a huge neural network
and call it strong AI, there are a few points to remember:
Just a list:
● An artificial neuron fires in a totally different way from a neuron in the brain
● A human brain has 100 billion neurons and 100 trillion connections
(synapses) and operates on 20 watts (enough to run a dim light bulb). In
comparison, the biggest neural network has 10 million neurons and 1 billion
connections on 16,000 CPUs (about 3 million watts)
● The brain is limited to 5 types of input data from the 5 senses.
● Children do not learn what a cow is by reviewing 100,000 pictures labelled
“cow” and “not cow”, but this is how machine learning works.
● We probably don't learn by calculating the partial derivative of each neuron
with respect to our initial concept. (By the way, we don't know how we learn.)
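To make the brain versus network comparison concrete, the short calculation below takes the figures quoted in the list at face value and derives connections per neuron and watts per connection. It is only arithmetic on those quoted numbers; the variable names are illustrative.

```python
# Figures as quoted in the list above (taken at face value)
brain_neurons, brain_synapses, brain_watts = 100e9, 100e12, 20.0
ann_neurons, ann_connections, ann_watts = 10e6, 1e9, 3e6

# Average number of connections per neuron
brain_fan_out = brain_synapses / brain_neurons
ann_fan_out = ann_connections / ann_neurons

# Power spent per connection
brain_w_per_conn = brain_watts / brain_synapses
ann_w_per_conn = ann_watts / ann_connections
power_ratio = ann_w_per_conn / brain_w_per_conn

print("connections per neuron:", brain_fan_out, "vs", ann_fan_out)
print("watts per connection:", brain_w_per_conn, "vs", ann_w_per_conn)
print("network uses about %.1e times more power per connection" % power_ratio)
```

With these inputs the brain averages about 1,000 connections per neuron against about 100 for the network, and the network spends roughly ten orders of magnitude more power per connection.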
Real Neuron
Artificial Neuron
Example of Simple ANN
Single-layer Perceptron Network / ANN
Architecture of Deep Neural Network
Basic Terminology
Terminology
• Features
– Distinct traits that can be used to describe each item in a quantitative
manner.
• Samples
– A sample is an item to process (e.g. classify). It can be a document, a
picture, a sound, a video, a row in database or CSV file, or whatever you can
describe with a fixed set of quantitative traits.
• Feature vector
– is an n-dimensional vector of numerical features that represent some object.
• Feature extraction
• Training/Evolution set
– Set of data to discover potentially predictive relationships.
• Activation Function
● A high value gets a high probability, but not necessarily a higher probability than the other values.
Uses
● A higher value gets a higher probability than the other values.
A rectified linear unit has output 0 if the input is less than 0, and raw output
otherwise. That is, if the input is greater than 0, the output is equal to the input.
ReLUs' machinery is more like a real neuron in your body.
Range: [0, ∞)
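The ReLU rule described above fits in one line of Python. The sketch below also includes softmax, the activation used on the output layer of the CIFAR-10 model later in this document, to show how raw scores become probabilities in which the largest score receives the largest share. Both helper functions are illustrative plain-Python versions, not the Keras implementations.

```python
import math

def relu(x):
    """Rectified linear unit: 0 for negative input, the input itself otherwise."""
    return max(0.0, x)

def softmax(values):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(values)
    # Subtracting the largest score keeps exp() from overflowing
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

print([relu(x) for x in (-2.0, -0.5, 0.0, 1.5, 3.0)])

probs = softmax([1.0, 2.0, 3.0])
print(probs, sum(probs))
```

Negative inputs are clipped to 0 and positive inputs pass through unchanged, which matches the stated range.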
Machine Learning Techniques
Backpropagation:
Hidden Layers :
Layers of neuron nodes stacked between the inputs and outputs, allowing neural
networks to learn more complicated features (such as XOR logic).
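The XOR remark can be illustrated with a tiny network that has one hidden layer of two neurons. In this sketch the weights are hand-picked rather than learned, and the step activation and function names are assumptions chosen to keep the example short.

```python
def step(z):
    """Threshold activation: fire (1) when the weighted sum is positive."""
    return 1 if z > 0 else 0

def xor_network(x1, x2):
    # Hidden layer: two neurons with hand-picked (not learned) weights
    h_or = step(x1 + x2 - 0.5)    # fires if at least one input is 1
    h_and = step(x1 + x2 - 1.5)   # fires only if both inputs are 1
    # Output neuron: "OR but not AND" is exactly XOR
    return step(h_or - h_and - 0.5)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", xor_network(a, b))
```

A single neuron with no hidden layer cannot do this, because XOR is not linearly separable: no single straight line splits the inputs (0,1) and (1,0) from (0,0) and (1,1).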
Batch Normalization :
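import cv2

# Load the pre-trained Haar cascade for frontal faces
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
image = cv2.imread('put image here')
# The cascade works on grayscale input; OpenCV loads images in BGR order
gray_image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# Arguments: grayscale image, scaleFactor=1.3, minNeighbors=5
faces = face_cascade.detectMultiScale(gray_image, 1.3, 5)
for (x, y, w, h) in faces:
    # Draw a blue rectangle around each detected face
    cv2.rectangle(image, (x, y), (x + w, y + h), (255, 0, 0), 2)
cv2.imshow('img', image)
cv2.waitKey(0)
cv2.destroyAllWindows()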
Output :
Cifar-10 Example
# load Library
from __future__ import print_function
import keras
from keras.datasets import cifar10
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential, load_model
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
import os
batch_size = 32
num_classes = 10
epochs = 4
data_augmentation = True
num_predictions = 20
save_dir = os.path.join(os.getcwd(), 'saved_models')
model_name = 'keras_cifar10_trained_model.h5'
# The data, shuffled and split between train and test sets:
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')
# Convert class vectors to binary class matrices.
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
#define model
model = Sequential()
model.add(Conv2D(32, (3, 3), padding='same',
                 input_shape=x_train.shape[1:]))
model.add(Activation('relu'))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Conv2D(64, (3, 3), padding='same'))
model.add(Activation('relu'))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(512))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_classes))
model.add(Activation('softmax'))
# initiate RMSprop optimizer
# (RMSprop is the class name; lr and decay follow the Keras 2 API used in this example)
opt = keras.optimizers.RMSprop(lr=0.0001, decay=1e-6)
# Let's train the model using RMSprop
model.compile(loss='categorical_crossentropy',
              optimizer=opt, metrics=['accuracy'])
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
if not data_augmentation:
    print('Not using data augmentation.')
    model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
              validation_data=(x_test, y_test), shuffle=True)
else:
    print('Using real-time data augmentation.')
    # This will do preprocessing and realtime data augmentation:
    datagen = ImageDataGenerator(
        featurewise_center=False,  # set input mean to 0 over the dataset
        samplewise_center=False,  # set each sample mean to 0
        featurewise_std_normalization=False,  # divide inputs by std of the dataset
        samplewise_std_normalization=False,  # divide each input by its std
        zca_whitening=False,  # apply ZCA whitening
        rotation_range=0,  # randomly rotate images in the range (degrees, 0 to 180)
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        horizontal_flip=True,  # randomly flip images horizontally
        vertical_flip=False)  # randomly flip images vertically
    # Compute quantities required for feature-wise normalization
    # (std, mean, and principal components if ZCA whitening is applied).
    datagen.fit(x_train)
    # Train on the augmented batches; without this call the else branch
    # never fits the model. Assumes the Keras 2 fit_generator API.
    model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                        steps_per_epoch=x_train.shape[0] // batch_size,
                        epochs=epochs,
                        validation_data=(x_test, y_test))
if not os.path.isdir(save_dir):
    os.makedirs(save_dir)
model_path = os.path.join(save_dir, model_name)
model.save(model_path)
print('Saved trained model at %s ' % model_path)
#Score trained model.
scores = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', scores[0])
print('Test accuracy:', scores[1])
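The step "Convert class vectors to binary class matrices" in the listing above is one-hot encoding: each integer label becomes a row of zeros with a single 1 at the label's index. The plain-Python sketch below shows the idea; `to_one_hot` is a hypothetical helper written for illustration, not the Keras `to_categorical` implementation.

```python
def to_one_hot(labels, num_classes):
    """Convert integer class labels into one-hot rows."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0  # mark the position of the true class
        rows.append(row)
    return rows

# Two example labels out of 10 classes, as in CIFAR-10
for row in to_one_hot([5, 0], 10):
    print(row)
```

This layout is what lets a 10-unit softmax output be compared position by position with the target under `categorical_crossentropy`.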
Predict Label
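import cv2, numpy
from keras.models import load_model
model = load_model('saved_models/augmented_best_model.h5')
img = cv2.imread('/dog.jpg')
img = cv2.resize(img, (32, 32))
# OpenCV loads images as BGR; the CIFAR-10 training images are RGB
image = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
image = image.astype('float32')
image = image / 255
# Add a batch dimension: (32, 32, 3) -> (1, 32, 32, 3)
image = numpy.expand_dims(image, axis=0)
pred = model.predict(image)
# Index of the highest softmax probability
# (predict_classes is not available in recent Keras releases)
pred_class = numpy.argmax(pred, axis=1)
print('predict Class', pred_class)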
OUTPUT: [5]
Input Image >>dog.jpg
Classes:
0 : airplane
1 : automobile
2 : bird
3 : cat
4 : deer
5 : dog
6 : frog
7 : horse
8 : ship
9 : truck
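To turn a predicted index such as [5] into a readable label, the class table above can be kept as a Python list indexed by class number. This is a small sketch; `class_names` and `label_to_name` are illustrative names, not part of Keras.

```python
# CIFAR-10 class names in label order, as listed above
class_names = ['airplane', 'automobile', 'bird', 'cat', 'deer',
               'dog', 'frog', 'horse', 'ship', 'truck']

def label_to_name(pred_class):
    """Map predicted class indices (e.g. [5]) to class names."""
    return [class_names[int(i)] for i in pred_class]

print(label_to_name([5]))
```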