C9c KNN ANN DNN

K NEAREST NEIGHBOR
NEURAL NETWORK
DEEP LEARNING
TS NGUYỄN ĐỨC THÀNH

K NEAREST NEIGHBOR
• A classification algorithm that assigns a new data point to a class
according to its distances to previously classified data points: the k
points with the smallest distances are selected and the class is decided
from them. The algorithm has the advantage of needing no prior training;
it only has to be given the training data (trainData) and the labels
(responses).

K NEAREST NEIGHBOR OPENCV 2
• Create the KNearest model
CvKNearest knn(const CvMat* trainData, const
CvMat* responses, const CvMat* sampleIdx=0,
bool isRegression=false, int max_k=32 );
isRegression – Type of the problem: true for regression and false
for classification.
max_k – Maximum number of neighbors that may be passed to the
method CvKNearest::find_nearest().
sampleIdx=0: use all samples for training.
Or simply declare the model with default parameters:
CvKNearest knn;

• Training and prediction
CvKNearest::train
bool knn.train(Mat trainData, Mat responses)
CvKNearest::find_nearest
float knn.find_nearest(Mat samples, int k, Mat results,
Mat neighborResponses, Mat dists)
results – Vector with results of prediction (regression or
classification) for each input sample. It is a single-precision
floating-point vector with number_of_samples elements.
neighbors – Optional output pointers to the neighbor vectors
themselves. It is an array of k*samples->rows pointers.
neighborResponses – Optional output values for corresponding
neighbors. It is a single-precision floating-point matrix of
number_of_samples * k size.
dists – Optional output distances from the input vectors to the
corresponding neighbors. It is a single-precision floating-point
matrix of number_of_samples * k size.
KNN PYTHON 3.6
import cv2
import numpy as np
import matplotlib.pyplot as plt
# Feature set containing (x,y) values of 25 known/training data
trainData = np.random.randint(0,100,(25,2)).astype(np.float32)
# Labels each one either Red or Blue with numbers 0 and 1
responses = np.random.randint(0,2,(25,1)).astype(np.float32)
# Take Red and Blue families and plot them
red = trainData[responses.ravel()==0]
plt.subplot(311)
plt.scatter(red[:,0],red[:,1],80,'r','^')
blue = trainData[responses.ravel()==1]
plt.scatter(blue[:,0],blue[:,1],80,'b','s')
knn = cv2.ml.KNearest_create()
#Train data
knn.train(trainData,cv2.ml.ROW_SAMPLE,responses)
# Add a new point
newcomer = np.random.randint(0,100,(1,2)).astype(np.float32)
plt.subplot(312)
plt.scatter(newcomer[:,0],newcomer[:,1],80,'g','o')
#Classifier New point, Red:0, Blue:1
ret, results, neighbours, dist = knn.findNearest(newcomer, 3)
print ('result: ', results,'\n')
print ("neighbours: ", neighbours,"\n")

print ("distance: ", dist)
# 10 newcomers
newcomers = np.random.randint(0,100,(10,2)).astype(np.float32)
plt.subplot(313)
plt.scatter(newcomers[:,0],newcomers[:,1],80,'g','o')
ret, results,neighbours,dist = knn.findNearest(newcomers, 3)
# The results also will contain 10 labels.
print ('result: ', results,'\n')
print ("neighbours: ", neighbours,"\n")
print ("distance: ", dist)
plt.show()

KNN PYTHON 3.6
Output for the single newcomer:
result: [[0.]]
neighbours: [[0. 1. 0.]]
distance: [[196. 289. 314.]]

KNN OPENCV 3 C++
• Project2018/KNNOpencvC++
• K Nearest Neighbor Opencv 3.docx


KNN MATLAB
fitcknn : Fit k-nearest neighbor classifier
Mdl = fitcknn(X,Y)
Mdl = fitcknn(Tbl,formula)
X: data; Y: labels; Tbl: table containing both data and labels
predict: Predict labels using a k-nearest neighbor classification
model
label = predict(mdl,X)
[label,score,cost] = predict(mdl,X)

IRIS FLOWER CLASSIFICATION
The Iris flower dataset is a small dataset. It contains measurements
of three Iris species: Iris setosa, Iris virginica and Iris versicolor.
Each species has 50 flowers, each described by 4 measurements: sepal
length, sepal width, petal length and petal width. The Fisher Iris
data set was introduced by the scientist Ronald Fisher.
Download: https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data

IRIS.DATA
iris.data is a text file in which each line holds
SepalLength,SepalWidth,PetalLength,PetalWidth,Species

IRIS FLOWER CLASSIFICATION USING MATLAB
• MATLAB ships the iris dataset in the folder
toolbox/stats/statsdemo/fisheriris.csv
SepalLength,SepalWidth,PetalLength,PetalWidth,Species
5.1,3.5,1.4,0.2,setosa
4.9,3,1.4,0.2,setosa
4.7,3.2,1.3,0.2,setosa
4.6,3.1,1.5,0.2,setosa
meas: 150x4 matrix of real numbers holding the sepal and petal measurements
species: 150x1 cell array holding the species names setosa, virginica, versicolor
load fisheriris
f = figure;
gscatter(meas(:,1), meas(:,2), species,'rgb','osd');
xlabel('Sepal length');
ylabel('Sepal width');
N = size(meas,1);

>>openExample('stats/PredictClassificationUsingKNNClassifierExample')
This example shows how to predict classification for a k-nearest neighbor
classifier.
Construct a KNN classifier for the Fisher iris data as in Construct KNN
Classifier.
load fisheriris
X = meas; % Use all data for fitting
Y = species; % Response data
x = meas(:,3:4);
gscatter(x(:,1),x(:,2),species)
legend('Location','best')
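newpoint = [5 1.45 3 2];
% The plot shows petal length and width (columns 3 and 4), so mark those
% two components of the new point
line(newpoint(3),newpoint(4),'marker','x','color','k',...
'markersize',10,'linewidth',2)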


Mdl = fitcknn(X,Y,'NumNeighbors',5)
flwr = mean(X); % an average flower
flwrClass = predict(Mdl,flwr)
flwrClass =
1×1 cell array
{'versicolor'}
newpointClass= predict(Mdl, newpoint)
newpointClass =
1×1 cell array
{'versicolor'}

IRIS FLOWER CLASSIFICATION
PYTHON FROM SCRATCH
import csv
import random
import math
import operator

# Split the data randomly into a training set and a test set
def loadDataset(filename, split, trainingSet, testSet):
    with open(filename, 'r') as csvfile:
        # Skip empty rows (iris.data ends with a blank line)
        dataset = [row for row in csv.reader(csvfile) if row]
    for x in range(len(dataset)):
        for y in range(4):
            dataset[x][y] = float(dataset[x][y])
        if random.random() < split:
            trainingSet.append(dataset[x])
        else:
            testSet.append(dataset[x])

# Euclidean distance over the first `length` features
def euclideanDistance(instance1, instance2, length):
    distance = 0
    for x in range(length):
        distance += pow((instance1[x] - instance2[x]), 2)
    return math.sqrt(distance)

# Return the k training rows closest to testInstance
def getNeighbors(trainingSet, testInstance, k):
    distances = []
    length = len(testInstance)-1  # the last element is the label
    for x in range(len(trainingSet)):
        dist = euclideanDistance(testInstance, trainingSet[x], length)
        distances.append((trainingSet[x], dist))
    distances.sort(key=operator.itemgetter(1))
    neighbors = []
    for x in range(k):
        neighbors.append(distances[x][0])
    return neighbors

# Majority vote among the neighbors
def getResponse(neighbors):
    classVotes = {}
    for x in range(len(neighbors)):
        response = neighbors[x][-1]  # the label is the last element
        if response in classVotes:
            classVotes[response] += 1
        else:
            classVotes[response] = 1
    sortedVotes = sorted(classVotes.items(), key=operator.itemgetter(1),
                         reverse=True)
    return sortedVotes[0][0]

# Percentage of test rows whose label was predicted correctly
def getAccuracy(testSet, predictions):
    correct = 0
    for x in range(len(testSet)):
        if testSet[x][-1] == predictions[x]:
            correct += 1
    return (correct/float(len(testSet))) * 100.0

def main():
    # Prepare data
    trainingSet=[]
    testSet=[]
    split = 0.67
    loadDataset('iris.data', split, trainingSet, testSet)
    print ('Train set: ' + repr(len(trainingSet)))  # repr: convert to printable string
    print ('Test set: ' + repr(len(testSet)))
    print (trainingSet[10])
    # Generate predictions
    predictions=[]
    k=3
    for x in range(len(testSet)):
        neighbors = getNeighbors(trainingSet, testSet[x], k)
        result = getResponse(neighbors)
        predictions.append(result)
        ## print('> predicted=' + repr(result) + ', actual=' + repr(testSet[x][-1]))
    accuracy = getAccuracy(testSet, predictions)
    print('Accuracy: ' + repr(accuracy) + '%')
    # Try new data; the trailing None stands in for the unknown label so
    # that getNeighbors uses all 4 features
    test=[4.8, 3.0, 1.4, 0.1, None]
    neighbors = getNeighbors(trainingSet, test, k)
    result = getResponse(neighbors)
    print('> predicted=' + repr(result))

main()
IRIS FLOWER CLASSIFICATION USING
PYTHON SKLEARN
https://machinelearningcoban.com/2017/01/08/knn/
import numpy as np
from sklearn import neighbors, datasets
from sklearn.metrics import accuracy_score
iris = datasets.load_iris()
iris_X = iris.data
iris_y = iris.target
X0 = iris_X[iris_y == 0,:]
print ('Number of classes: %d' %len(np.unique(iris_y)))
print ('Number of data points: %d' %len(iris_y))
print ('\nSamples from class 0:\n', X0[:5,:])
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(
iris_X, iris_y, test_size=50)
print ("Training size: %d" %len(y_train))
print ("Test size : %d" %len(y_test))
clf = neighbors.KNeighborsClassifier(n_neighbors = 10, p = 2)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print ("Accuracy of 10NN with major voting: %.2f %%"
%(100*accuracy_score(y_test, y_pred)))
Number of classes: 3
Number of data points: 150
Samples from class 0:
[[5.1 3.5 1.4 0.2]
[4.9 3. 1.4 0.2]
[4.7 3.2 1.3 0.2]
[4.6 3.1 1.5 0.2]
[5. 3.6 1.4 0.2]]
Training size: 100
Test size : 50
Accuracy of 10NN with major voting: 98.00 %

PYTHON SKLEARN
• Accuracy can be improved by giving a different weight to each of
the 10 nearest neighbors. The weighting must satisfy one condition:
the closer a neighbor is to the test point, the higher its weight
(the more it is trusted). The simplest choice is the inverse of the
distance.
clf = neighbors.KNeighborsClassifier(n_neighbors = 10, p = 2,
                                     weights = 'distance')
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)
print ("Accuracy of 10NN (1/distance weights): %.2f %%"
%(100*accuracy_score(y_test, y_pred)))
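The inverse-distance weighting described above can be sketched in plain Python. This is only an illustration of the idea, not the scikit-learn implementation; the function name `weighted_vote` and the sample neighbors are made up for the example.

```python
from collections import defaultdict

def weighted_vote(neighbors):
    """neighbors: list of (label, distance) pairs for the k nearest points."""
    votes = defaultdict(float)
    for label, dist in neighbors:
        # A zero distance means an exact match, so that label wins outright
        if dist == 0:
            return label
        # Inverse-distance weight: closer neighbors count more
        votes[label] += 1.0 / dist
    return max(votes, key=votes.get)

# Hypothetical neighbors: one close "red" point and two farther "blue" points
neighbors = [("red", 1.0), ("blue", 4.0), ("blue", 5.0)]
print(weighted_vote(neighbors))
```

With a plain majority vote the two "blue" neighbors would win; with 1/distance weights the single close "red" neighbor (weight 1.0) outweighs them (0.25 + 0.2).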

USING PYTHON OPENCV
• Download the data set from
https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data
and save it in the working folder.
• Read the dataset
np.genfromtxt('iris.data', delimiter=',', dtype=None,
names=('sepal length', 'sepal width', 'petal length', 'petal width',
'label'))
• or
values = np.loadtxt('iris.data', delimiter=',', usecols=[0,1,2,3])
labels = np.loadtxt('iris.data', delimiter=',', dtype=str, usecols=[4])

import cv2
import numpy as np
values = np.loadtxt('iris.data', dtype=np.float32, delimiter=',',
                    usecols=[0,1,2,3])
# np.str was removed from recent NumPy releases; the built-in str works
labels = np.loadtxt('iris.data', dtype=str, delimiter=',',
                    usecols=[4])
# Map each species name to a numeric class label
lab=[]
for k in range(len(labels)):
    if labels[k]=='Iris-setosa':
        lab.append(1.0)
    elif labels[k]=='Iris-versicolor':
        lab.append(2.0)
    elif labels[k]=='Iris-virginica':
        lab.append(3.0)
lab=np.array(lab,dtype=np.float32)
knn = cv2.ml.KNearest_create()
knn.train(values,cv2.ml.ROW_SAMPLE,lab)
# Test a new value
test=np.array([[6.0,3.4,4.5,1.6]],dtype=np.float32)
retval, result, neigh_resp, dists = knn.findNearest(test, 3)
if result==1.0:
    result='Iris-setosa'
elif result==2.0:
    result='Iris-versicolor'
elif result==3.0:
    result='Iris-virginica'
print('Class of New data:',result)
MNIST OCR HANDWRITTEN KNN PYTHON
• The MNIST (Modified National Institute of Standards and Technology)
handwritten digit database contains 60,000 images of size 28x28 for
training and 10,000 images for testing.
• To recognize handwritten digits with KNN, download the dataset from
http://yann.lecun.com/exdb/mnist/; it consists of 4 files
train-images-idx3-ubyte.gz: training images
train-labels-idx1-ubyte.gz: training labels, digits 0..9
t10k-images-idx3-ubyte.gz: test images
t10k-labels-idx1-ubyte.gz: test labels
Extract the 4 files into a folder, for example MNIST. Note that after
extraction the hyphen before idx may have become a dot; rename the files
so that the hyphen is restored:
train-images-idx3-ubyte
train-labels-idx1-ubyte
t10k-images-idx3-ubyte
t10k-labels-idx1-ubyte
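As a quick sanity check on the extracted files, the IDX header can be read with the standard library. The sketch below assumes the IDX layout documented on the MNIST page (big-endian 32-bit integers, magic number 2051 for image files and 2049 for label files); the helper name `read_idx_header` is made up, and the header bytes are built synthetically so no real file is needed.

```python
import struct

def read_idx_header(raw):
    """Return (magic, dims) decoded from the first bytes of an IDX file."""
    # The magic number is a big-endian 32-bit integer; its lowest byte
    # holds the number of dimensions that follow
    magic, = struct.unpack(">I", raw[:4])
    ndim = magic & 0xFF
    dims = struct.unpack(">" + "I" * ndim, raw[4:4 + 4 * ndim])
    return magic, dims

# Synthetic header shaped like train-images-idx3-ubyte
header = struct.pack(">IIII", 2051, 60000, 28, 28)
magic, dims = read_idx_header(header)
print(magic, dims)
```

For a real file, the same function could be called on `open(path, 'rb').read(16)`; a dimension tuple other than the expected image count and 28, 28 suggests a damaged or wrongly named file.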

• Each image is a 28*28 handwritten digit, white on a black background,
stored as a row vector of 784 elements.
• Install the python-mnist package, which reads the MNIST data files;
in a command window type pip install python-mnist
• In the KNN program, use the following commands
from mnist import MNIST
mndata = MNIST('path to the folder containing the MNIST files')
images, labels = mndata.load_training()
test, labeltest = mndata.load_testing()
• images and test are matrices; each row corresponds to one digit and is
a vector of 784 elements holding the intensity of each pixel.
• labels and labeltest are single-column matrices; each row is a digit
from 0 to 9, and the number of rows equals the number of images.
• Program e:/computer vision/Project 2018/MNIST HANDWRITTEN DIGIT/MINISTOCRHandWrittenDigitKNN.py
import winsound
frequency = 2500 # Set frequency to 2500 Hertz
duration = 1000 # Set duration to 1000 ms == 1 second
import time
import numpy as np
import cv2
from matplotlib import pyplot as plt
from mnist import MNIST
mnist = MNIST('e:/computer vision/MNIST/')
img_train, lbl_train = mnist.load_training()
img_test, lbl_test = mnist.load_testing()

# Show one digit as text with print
index = 7777
img1=img_train[index]
print(lbl_train[index])
print('Showing num: {}'.format(lbl_train[index]))
print(mnist.display(img1))
# Show the same digit with cv2
img1=np.array(img1, 'uint8')
img1=img1.reshape(28, 28) # reshape into a 2D matrix
cv2.imshow('img',img1)
cv2.waitKey()
cv2.destroyAllWindows()
model = cv2.ml.KNearest_create()

X_train = np.float32(img_train) # matrix of 60,000 rows and 784 columns
y_train = np.float32(lbl_train) # vector of 60,000 labels with values 0..9
print('Training')
model.train(X_train,cv2.ml.ROW_SAMPLE, y_train)
print('Training Completed')
print('Testing, Wait!')

def tic():
    # Homemade version of the MATLAB tic and toc functions
    global startTime_for_tictoc
    startTime_for_tictoc = time.time()

def toc():
    if 'startTime_for_tictoc' in globals():
        print ("Elapsed time is " + str(time.time() - startTime_for_tictoc)
               + " seconds.")
    else:
        print ("Toc: start time not set")

# Test 10,000 samples; this takes some minutes
X_test = np.float32(img_test)
tic()
retval, results, neigh_resp, dists = model.findNearest(X_test, 3)
toc()
correct = np.count_nonzero(results.flatten() == np.array(lbl_test))
print('Test Completed')
accuracy = correct*100.0/len(lbl_test)
print ('Accuracy', accuracy)
The result is an accuracy of 97.05%.

• Test with the 100-digit test image from OpenCV
# Test using the 100-digit test image from OpenCV
img = cv2.imread('d:/TestImageKNN.png' , 0)
# Show the test image
img=cv2.resize(img,(280,280))
cv2.imshow('TestImage',img)
cv2.waitKey()
cv2.destroyAllWindows()
# Split the image into 100 cells (10*10), each of size 28*28
img =[np.hsplit(row, 10) for row in np.vsplit(img, 10)]
img=np.array(img)
print(img[1,1].size)
# Reshape img into a matrix of 100 rows and 784 columns
img = img[:,:10].reshape(-1,784).astype(np.float32)
# Create labels: ten 0s, ten 1s, ..., ten 9s
k = np.arange(10)
testlabels = np.repeat(k,10)[:,np.newaxis]
##print(img[1])
##print(testlabels.flatten())
##print(len(testlabels))
retval, results, neigh_resp, dists = model.findNearest(img, 3)
# Compare flat vectors; comparing shape (100,) with (100,1) would
# broadcast to a 100x100 matrix and give a wrong count
correct = np.count_nonzero(results.flatten() == testlabels.flatten())
accuracy = correct*100.0/len(testlabels)
print('Result',results.flatten())
print('Accuracy:', accuracy)
Result [0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 2. 2. 2. 2.
2. 2. 2. 2. 2. 2. 3. 3. 3. 3. 3. 3. 3. 3. 3. 3. 4. 4. 4. 4. 4. 4. 4. 4.
4. 4. 5. 5. 5. 5. 5. 5. 5. 5. 5. 5. 6. 6. 6. 6. 6. 6. 6. 6. 6. 6. 7. 7.
7. 7. 7. 7. 7. 7. 7. 7. 8. 8. 8. 8. 8. 8. 8. 8. 8. 8. 9. 9. 9. 9. 9. 9.
9. 9. 9. 0.]
Accuracy: 99.0
One digit is wrong: the last 9 is mistaken for 0.
The training data and labels can be saved to disk with np.savez and
loaded back into memory with np.load
np.savez('d:/knn_data.npz',train=X_train, train_labels=y_train)
# Now load the data
with np.load('d:/knn_data.npz') as data:
    print( data.files )
    X_train = data['train']
    y_train = data['train_labels']

Write a separate program for recognition:

TestKNNMNISTnumberone.py
import winsound
import numpy as np
import cv2
from matplotlib import pyplot as plt
with np.load('d:/knn_data.npz') as data:
    print( data.files )
    X_train = data['train']
    y_train = data['train_labels']

model = cv2.ml.KNearest_create()
model.train(X_train,cv2.ml.ROW_SAMPLE, y_train)
img = cv2.imread('d:/1.png' , 0)
img=cv2.resize(img,(28,28))
img=np.array(img)
img = img.reshape(-1,784).astype(np.float32)
retval, results, neigh_resp, dists = model.findNearest(img, 3)
print('Result',results.flatten())
frequency = 2500 # Set Frequency To 2500 Hertz
duration = 1000 # Set Duration To 1000 ms == 1 second
winsound.Beep(frequency, duration)

OCR ALPHABET ENGLISH
• Database: C:/opencv/sources/samples/data/letter-recognition.data
• https://archive.ics.uci.edu/ml/datasets/letter+recognition
• Contains data for the 26 capital letters of the English alphabet.
For example, the first 2 rows are
T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8
I,5,12,3,7,2,10,5,5,4,13,3,9,2,8,4,10
• Each row has 17 columns; the numbers are integers scaled to 16 levels
(0 to 15)
1. lettr capital letter (26 values from A to Z)
2. x-box horizontal position of the bounding-box center, counted from the left
3. y-box vertical position of the bounding-box center, counted from the bottom
4. width width of the box
5. high height of the box
6. onpix total number of on pixels (pixels equal to 1)
7. x-bar mean horizontal distance x of the on pixels from the center,
divided by the box width
8. y-bar mean vertical distance y of the on pixels from the center,
divided by the box height
9. x2bar mean of x squared
10. y2bar mean of y squared
11. xybar mean of the product xy
12. x2ybr mean of x * x * y
13. xy2br mean of x * y * y
14. x-ege mean number of edges scanning from left to right
15. xegvy correlation of x-ege with y
16. y-ege mean number of edges scanning from bottom to top
17. yegvx correlation of y-ege with x
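To make the 17-column layout concrete, the following sketch splits one row into a numeric class label and a 16-element feature vector. The function name `parse_letter_row` is made up for illustration; the sample row is the first row quoted above.

```python
def parse_letter_row(line):
    """Split one row of letter-recognition.data into (label, features)."""
    fields = line.strip().split(",")
    # Column 1 is the capital letter: map 'A' -> 0 ... 'Z' -> 25
    label = ord(fields[0]) - ord("A")
    # Columns 2..17 are the 16 integer features
    features = [float(v) for v in fields[1:]]
    return label, features

label, features = parse_letter_row("T,2,8,3,5,1,8,13,0,6,6,10,8,0,8,0,8")
print(label, len(features))
```

The same letter-to-number mapping is what a converter passed to np.loadtxt has to perform before the data can be handed to a KNN model, which expects numeric responses.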

OCR ALPHABET ENGLISH OPENCV 3
PYTHON KNN
import cv2
import numpy as np
# Load the data; the converter turns the letter into a number
data= np.loadtxt('C:/opencv/sources/samples/data/letter-recognition.data',
                 dtype= 'float32', delimiter = ',',
                 converters= {0: lambda ch: ord(ch)-ord('A')})
# Split the data in two, 10000 rows each for training and testing
train, test = np.vsplit(data,2)
# Split trainData and testData into features and responses
responses, trainData = np.hsplit(train,[1])
labels, testData = np.hsplit(test,[1])
# Initiate the kNN, classify, measure accuracy
knn = cv2.ml.KNearest_create()
knn.train(trainData,cv2.ml.ROW_SAMPLE, responses)
ret, result, neighbours, dist = knn.findNearest(testData, k=5)
correct = np.count_nonzero(result == labels)
accuracy = correct*100.0/10000
print (accuracy)

TESSERACT OCR PYTHON
Recognizing characters in images, scans or video is a hard problem because
there are hundreds of languages, many fonts and many handwriting styles.
Nevertheless, the problem is now largely solved by commercial software for
printed Latin characters, with accuracy up to 99%. For handwriting and some
languages, accuracy is still not high.
Tesseract is free, open-source OCR software maintained by Google that runs
on many operating systems. Version 4.0 (2nd Oct 2018) supports more than
100 languages (including Vietnamese), with ideographic and phonetic scripts,
written left to right or right to left.
https://github.com/tesseract-ocr/tesseract
Download the Tesseract OCR installer from https://github.com/UB-Mannheim/tesseract/wiki,
install it, and add c:/Program files(x86)/Tesseract_OCR to the path.
Install the Python packages from a command window:
pip install pillow
pip install pytesseract
pip install numpy
pip install opencv-python

https://www.pyimagesearch.com/2018/09/17/opencv-ocr-and-text-recognition-with-tesseract/
Tesseract can be used from a command window, for example to read the
text in the image d:/ex1.png. Open a command window and type the command
below; the program prints the recognized text.
c:\>tesseract d:/ex1.png stdout

c:\>tesseract d:/ex2.png stdout
prints the text on the screen

c:\>tesseract d:/ex2.png d:/out
saves the text to the file d:/out.txt (Tesseract appends .txt to the
output name)
For Vietnamese there is VietOCR, which works together with Tesseract;
download it from http://taimienphi.vn/download-vietocr-37671/taive
Unzip it and run the file ocr.bat in e:/computer vision/vietocr
In the File menu choose Open and select an image or PDF file; the file
appears on the left. Choose Command > OCR to extract the Vietnamese text,
which appears on the right. Use File > Save to save the text to disk.
TESSERACT OCR VIETOCR

• When the image is noisy or has low resolution, recognition accuracy
can drop, so preprocessing is needed.
Tesseract works well with black text on a white background, horizontal
characters, and a character size larger than 20 pixels.
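The two conditions above (dark text on a light background, characters taller than about 20 pixels) suggest two simple preprocessing decisions: whether to invert the image and by how much to enlarge it. The sketch below is a hypothetical helper, not part of Tesseract or pytesseract; the function name, its parameters and the "mean intensity below 128" rule are assumptions made for illustration.

```python
def preprocessing_plan(mean_gray, text_height_px, min_height_px=20):
    """Suggest an inversion flag and a resize factor before running OCR.

    mean_gray: average pixel intensity (0..255) of the grayscale image.
    text_height_px: estimated character height in pixels.
    """
    # Assumed heuristic: a dark average suggests light text on a dark
    # background, so the image should be inverted to black on white
    invert = mean_gray < 128
    # Enlarge only when the characters are smaller than the minimum height
    scale = max(1.0, min_height_px / float(text_height_px))
    return invert, scale

print(preprocessing_plan(mean_gray=40, text_height_px=10))
```

The returned values could then drive `cv2.bitwise_not` and `cv2.resize` on the grayscale image before it is passed to the OCR call.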

from PIL import Image #pip install pillow
import cv2
import sys
import pytesseract
# Define config parameters.
# '-l eng' for using the English language
# '--oem 1' for using LSTM OCR Engine
config = ('-l eng --oem 1 ')
image = cv2.imread("d:/ex5.png", cv2.IMREAD_COLOR)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# cv2.threshold returns (retval, image), hence the [1]
#gray = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY | cv2.THRESH_OTSU)[1]
#gray = cv2.medianBlur(gray, 3)
text = pytesseract.image_to_string(gray, config=config)
print(text)




Not recognized

OCR MATLAB
• MATLAB provides the ocr function for character recognition
txt = ocr(I) extracts text from image I
txt = ocr(I, roi) extracts text from the region roi of image I
txt contains the recognized text, its location and the recognition confidence
businessCard = imread('businessCard.png');
ocrResults = ocr(businessCard)
recognizedText = ocrResults.Text;
figure;
imshow(businessCard);
text(600, 150, recognizedText, 'BackgroundColor', [1 1 1]);

Optical Character Recognition (OCR)

close all;
I = imread('handicapSign.jpg');
%Define one or more rectangular regions of interest within I.
roi = [360 118 384 560];
figure;
imshow(I);
%You may also use IMRECT to select a region using a mouse:
%figure; imshow(I); roi = round(getPosition(imrect))
ocrResults = ocr(I, roi);
%Anchor Point at Right Top of TextBox
Iocr = insertText(I,roi(1:2),ocrResults.Text,'AnchorPoint',...
'RightTop','FontSize',30);
figure; imshow(Iocr);


Automatically Detect and Recognize Text in Natural Images
Automatically Detect and Recognize Text in Natural Images.docx
Text regions are located with the function detectMSERFeatures, which
extracts MSER (Maximally Stable Extremal Regions) features.
openExample('vision/TextDetectionExample')
colorImage= imread('handicapSign.jpg');
figure; imshow(colorImage);
I = rgb2gray(colorImage);
Applying OCR directly to the image yields meaningless text:
ocrtxt = ocr(colorImage); [ocrtxt.Text]
ans =
'
:1
E
{D
.. PA
I . SPECIAL i3LA'rE
REQUIRED
UNAUTHORIZED
VEHICLES L
- MAY BE ‘rowan E
AT OWNERS
'
% Detect MSER regions.
[mserRegions, mserConnComp] = detectMSERFeatures(I, ...
'RegionAreaRange',[200 8000],'ThresholdDelta',4);
figure; imshow(I); hold on
plot(mserRegions, 'showPixelList', true,'showEllipses',false)
title('MSER regions')
hold off
Some non-text regions are detected by mistake; to remove them we use
the geometric properties of text. The function regionprops measures the
geometric properties of the regions.
mserStats = regionprops(mserConnComp, 'BoundingBox', 'Eccentricity', ...
'Solidity', 'Extent', 'Euler', 'Image');
bbox = vertcat(mserStats.BoundingBox);
w = bbox(:,3);
h = bbox(:,4);
aspectRatio = w./h;
% Threshold the data to determine which regions to remove. These thresholds
% may need to be tuned for other images.
filterIdx = aspectRatio' > 3;
filterIdx = filterIdx | [mserStats.Eccentricity] > .995 ;
filterIdx = filterIdx | [mserStats.Solidity] < .3;
filterIdx = filterIdx | [mserStats.Extent] < 0.2 | [mserStats.Extent] > 0.9;
filterIdx = filterIdx | [mserStats.EulerNumber] < -4;
% Remove regions
mserStats(filterIdx) = []; mserRegions(filterIdx) = [];
% Show remaining regions
figure; imshow(I); hold on
plot(mserRegions, 'showPixelList', true,'showEllipses',false)
title('After Removing Non-Text Regions Based On Geometric Properties'); hold off
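The geometric filter can be restated as a single predicate per region. The Python sketch below mirrors the thresholds of the MATLAB code; the dictionary keys and the sample regions are hypothetical, and, as the MATLAB comment notes, the thresholds may need tuning for other images.

```python
def is_non_text(region):
    """region: dict with width, height, eccentricity, solidity, extent, euler_number."""
    aspect_ratio = region["width"] / region["height"]
    return (aspect_ratio > 3                  # too wide for a character
            or region["eccentricity"] > 0.995 # almost a straight line
            or region["solidity"] < 0.3       # very sparse inside its convex hull
            or region["extent"] < 0.2         # fills too little of its bounding box
            or region["extent"] > 0.9         # fills almost all of it (solid block)
            or region["euler_number"] < -4)   # too many holes

# Hypothetical regions: a letter-like blob and a long thin stripe
letter = {"width": 20, "height": 30, "eccentricity": 0.8,
          "solidity": 0.6, "extent": 0.5, "euler_number": 0}
stripe = dict(letter, width=200)
print(is_non_text(letter), is_non_text(stripe))
```

A region is removed as soon as any one test fails, which matches the chain of logical ORs that builds filterIdx.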
% Next, remove more non-text regions based on the stroke width.
% Get a binary image of a region, and pad it to avoid boundary
% effects during the stroke width computation.
regionImage = mserStats(6).Image;
regionImage = padarray(regionImage, [1 1]);
% Compute the stroke width image.
distanceImage = bwdist(~regionImage);
skeletonImage = bwmorph(regionImage, 'thin', inf);
strokeWidthImage = distanceImage;
strokeWidthImage(~skeletonImage) = 0;
% Show the region image alongside the stroke width image.
figure; subplot(1,2,1)
imagesc(regionImage)
title('Region Image')
subplot(1,2,2)
imagesc(strokeWidthImage)
title('Stroke Width Image')
% Compute the stroke width variation metric
strokeWidthValues = distanceImage(skeletonImage);
strokeWidthMetric = std(strokeWidthValues)/mean(strokeWidthValues);
% Threshold the stroke width variation metric
strokeWidthThreshold = 0.4;
strokeWidthFilterIdx = strokeWidthMetric > strokeWidthThreshold;
% Process the remaining regions
for j = 1:numel(mserStats)
    regionImage = mserStats(j).Image;
    regionImage = padarray(regionImage, [1 1], 0);
    distanceImage = bwdist(~regionImage);
    skeletonImage = bwmorph(regionImage, 'thin', inf);
    strokeWidthValues = distanceImage(skeletonImage);
    strokeWidthMetric = std(strokeWidthValues)/mean(strokeWidthValues);
    strokeWidthFilterIdx(j) = strokeWidthMetric > strokeWidthThreshold;
end
% Remove regions based on the stroke width variation
mserRegions(strokeWidthFilterIdx) = [];
mserStats(strokeWidthFilterIdx) = [];
% Show remaining regions
figure; imshow(I); hold on;
plot(mserRegions, 'showPixelList', true,'showEllipses',false)
hold off
title('After Removing Non-Text Regions Based On Stroke Width Variation')
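The stroke width test reduces to a coefficient of variation: the standard deviation of the stroke widths divided by their mean, compared with the 0.4 threshold. A minimal Python sketch of that metric follows; the function name and the sample width lists are made up, and `statistics.stdev` is used because, like MATLAB's default `std`, it normalizes by N-1.

```python
import statistics

def stroke_width_is_irregular(stroke_widths, threshold=0.4):
    """True when the stroke widths vary too much for the region to be text."""
    # Coefficient of variation: sample standard deviation over the mean
    metric = statistics.stdev(stroke_widths) / statistics.mean(stroke_widths)
    return metric > threshold

# Hypothetical skeleton distances: a character-like region and a blob
print(stroke_width_is_irregular([2.0, 2.0, 2.2, 1.8, 2.0]))
print(stroke_width_is_irregular([1.0, 5.0, 1.0, 6.0, 2.0]))
```

Characters tend to be drawn with strokes of nearly constant width, so their metric stays small, while irregular background regions give a large value and are removed.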
After the text regions are located, the individual characters are joined
into meaningful text lines instead of isolated characters. The principle is
to find neighboring text regions and merge them into larger regions.
% Get bounding boxes for all the regions
bboxes = vertcat(mserStats.BoundingBox);
% Convert from the [x y width height] bounding box format to the
% [xmin ymin xmax ymax] format for convenience.
xmin = bboxes(:,1);
ymin = bboxes(:,2);
xmax = xmin + bboxes(:,3) - 1;
ymax = ymin + bboxes(:,4) - 1;
% Expand the bounding boxes by a small amount.
expansionAmount = 0.02;
xmin = (1-expansionAmount) * xmin;
ymin = (1-expansionAmount) * ymin;
xmax = (1+expansionAmount) * xmax;
ymax = (1+expansionAmount) * ymax;
% Clip the bounding boxes to be within the image bounds
xmin = max(xmin, 1);
ymin = max(ymin, 1);
xmax = min(xmax, size(I,2));
ymax = min(ymax, size(I,1));
% Show the expanded bounding boxes
expandedBBoxes = [xmin ymin xmax-xmin+1 ymax-ymin+1];
IExpandedBBoxes = insertShape(colorImage,'Rectangle',expandedBBoxes,'LineWidth',3);
figure
imshow(IExpandedBBoxes)
title('Expanded Bounding Boxes Text')
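The expand-and-clip step can be followed on a single box. The Python sketch below applies the same arithmetic as the MATLAB code to one [x y width height] box with 1-based coordinates; the function name and the sample box are made up for illustration.

```python
def expand_and_clip(box, image_width, image_height, amount=0.02):
    """box is (x, y, width, height) with 1-based pixel coordinates."""
    x, y, w, h = box
    # Convert to corner form
    xmin, ymin = x, y
    xmax, ymax = x + w - 1, y + h - 1
    # Scale the corners outward, then clip them to the image bounds
    xmin = max((1 - amount) * xmin, 1)
    ymin = max((1 - amount) * ymin, 1)
    xmax = min((1 + amount) * xmax, image_width)
    ymax = min((1 + amount) * ymax, image_height)
    # Back to (x, y, width, height)
    return (xmin, ymin, xmax - xmin + 1, ymax - ymin + 1)

print(expand_and_clip((100, 50, 101, 51), image_width=640, image_height=480))
```

Because the corners are multiplied by a factor rather than shifted by a constant, boxes far from the origin grow more than boxes near it; the growth only needs to be large enough for neighboring characters to overlap.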

Some bounding boxes overlap. The function bboxOverlapRatio computes the
overlap ratio; overlapping boxes are linked in a graph, the groups of
connected boxes are then found, and isolated boxes are discarded.
% Compute the overlap ratio
overlapRatio = bboxOverlapRatio(expandedBBoxes, expandedBBoxes);
% Set the overlap ratio between a bounding box and itself to zero to
% simplify the graph representation.
n = size(overlapRatio,1);
overlapRatio(1:n+1:n^2) = 0;
% Create the graph
g = graph(overlapRatio);
% Find the connected text regions within the graph
componentIndices = conncomp(g);
% Merge the boxes based on the minimum and maximum dimensions.
xmin = accumarray(componentIndices', xmin, [], @min);
ymin = accumarray(componentIndices', ymin, [], @min);
xmax = accumarray(componentIndices', xmax, [], @max);
ymax = accumarray(componentIndices', ymax, [], @max);
% Compose the merged bounding boxes using the [x y width height] format.
textBBoxes = [xmin ymin xmax-xmin+1 ymax-ymin+1];
% Remove bounding boxes that only contain one text region
numRegionsInGroup = histcounts(componentIndices);
textBBoxes(numRegionsInGroup == 1, :) = [];
% Show the final text detection result.
ITextRegion = insertShape(colorImage, 'Rectangle', textBBoxes,'LineWidth',3);
figure; imshow(ITextRegion);
title('Detected Text')
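The grouping step (link overlapping boxes, find connected groups, merge each group, drop isolated boxes) can be sketched in Python with a small union-find structure in place of MATLAB's graph and conncomp. This is a swapped-in technique for illustration; the function names and sample boxes are made up, and boxes are given directly in [xmin ymin xmax ymax] form.

```python
def overlaps(a, b):
    """Boxes are (xmin, ymin, xmax, ymax); True when they share positive area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def merge_overlapping(boxes):
    parent = list(range(len(boxes)))

    def find(i):
        # Follow parent links to the group representative (path halving)
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Link every pair of overlapping boxes
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlaps(boxes[i], boxes[j]):
                parent[find(i)] = find(j)

    # Collect the boxes of each connected group
    groups = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)

    merged = []
    for members in groups.values():
        if len(members) == 1:
            continue  # isolated box: discarded, as in the MATLAB code
        merged.append((min(b[0] for b in members), min(b[1] for b in members),
                       max(b[2] for b in members), max(b[3] for b in members)))
    return merged

boxes = [(0, 0, 10, 10), (8, 2, 20, 12), (18, 0, 30, 10), (100, 100, 110, 110)]
print(merge_overlapping(boxes))
```

The first and third boxes do not overlap each other directly, but both overlap the second one, so all three end up in one group: connectivity, not pairwise overlap, decides the merge.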
After the text regions are detected, OCR is applied to extract the text.
A few characters are recognized incorrectly, but overall the meaning of
the text can be understood.
ocrtxt = ocr(I, textBBoxes);
[ocrtxt.Text]

NEURAL NETWORK

• Neural networks are used to predict time series, to approximate complex
single-variable or multivariable functions, and to recognize speech and images.
• Neural networks fall into two types: shallow networks with only a few
hidden layers, and deep networks with tens or hundreds of hidden layers.
• A neural network consists of an input layer, hidden layers and an output layer.

FEEDFORWARD NN
Example: given a function t=f(x), train a neural network with input x and
output y so that y≈f(x), minimizing the squared error (y-t)^2.
% Load the training data simplefit_dataset: x is a 1*94 data vector,
% t holds the corresponding 1*94 target values
[x,t] = simplefit_dataset; plot (x,t)
% Create a network with 1 hidden layer of 10 neurons, trained with 'trainlm'
% (Levenberg-Marquardt backpropagation)
net = feedforwardnet(10); % to use another training function, for
% example Bayesian regularization, write net = feedforwardnet(10, 'trainbr');
net = train(net,x,t);
view(net) % view the network structure
y = net(x); plot (x, y-t) % compute the output and plot the error
perf = perform(net,y,t)
The network has 1 input, 1 output and 1 hidden layer of 10 neurons.

FEEDFORWARD NN
The input weight vectors are stored in net.IW. The weight vectors that
connect layers are stored in net.LW. The bias values are stored in net.b.
The number of layers, net.numLayers, counts the hidden layers and the
output layer (2 in this example). Whether input j is connected to layer i
is indicated by the value 1 or 0 of net.inputConnect(i,j); similarly there
are net.layerConnect and net.outputConnect. Typing net displays the
network information.
net.IW{i,j} is the weight vector from input j to layer i, net.LW{i,j}
is the weight vector from layer j to layer i, and net.b{i} is the bias of layer i.
IW=net.IW
IW =
2×1 cell array
{10×1 double}
{ 0×0 double}

IW{1,1}
ans =
12.9205; 7.4531; 9.1206; -6.0380; -7.2144; -8.8360; -4.9697; -6.2756; 6.4273; -10.4042
LW=net.LW
LW =
2×2 cell array
{0×0 double} {0×0 double}
{1×10 double} {0×0 double}
LW{2,1}
ans =
0.1278 0.5588 -0.1646 0.9151 -0.1881 -0.0410 0.2117
0.1988 0.6092 -0.3523
b=net.b
b=
2×1 cell array
{10×1 double}
{[ -0.3253]}
b1=b{1,1}
b1 =
-11.7523; -6.0097; -5.2119; 2.7746; 1.1551; -0.0172; -2.3040; -3.8117; 5.8337; -10.6646
The output y is computed from x, IW, LW and the biases:
y = b2 + LW * tansig( b1 * ones(1,N) + IW * x )
For the same example, a hidden layer of 5 neurons is created with
net = feedforwardnet(5); for 2 hidden layers of 5 neurons each, use
net = feedforwardnet([5 5]);
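The forward-pass formula y = b2 + LW * tansig(b1 + IW * x) can be checked by hand for a single input value. The Python sketch below uses math.tanh, which is mathematically the same function as tansig; the weights are tiny made-up values, not the trained ones listed above. It is also assumed here that x and y are the values seen inside the network: MATLAB networks normally normalize inputs and outputs, so raw data would have to pass through the same scaling to reproduce net(x).

```python
import math

def forward(x, IW, b1, LW, b2):
    """One hidden layer: y = b2 + sum_i LW[i] * tanh(IW[i] * x + b1[i])."""
    # Hidden layer: one tanh neuron per (weight, bias) pair
    hidden = [math.tanh(w * x + b) for w, b in zip(IW, b1)]
    # Linear output layer
    return b2 + sum(v * h for v, h in zip(LW, hidden))

# Hypothetical weights for a network with 2 hidden neurons
IW, b1 = [1.0, -2.0], [0.0, 0.5]
LW, b2 = [0.5, 0.25], 0.1
print(forward(0.0, IW, b1, LW, b2))
```

With 10 hidden neurons the lists IW, b1 and LW simply have 10 entries each, matching the 10×1, 10×1 and 1×10 sizes displayed for net.IW{1,1}, net.b{1} and net.LW{2,1}.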
CHARACTER RECOGNITION USING NEURAL NETWORKS

DEEP LEARNING

DNN
• Deep learning is a branch of machine learning based on neural networks.
However, instead of only one or two hidden layers, a deep neural network
can have up to hundreds of hidden layers, thanks to improved algorithms,
computers with GPUs (graphics processing units) and large databases for
training, which have produced pretrained models such as AlexNet and
GoogLeNet.
• ImageNet and Pascal VOC are widely used image databases for training
object classification; ImageNet holds more than ten million images in
tens of thousands of categories.
http://www.image-net.org/
http://host.robots.ox.ac.uk/pascal/VOC/

DEEP LEARNING
Deep learning is a branch of machine learning that teaches
computers to do what comes naturally to humans: learn from
experience. Machine learning algorithms use computational
methods to “learn” information directly from data without
relying on a predetermined equation as a model. Deep
learning is especially suited for image recognition, which is
important for solving problems such as facial recognition,
motion detection, and many advanced driver assistance
technologies such as autonomous driving, lane detection,
pedestrian detection, and autonomous parking.
Neural Network Toolbox™ provides simple MATLAB®
commands for creating and interconnecting the layers of a
deep neural network. Examples and pretrained networks make
it easy to use MATLAB for deep learning, even without
knowledge of advanced computer vision algorithms or neural
networks.

DEEP LEARNING INTRODUCTION (Matlab Help)
We have a set of images where each image contains one of four
different categories of object, and we want the deep learning network to
automatically recognize which object is in each image. We label the
images in order to have training data for the network.
Using this training data, the network can then start to understand
the object’s specific features and associate them with the corresponding
category.
Each layer in the network takes in data from the previous layer,
transforms it, and passes it on. The network increases the complexity
and detail of what it is learning from layer to layer.
Notice that the network learns directly from the data—we have no
influence over what features are being learned.

DEEP LEARNING INTRODUCTION

Deep learning is a subtype of machine learning. With
machine learning, you manually extract the relevant
features of an image.
With deep learning, you feed the raw images directly
into a deep neural network that learns the features
automatically.
Deep learning often requires hundreds of thousands or
millions of images for the best results. It’s also
computationally intensive and requires a high-
performance GPU.



Convolutional Neural Network
A convolutional neural network (CNN, or ConvNet) is one of the most
popular algorithms for deep learning with images and video.
Like other neural networks, a CNN is composed of an input layer, an
output layer, and many hidden layers in between.

Feature Detection Layers: These layers perform one of three types
of operations on the data: convolution, pooling, or rectified linear
unit (ReLU).
Convolution puts the input images through a set of convolutional
filters, each of which activates certain features from the images.
Pooling simplifies the output by performing nonlinear
downsampling, reducing the number of parameters that the
network needs to learn about.
Rectified linear unit (ReLU) allows for faster and more effective
training by mapping negative values to zero and maintaining
positive values.
These three operations are repeated over tens or hundreds of
layers, with each layer learning to detect different features.
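The three operations can be sketched on a tiny example. The following NumPy code is a minimal illustration, not how a framework implements these layers: it assumes a single-channel 6x6 image, one hand-picked 3x3 filter, no padding, and 2x2 max pooling. The function names are hypothetical.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (the operation
    CNN layers call convolution; technically a cross-correlation)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def relu(x):
    # map negative values to zero, keep positive values
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    # nonlinear downsampling: keep the maximum of each 2x2 block
    h, w = x.shape[0] // 2 * 2, x.shape[1] // 2 * 2
    x = x[:h, :w]
    return x.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

# Left half dark (0), right half bright (1)
image = np.zeros((6, 6))
image[:, 3:] = 1.0
# Filter that responds to dark-to-bright vertical edges
kernel = np.array([[-1.0, 0.0, 1.0]] * 3)

features = max_pool2x2(relu(conv2d_valid(image, kernel)))
print(features)
```

The 6x6 input becomes a 4x4 feature map after the convolution and a 2x2 map after pooling, with large values where the filter found the edge.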

Classification Layers
After feature detection, the architecture of a CNN shifts to
classification.
The next-to-last layer is a fully connected layer (FC) that
outputs a vector of K dimensions where K is the number of
classes that the network will be able to predict. This vector
contains one score for each class of any image being
classified.
The final layer of the CNN architecture uses a softmax
function to turn these scores into class probabilities and
provide the classification output.
There is no exact formula for selecting layers. The best
approach is to try a few and see how well they work or to
use a pretrained network.
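The classification stage can be sketched as a fully connected layer followed by softmax. In this NumPy sketch the feature vector, the weights W, the bias b, and K = 4 classes are all made-up values chosen for illustration.

```python
import numpy as np

def softmax(z):
    # subtract the maximum first for numerical stability
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

rng = np.random.default_rng(1)
features = rng.normal(size=8)  # hypothetical flattened feature vector
K = 4                          # assumed number of classes
W = rng.normal(size=(K, 8))    # fully connected layer weights
b = np.zeros(K)                # fully connected layer biases

scores = W @ features + b      # K-dimensional output of the FC layer
probs = softmax(scores)        # probabilities that sum to 1
predicted = int(np.argmax(probs))
print(probs, predicted)
```

Softmax preserves the ordering of the scores, so the predicted class is the one with the largest FC output.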
Computational Resources for Deep Learning
Training a deep learning model can take hours, days, or weeks,
depending on the size of the data and the amount of processing
power you have available. Selecting a computational resource is a
critical consideration when you set up your workflow.
Currently, there are three computation options: CPU-based, GPU-
based, and cloud-based.
CPU-based computation is the simplest and most readily available
option. The example described in the previous section works on a
CPU, but we recommend using CPU-based computation only for
simple examples using a pretrained network.
Using a GPU reduces network training time from days to hours.
You can use a GPU in MATLAB without doing any additional
programming. We recommend an NVIDIA® GPU with compute capability 3.0
or higher. Multiple GPUs can speed up processing even more.

Computational Resources for Deep Learning
Cloud-based GPU computation means that you don’t have to buy and set
up the hardware yourself. The MATLAB code you write for using a local
GPU can be extended to use cloud resources with just a few settings
changes.

CLASSIFY USING PRETRAINED NETWORK ALEXNET
AlexNet is the name of a convolutional neural network, designed by
Alex Krizhevsky and published with Ilya Sutskever and Geoffrey
Hinton. AlexNet contained eight layers; the first five were
convolutional layers, and the last three were fully connected layers.
It used the non-saturating ReLU activation function, which showed
improved training performance over tanh and sigmoid.
clear
% Load the neural net
nnet = alexnet;
% Read the image ('path' is a placeholder for the image file name)
picture = imread('path');
picture = imresize(picture,[227,227]);
% Classify the picture
label = classify(nnet, picture);
% Show the label
imshow(picture); title(char(label));
AlexNet


Classify Image Using GoogLeNet
GoogLeNet has been trained on over a million images and can
classify images into 1000 object categories (such as keyboard, coffee
mug, pencil, and many animals). The network has learned rich
feature representations for a wide range of images. The network
takes an image as input and outputs a label for the object in the
image together with the probabilities for each of the object
categories.
net = googlenet;
inputSize = net.Layers(1).InputSize;
classNames = net.Layers(end).ClassNames;
numClasses = numel(classNames);
%Show ten trained objects
disp(classNames(randperm(numClasses,10)))
I = imread('peppers.png');
I = imresize(I,inputSize(1:2));

%Classify
[label,scores] = classify(net,I);
figure
imshow(I)
title(string(label) + ", " + num2str(100*scores(classNames == label),3) + "%");

%Display top five predictions
[~,idx] = sort(scores,'descend');
idx = idx(5:-1:1);
classNamesTop = net.Layers(end).ClassNames(idx);
scoresTop = scores(idx);
figure
barh(scoresTop)
xlim([0 1])
title('Top 5 Predictions')
xlabel('Probability')
yticklabels(classNamesTop)

GoogleNet

Transfer Learning Using GoogLeNet
Use an existing pretrained CNN to train a classifier for a new object;
this saves training time.
Project 2018/Deep Learning/TransferLearningGoogleNetMatlab.docx

Create Simple Deep Learning Network
for Handwritten Digit Classification
Project 2018/Deep Learning/Create Simple Deep Learning
Network for Handwritten Digit Classification.docx
• Load and explore image data.
• Define the network architecture.
• Specify training options.
• Train the network.
• Predict the labels of new data and calculate the classification
accuracy.
• Load and Explore Image Data

digitDatasetPath = fullfile(matlabroot,'toolbox','nnet','nndemos', ...
'nndatasets','DigitDataset');
imds = imageDatastore(digitDatasetPath, ...
'IncludeSubfolders',true,'LabelSource','foldernames');
figure;
perm = randperm(10000,20);
for i = 1:20
subplot(4,5,i);
imshow(imds.Files{perm(i)});
end
labelCount = countEachLabel(imds)
img = readimage(imds,1);
size(img)
• Specify Training and Validation Sets
numTrainFiles = 750;
[imdsTrain,imdsValidation] = splitEachLabel(imds,numTrainFiles,'randomize');
• Define the convolutional neural network architecture.
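layers = [
imageInputLayer([28 28 1])
convolution2dLayer(3,8,'Padding','same')
batchNormalizationLayer
reluLayer
maxPooling2dLayer(2,'Stride',2)
convolution2dLayer(3,16,'Padding','same')
batchNormalizationLayer
reluLayer
maxPooling2dLayer(2,'Stride',2)
convolution2dLayer(3,32,'Padding','same')
batchNormalizationLayer
reluLayer
fullyConnectedLayer(10)
softmaxLayer
classificationLayer];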
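As a rough check on the spatial sizes in this architecture, the sketch below applies the usual output-size formulas: a 'same'-padded convolution with stride 1 keeps the size, and each 2x2 max pooling with stride 2 halves it. The helper names are hypothetical, and the formulas are the standard ones for these settings rather than something read back from MATLAB.

```python
import math

def conv_same_size(size, stride=1):
    # 'same' padding: output size = ceil(input size / stride)
    return math.ceil(size / stride)

def pool_size(size, pool=2, stride=2):
    # no padding: output size = floor((input - pool) / stride) + 1
    return (size - pool) // stride + 1

size = 28                                             # imageInputLayer([28 28 1])
after_pool1 = pool_size(conv_same_size(size))         # first conv + pooling
after_pool2 = pool_size(conv_same_size(after_pool1))  # second conv + pooling
print(size, after_pool1, after_pool2)
```

The 28x28 input is reduced to 14x14 and then 7x7 before the fully connected layer maps the features to the 10 digit classes.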
• Specify Training Options

options = trainingOptions('sgdm', ...
'InitialLearnRate',0.01, 'MaxEpochs',4, 'Shuffle','every-epoch', ...
'ValidationData',imdsValidation, 'ValidationFrequency',30, ...
'Verbose',false, 'Plots','training-progress');
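The 'sgdm' solver selected above is stochastic gradient descent with momentum. The sketch below shows the update rule on a one-parameter toy problem. The learning rate 0.01 matches 'InitialLearnRate'; the momentum value 0.9, the quadratic objective, and the function name are assumptions made for illustration.

```python
def sgdm_minimize(grad, w0, learn_rate=0.01, momentum=0.9, steps=500):
    """Gradient descent with momentum on a single parameter."""
    w, velocity = w0, 0.0
    for _ in range(steps):
        # keep a fraction of the previous step and add the new gradient step
        velocity = momentum * velocity - learn_rate * grad(w)
        w = w + velocity
    return w

# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3); minimum at w = 3
w_final = sgdm_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
print(w_final)
```

In real training the gradient is estimated on a mini-batch of images at each step instead of being computed exactly.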
• Train Network Using Training Data
net = trainNetwork(imdsTrain,layers,options);
• Classify Validation Images and Compute Accuracy
YPred = classify(net,imdsValidation);
YValidation = imdsValidation.Labels;
accuracy = sum(YPred == YValidation)/numel(YValidation)

DEEP LEARNING FRAMEWORKS
• Frameworks are open-source libraries that provide tools for deep
learning algorithms. Available frameworks include TensorFlow,
Caffe2, NVIDIA Caffe, MATLAB for Deep Learning, Theano, Torch,
PyTorch, MXNet, NEON, CNTK (Microsoft Cognitive Toolkit)...
Depending on the framework, they support programming in MATLAB,
C++, or Python.
• Frameworks support loading pretrained network models, training
networks, ...
• TensorFlow™ (https://www.tensorflow.org/ ) is an open-source
library developed by the Google Brain team within Google's AI
organization, and is one of the most widely used libraries.
• Keras is a Python library that runs on top of backend frameworks
such as TensorFlow, Theano, and CNTK, and supports CNN, RNN, and
other networks.

DEEP LEARNING OPENCV
• OpenCV 3 has a dnn module that supports deep learning with the
Caffe, TensorFlow, and Torch/PyTorch frameworks.
• Pretrained network models supported by OpenCV: GoogLeNet,
AlexNet, SqueezeNet, VGGNet, ResNet…
• Load images: cv2.dnn.blobFromImage, cv2.dnn.blobFromImages
(Python); cv::dnn::blobFromImage, cv::dnn::blobFromImages (C++)
• Load a model: cv2.dnn.readNetFromCaffe,
cv2.dnn.readNetFromTensorflow, cv2.dnn.readNetFromTorch,
cv2.dnn.readTorchBlob (Python); cv::dnn::readNetFromCaffe,
cv::dnn::readNetFromTensorflow, cv::dnn::readNetFromTorch (C++)
• net.forward: classifies the input image using the trained network.
• Sample file: opencv/sources/dnn/samples/caffe_googlenet.cpp

TENSORFLOW PYTHON 3.6
Runs on 64-bit Python 3.6 with OpenCV 3 and TensorFlow 1.12.0.
The computer should have a GPU.
pip install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.12.0-py3-none-any.whl
Or: pip install tensorflow
pip install keras
https://machinelearningmastery.com/tutorial-first-neural-network-python-keras/
Open Python and check the installation:
>>> import keras
Using TensorFlow backend.
The program uses a data set of 268 people who tested positive and 500
people who tested negative for diabetes. Each record contains:
1. Number of times pregnant
2. Plasma glucose concentration at 2 hours in an oral glucose tolerance test
3. Diastolic blood pressure (mm Hg)
4. Triceps skin fold thickness (mm)
5. 2-hour serum insulin (mu U/ml)
6. Body mass index (weight in kg/(height in m)^2)
7. Diabetes pedigree function
8. Age (years)
9. Class variable (0 = negative, 1 = positive)
Data set: https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.data.csv
https://raw.githubusercontent.com/jbrownlee/Datasets/master/pima-indians-diabetes.names

DEEP LEARNING PYTHON
# Create first network with Keras
from keras.models import Sequential
from keras.layers import Dense
import numpy
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load pima indians dataset
dataset = numpy.loadtxt("c:/python36/pima-indians-diabetes-data.csv", delimiter=",")
# split into input (X) and output (Y) variables
X = dataset[:,0:8]
Y = dataset[:,8]
# create model
model = Sequential()
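# Keras 2 names the weight initializer argument kernel_initializer
model.add(Dense(12, input_dim=8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))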
# Compile model
model.compile(loss='binary_crossentropy', optimizer='adam',
metrics=['accuracy'])
# Fit the model
model.fit(X, Y, epochs=150, batch_size=10, verbose=2)
# calculate predictions
predictions = model.predict(X)
# round predictions
rounded = [round(x[0]) for x in predictions]
print(rounded)
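The loss and the rounding step used in the script above can be illustrated on a few hand-made values. This NumPy sketch assumes four hypothetical sigmoid outputs and labels; it computes the binary cross-entropy that the model minimizes and thresholds the outputs at 0.5, which is what rounding does apart from an exact tie.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # clip to avoid log(0) when a prediction is exactly 0 or 1
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))
    return float(np.mean(losses))

y_true = np.array([1.0, 0.0, 1.0, 0.0])  # hypothetical class labels
y_prob = np.array([0.9, 0.2, 0.4, 0.6])  # hypothetical sigmoid outputs

loss = binary_crossentropy(y_true, y_prob)
labels = (y_prob >= 0.5).astype(int)      # predicted class per sample
accuracy = float(np.mean(labels == y_true))
print(loss, labels, accuracy)
```

Confident correct predictions contribute little to the loss, while the two wrong predictions dominate it.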

DEEP LEARNING IMAGE CLASSIFICATION PYTHON
• https://www.pyimagesearch.com/2017/08/21/deep-learning-with-opencv/
• Uses the dnn module from opencv_contrib with the pretrained
GoogLeNet network (pre-trained on ImageNet) to recognize objects.
• deep_learning_with_opencv.py
# import the necessary packages
import numpy as np
import argparse
import time
import cv2
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to input image")
ap.add_argument("-p", "--prototxt", required=True,
    help="path to Caffe 'deploy' prototxt file")
ap.add_argument("-m", "--model", required=True,
    help="path to Caffe pre-trained model")
ap.add_argument("-l", "--labels", required=True,
    help="path to ImageNet labels (i.e., syn-sets)")
args = vars(ap.parse_args())
The command-line arguments are:
• --image : The path to the input image.
• --prototxt : The path to the Caffe “deploy” prototxt file.
• --model : The pre-trained Caffe model (i.e., the network
weights themselves).
• --labels : The path to ImageNet labels (i.e., “syn-sets”).

# load the input image from disk

image = cv2.imread(args["image"])
# load the class labels from disk
rows = open(args["labels"]).read().strip().split("\n")
classes = [r[r.find(" ") + 1:].split(",")[0] for r in rows]
# our CNN requires fixed spatial dimensions for our input image(s),
# so we resize to 224x224 pixels while performing mean subtraction
# (104, 117, 123) to normalize the input; after executing this
# command our "blob" has the shape (1, 3, 224, 224)
blob = cv2.dnn.blobFromImage(image, 1, (224, 224), (104, 117, 123))
# load our serialized model from disk
print("[INFO] loading model...")
net = cv2.dnn.readNetFromCaffe(args["prototxt"], args["model"])
# set the blob as input to the network and perform a forward pass
# to obtain our output classification
net.setInput(blob)
start = time.time()
preds = net.forward()
end = time.time()
print("[INFO] classification took {:.5} seconds".format(end - start))
# sort the indexes of the probabilities in descending order (higher
# probability first) and grab the top-5 predictions
idxs = np.argsort(preds[0])[::-1][:5]
# loop over the top-5 predictions and display them
for (i, idx) in enumerate(idxs):
    # draw the top prediction on the input image
    if i == 0:
        text = "Label: {}, {:.2f}%".format(classes[idx], preds[0][idx] * 100)
        cv2.putText(image, text, (5, 25), cv2.FONT_HERSHEY_SIMPLEX,
            0.7, (0, 0, 255), 2)
    # display the predicted label + associated probability to the console
    print("[INFO] {}. label: {}, probability: {:.5}".format(i + 1,
        classes[idx], preds[0][idx]))
# show the annotated image and wait for a key press
cv2.imshow("Image", image)
cv2.waitKey(0)
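To see what the preprocessing and ranking steps of the script do to the data, the sketch below reproduces them with NumPy only. It assumes the image is already 224x224 (cv2.dnn.blobFromImage also resizes), a scale factor of 1, and no channel swap. The function name to_blob, the synthetic image, and the probability values are made up for illustration.

```python
import numpy as np

def to_blob(image_bgr, mean=(104.0, 117.0, 123.0)):
    """Mean-subtract an HxWx3 image and reorder it to shape (1, 3, H, W)."""
    img = image_bgr.astype(np.float32) - np.array(mean, dtype=np.float32)
    # height-width-channel -> channel-height-width, then add a batch axis
    return img.transpose(2, 0, 1)[np.newaxis, ...]

image = np.full((224, 224, 3), 128, dtype=np.uint8)  # synthetic gray image
blob = to_blob(image)
print(blob.shape)

# Ranking step: indexes of the largest probabilities, highest first
probs = np.array([0.05, 0.40, 0.10, 0.25, 0.02, 0.18])  # made-up outputs
top3 = np.argsort(probs)[::-1][:3]
print(top3)
```

The blob layout (batch, channels, height, width) is the input layout the Caffe model expects, which is why the channel axis moves to the front.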

• Download the source code + pre-trained GoogLeNet architecture +
example images, and unzip them to a folder, for example
d:\deep-learning-opencv
• Open a terminal, change to that folder, and execute the
following command:
• d:\deep-learning-opencv>python deep_learning_with_opencv.py -i images/jemma.png -p bvlc_googlenet.prototxt -m bvlc_googlenet.caffemodel -l synset_words.txt



