Helwan University
Faculty of Engineering
Communication and Computer Department
By:
Mohamed Hassan Mohamed Hassan Communication dep
Shereen Raafat Abd El hady Communication dep
Amr Hisham Computer dep
Donia Mohamed Communication dep
Nourhan Mahmoud Abd El Reheem Computer dep
Karim Fathy Communication dep
Menna Mohamed El Awady Communication dep
Supervised by:
Dr. Samir Gaber
Dedication
TABLE OF CONTENTS
CH1:Introduction
CH2:Features
CH3:Emotion Recognition
CH4:Software
CH5:Hardware
CH6:Mobile Application
CH7:Final Result
CH1:Introduction
As students of the Faculty of Engineering, Communication and Computer Department, we looked for a graduation project idea in a field that helps people.
When we looked into autism, we found that all current methods of dealing with autistic people depend on doctors or specialists.
We therefore decided to use technology to help autistic children.
First, let's talk about autism.
Problem definition:-
What is Autism?
Autism, or autism spectrum disorder (ASD) as it is sometimes called, is a lifelong condition that affects how a person experiences the world around them, how they communicate and interact socially, as well as their interests and behavior. It is referred to as a 'spectrum disorder' because it can affect people in different ways and to varying degrees.
Children with autism have trouble communicating. They have trouble understanding what
other people think and feel. This makes it very hard for them to express themselves either
with words or through gestures, facial expressions, and touch.
What is the percentage rate of Autism?
The disorder affects around 700,000 people in the UK, which is more than 1 in 100 people and represents the highest percentage worldwide. It also affects around 1% of Egypt's population, which is around 800,000 of 90 million people.
[Figure: ST-FAP consists of two parts: hardware (glasses) and software (mobile application).]
1.Hardware glasses:
ST-FAP is a smart treatment for autistic patients that makes them more socially engaged. Our smart glasses help them recognize and understand facial emotions and improve their eye contact. We build this using a pair of glasses, a Raspberry Pi 3 Model B, a Raspberry Pi camera and a screen.
Firstly, we help the autistic patient improve eye contact (using brain power system). Autistic patients usually interact with technology and cartoons much more than with people, so the screen of the glasses first shows a mask over the face of the opposite person to grab the patient's attention. The mask is placed on the face of the person interacting with the patient through image processing. The mask is then removed so that the patient makes eye contact with that person, and a timer counts how long he keeps eye contact. As long as the patient keeps eye contact he earns bonus points, which motivate him to keep looking and improve his eye contact.
Secondly, we help the autistic patient recognize the facial emotions of the opposite person. The Raspberry Pi camera streams video of the opposite person's face, and the Raspberry Pi 3 Model B applies image processing and machine learning to it. Finally, a word and a picture expressing the emotion appear on the screen with the person's face as background, to help the patient link the word to the emotion.
By training with and wearing these glasses for a long time, the patient could learn to recognize emotions independently and express them in words.
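The timer-and-bonus idea described above can be sketched as follows. This is only a minimal illustration, not the project's actual code: the class name `EyeContactScorer`, the rule of one point per second, and the per-frame `eye_contact` flag are all assumptions made for the example.

```python
class EyeContactScorer:
    """Accumulates eye-contact time and converts it into bonus points.

    Hypothetical helper: the scoring rule (one point per full second of
    accumulated eye contact) is an assumption, not taken from the project.
    """

    def __init__(self, seconds_per_point=1.0):
        self.seconds_per_point = seconds_per_point
        self.total_contact = 0.0   # total seconds of eye contact so far
        self.current_streak = 0.0  # seconds in the current unbroken look

    @property
    def points(self):
        # Whole points earned from the accumulated contact time.
        return int(self.total_contact // self.seconds_per_point)

    def update(self, eye_contact, dt):
        """Advance the timer by dt seconds for one processed video frame."""
        if eye_contact:
            self.total_contact += dt
            self.current_streak += dt
        else:
            self.current_streak = 0.0  # looking away resets only the streak
        return self.points


scorer = EyeContactScorer()
# Simulated frames 0.5 s apart: four with contact, one away, two with contact.
for looking in [True, True, True, True, False, True, True]:
    scorer.update(looking, dt=0.5)
```

In a real pipeline the `eye_contact` flag would come from the face and eye detector described in the next chapter.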
CH2:Features
We started with image processing, machine learning and deep learning, developed the software design from scratch, and developed the code that implements the configurations of our project.
❖ Image processing & Deep learning:-
OPENCV:
OpenCV (Open Source Computer Vision) is a library of programming functions mainly aimed at real-time computer vision. In simple terms, it is a library for image processing, used for all kinds of operations on images.
The library has more than 2500 optimized algorithms, which includes a comprehensive set of
both classic and state-of-the-art computer vision and machine learning algorithms. These
algorithms can be used to detect and recognize faces, identify objects, classify human actions
in videos, track camera movements, track moving objects, extract 3D models of objects,
produce 3D point clouds from stereo cameras, stitch images together to produce a high
resolution image of an entire scene, find similar images from an image database, remove red
eyes from images taken using flash, follow eye movements, recognize scenery and establish
markers to overlay it with augmented reality, etc. OpenCV has a user community of more than 47 thousand people and an estimated number of downloads exceeding 14 million. The library is used extensively in companies, research groups and by governmental bodies.
What it can do:
1. Read and write images.
2. Detect faces and their features.
3. Detect shapes such as circles and rectangles in an image, e.g. detecting coins in images.
4. Recognize text in images, e.g. reading number plates.
5. Modify image quality and colors, e.g. Instagram, CamScanner.
6. Develop augmented reality apps.
and many more.
import Libraries
a) DEPENDENCIES
Let's first install the required dependencies to run this code.
1. OpenCV 3.2.0 should be installed.
2. Python v3.5 should be installed.
cv2.waitKey(): This is a keyboard binding function that takes one argument: the time (x) in milliseconds. The function waits up to (x) milliseconds for a keyboard event and returns as soon as a key is pressed or the time runs out. If 0 is passed, it waits indefinitely for a keystroke.
cv2.destroyAllWindows(): This simply destroys all the windows we created using cv2.imshow(window_name, image).
Phase (1):
• Face and Eye detection :
You look at your phone, and it extracts your face from an image. This is called “face detection”: a computer technology, used in a variety of applications, that identifies human faces in digital images.
The first question you would ask is how the glasses detect that there is a human face somewhere, and then detect the eyes.
This leads us to face detection techniques.
• Goal
In this session,
✓ We will see the basics of face detection using Haar Feature-based
Cascade Classifiers
✓ We will extend the same for eye detection etc.
• Basics
Object Detection using Haar feature-based cascade classifiers is an effective object
detection method proposed by Paul Viola and Michael Jones in their paper, "Rapid
Object Detection using a Boosted Cascade of Simple Features" in 2001. It is a
machine learning based approach where a cascade function is trained from a lot of
positive and negative images. It is then used to detect objects in other images.
Here we will work with face detection. Initially, the algorithm needs a lot of
positive images (images of faces) and negative images (images without faces) to
train the classifier. Then we need to extract features from it. For this, Haar features
shown in the below image are used. They are just like our convolutional kernel.
Each feature is a single value obtained by subtracting sum of pixels under the white
rectangle from sum of pixels under the black rectangle.
Now, all possible sizes and locations of each kernel are used to calculate lots of features. (Just
imagine how much computation it needs? Even a 24x24 window results over 160000 features).
For each feature calculation, we need to find the sum of the pixels under white and black
rectangles. To solve this, they introduced the integral image. However large your image, it
reduces the calculations for a given pixel to an operation involving just four pixels. Nice, isn't it?
It makes things super-fast.
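The four-lookup trick can be made concrete with a small pure-Python sketch. It is an illustration only: the function names `integral_image`, `rect_sum` and `two_rect_feature` are hypothetical, and the feature shown is just one simple two-rectangle layout (black top half, white bottom half), computed as black minus white as described above.

```python
def integral_image(img):
    """Return an (h+1) x (w+1) summed-area table with a zero first row/column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the w x h rectangle with top-left corner (x, y): four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Two-rectangle feature: black top half minus white bottom half (h even)."""
    half = h // 2
    black = rect_sum(ii, x, y, w, half)
    white = rect_sum(ii, x, y + half, w, half)
    return black - white


img = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
ii = integral_image(img)
center = rect_sum(ii, 1, 1, 2, 2)           # 6 + 7 + 10 + 11
feature = two_rect_feature(ii, 0, 0, 4, 4)  # (1..8) - (9..16)
```

Once the table is built, every rectangle sum costs the same four lookups regardless of the rectangle size, which is why evaluating many features per window stays cheap.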
But among all these features we calculated, most of them are irrelevant. For example, consider
the image below. The top row shows two good features. The first feature selected seems to
focus on the property that the region of the eyes is often darker than the region of the nose
and cheeks. The second feature selected relies on the property that the eyes are darker than
the bridge of the nose. But the same windows applied to cheeks or any other place is
irrelevant. So how do we select the best features out of 160000+ features? It is achieved
by Adaboost.
For this, we apply each and every feature on all the training images. For each feature, it finds
the best threshold which will classify the faces to positive and negative. Obviously, there will
be errors or misclassifications. We select the features with minimum error rate, which means
they are the features that most accurately classify the face and non-face images. (The process
is not as simple as this. Each image is given an equal weight in the beginning. After each
classification, weights of misclassified images are increased. Then the same process is done.
New error rates are calculated. Also new weights. The process is continued until the required
accuracy or error rate is achieved or the required number of features are found).
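One round of this reweighting can be sketched in a few lines. The sketch below uses the standard discrete AdaBoost update on a single toy feature; it is an assumption that this matches the spirit of the paper's procedure, which uses a slightly different but related weight update. The names `best_stump` and `adaboost_round` and the toy data are hypothetical.

```python
import math


def best_stump(values, labels, weights):
    """Find the threshold/polarity with the lowest weighted error for one feature."""
    best = None
    for thr in sorted(set(values)):
        for polarity in (1, -1):
            preds = [polarity if v >= thr else -polarity for v in values]
            err = sum(w for p, y, w in zip(preds, labels, weights) if p != y)
            if best is None or err < best[0]:
                best = (err, thr, polarity, preds)
    return best


def adaboost_round(values, labels, weights):
    """Pick the best weak classifier, then raise the weights of its mistakes."""
    err, thr, polarity, preds = best_stump(values, labels, weights)
    err = min(max(err, 1e-10), 1 - 1e-10)    # guard against log(0)
    alpha = 0.5 * math.log((1 - err) / err)  # weight of this weak classifier
    new_w = [w * math.exp(-alpha * y * p)
             for w, y, p in zip(weights, labels, preds)]
    total = sum(new_w)
    return alpha, thr, polarity, [w / total for w in new_w]


# Toy data: one feature value per image, label +1 (face) or -1 (non-face).
values = [1, 2, 3, 4, 5, 6]
labels = [-1, -1, 1, -1, 1, 1]
weights = [1 / 6] * 6                        # equal weights at the start
alpha, thr, polarity, new_weights = adaboost_round(values, labels, weights)
```

After the round, the single misclassified image (value 4) carries half of the total weight, so the next weak classifier is pushed to get it right.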
The final classifier is a weighted sum of these weak classifiers. It is called weak because it alone
can't classify the image, but together with others forms a strong classifier. The paper says even
200 features provide detection with 95% accuracy. Their final setup had around 6000 features.
(Imagine a reduction from 160000+ features to 6000 features. That is a big gain).
So now you take an image. Take each 24x24 window. Apply 6000 features to it. Check if it is
face or not. Wow.. Isn't it a little inefficient and time consuming? Yes, it is. The authors have a
good solution for that.
In an image, most of the image is non-face region. So it is a better idea to have a simple
method to check if a window is not a face region. If it is not, discard it in a single shot, and
don't process it again. Instead, focus on regions where there can be a face. This way, we spend
more time checking possible face regions.
For this they introduced the concept of Cascade of Classifiers. Instead of applying all 6000
features on a window, the features are grouped into different stages of classifiers and applied
one-by-one. (Normally the first few stages contain far fewer features.) If a window
fails the first stage, discard it. We don't consider the remaining features on it. If it passes, apply
the second stage of features and continue the process. The window which passes all stages is a
face region. How is that for a plan?
The authors' detector had 6000+ features with 38 stages with 1, 10, 25, 25 and 50 features in
the first five stages. (The two features in the above image are actually obtained as the best two
features from AdaBoost). According to the authors, on average 10 features out of 6000+ are
evaluated per sub-window.
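The early-rejection logic of the cascade can be sketched as below. This is a toy illustration under stated assumptions: the windows are plain dictionaries of brightness values, and the three features, their weights and the stage thresholds are all hypothetical, not the trained values from the paper or from OpenCV.

```python
def run_cascade(window, stages):
    """Return (is_face, features_evaluated) for one window.

    Each stage is (features, threshold), where features is a list of
    (weight, function) pairs. A window is rejected at the first failed stage.
    """
    evaluated = 0
    for features, threshold in stages:
        score = 0.0
        for weight, feature in features:
            score += weight * feature(window)
            evaluated += 1
        if score < threshold:
            return False, evaluated  # rejected early, later stages are skipped
    return True, evaluated


# Hypothetical features on a toy "window" of average region brightness.
def eyes_darker_than_cheeks(w):
    return float(w["eyes"] < w["cheeks"])


def eyes_darker_than_bridge(w):
    return float(w["eyes"] < w["bridge"])


def strong_contrast(w):
    return float(w["cheeks"] - w["eyes"] > 50)


stages = [
    ([(1.0, eyes_darker_than_cheeks)], 1.0),                          # stage 1
    ([(0.5, eyes_darker_than_bridge), (0.5, strong_contrast)], 1.0),  # stage 2
]

face_like = {"eyes": 40, "cheeks": 120, "bridge": 150}
background = {"eyes": 100, "cheeks": 100, "bridge": 100}

face_result = run_cascade(face_like, stages)
background_result = run_cascade(background, stages)
```

The background window is discarded after a single feature, while the face-like window pays for all three. This is the reason the average cost per sub-window stays so low.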
So this is a simple intuitive explanation of how Viola-Jones face detection works. Read the
paper for more details or check out the references in the Additional Resources section.
Here we will deal with detection. OpenCV already contains many pre-trained classifiers for
face, eyes, smiles, etc. Those XML files are stored in the opencv/data/haarcascades/ folder.
Let's create a face and eye detector with OpenCV.
First we need to load the required XML classifiers. Then load our input image (or video) in
grayscale mode.
import numpy as np
import cv2 as cv
face_cascade = cv.CascadeClassifier('haarcascade_frontalface_default.xml')
eye_cascade = cv.CascadeClassifier('haarcascade_eye.xml')
img = cv.imread('sachin.jpg')
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
Now we find the faces in the image. If faces are found, it returns the positions of detected
faces as Rect(x ,y ,w ,h). Once we get these locations, we can create a ROI for the face and
apply eye detection on this ROI (since eyes are always on the face!).
You may ask how the mask is placed on a human face in real-time video. The answer follows from the face region extracted above: once you have the coordinates of the face, you can easily replace that region with the mask.
Steps:
1. Detect face from the input video frame
2. Load the mask and make the white region of the mask transparent.
3. Put the mask at the face position
4. Display the image
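Steps 2 and 3 can be sketched with NumPy alone. This is a minimal sketch, not the project's code: the function name `overlay_mask` and the `white_threshold` value are assumptions, and the mask is assumed to be already resized to the detected face box (for example with `cv.resize`) and to fit entirely inside the frame.

```python
import numpy as np


def overlay_mask(frame, mask, x, y, white_threshold=240):
    """Paste mask onto frame at (x, y), treating near-white pixels as transparent.

    frame and mask are H x W x 3 uint8 arrays (the layout OpenCV uses).
    The frame is modified in place and also returned.
    """
    h, w = mask.shape[:2]
    roi = frame[y:y + h, x:x + w]                    # view into the frame
    opaque = np.any(mask < white_threshold, axis=2)  # True where mask is not white
    roi[opaque] = mask[opaque]                       # copy only the opaque pixels
    return frame


# Toy example: black 6x6 frame, 2x2 mask that is white except for one pixel.
frame = np.zeros((6, 6, 3), dtype=np.uint8)
mask = np.full((2, 2, 3), 255, dtype=np.uint8)
mask[0, 0] = (10, 20, 30)
overlay_mask(frame, mask, x=2, y=3)
```

In the real-time loop, `x`, `y` and the mask size would come from the face rectangle returned by the detector for each frame.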
Pygame
Pygame is a set of Python modules designed for writing games. Pygame adds functionality on top of the excellent SDL library. This allows you to create fully featured games and multimedia programs in the Python language. Pygame is highly portable and runs on nearly every platform and operating system. Pygame itself has been downloaded millions of times, and has had millions of visits to its website. It is distributed under the LGPL.
import pygame
pygame.init()
song = pygame.mixer.Sound('thesong.ogg')
clock = pygame.time.Clock()
song.play()
The main problem here is running both codes at the same time, which causes delay due to the limitations of the Raspberry Pi. After a long time of thinking, we found the following ways to optimize our code and avoid the delay:
1- Get a class 10 SD card.
2- Try to get more storage than required (in our case 16 GB was required and we used 32 GB).
3- Try to optimize your code to avoid parallel threading on the processor.
4- Work with a low resolution, as the number of pixels decreases and so does the processing, leading to less delay.
CH3:Emotion Recognition
Phase (2):
• Emotion Recognition:
This may seem strange: how could glasses detect the emotion on a human face, and how could they even understand that you are smiling or angry? To answer this we need to know more about deep learning and artificial intelligence, which allow the glasses to decide that you are smiling, angry or whatever.
So now let's talk about some deep learning topics, namely the artificial neural network (ANN) and the convolutional neural network (CNN), to see how we reached this result and how to build a model that predicts these emotions.
ANN:-
The human brain is the most powerful learning tool, but how can we create a device that performs the same operations as our brain?
When you want to predict something, you always have some inputs to base your prediction on. These form the input layer. Then you have the output (the value you want to predict), which forms the output layer.
In between we have hidden layers. Your brain has a great many neurons: information comes in through your eyes, ears and nose (your senses), and it does not go directly to the output. It passes through billions of neurons before the brain produces a result. This is the whole concept behind the ANN: we are modelling a brain, so we need these hidden layers before the output.
ANN intuition:-
• The neuron
[Figure: a neuron with input signals X1 ... Xm and weights W1 ... Wm. Labels: 1- Continuous, 2- Binary, 3- Categorical.]
By adjusting the weights, the neural network decides in every single case which signal is important and which signal is not important to a certain neuron. The weights are the things that are adjusted during learning: when you train an ANN you are basically adjusting all of the weights.
2- Sigmoid function
This function is used in logistic regression. It is a smooth function and it is very useful in the final output layer, especially when you are trying to predict probabilities.
Used in output layer.
3- Softmax function
If the dependent variable (the output) has more than 2 categories, such as 3 or 4, we use the softmax function, and we set the number of output nodes to the number of classes. Softmax is like the sigmoid function but is used when the output has more than 2 categories: softmax is a kind of multi-class sigmoid.
When to use the softmax function rather than the sigmoid function?
The sigmoid function is the special case of the softmax function where the number of classes is 2. Use sigmoid for binary classification and softmax for multiclass classification. If the number of classes is 2, then softmax is the same as the sigmoid function.
1- Rectifier function
The most popular function for ANN.
Used in hidden layer.
We apply the rectifier because we want to increase the non-linearity in our NN. Images themselves are highly nonlinear, as an image has different colors and elements. When we run feature detection to create the feature maps, we risk creating something linear, so we need to break up the linearity.
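The three activation functions above are short enough to write out directly. The sketch below is a plain-Python illustration (the function names are our own); it also checks numerically the statement that softmax over two classes reduces to the sigmoid, here with the second class score fixed at 0.

```python
import math


def sigmoid(z):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def relu(z):
    """Rectifier: passes positive values through, clips negatives to 0."""
    return max(0.0, z)


def softmax(scores):
    """Turns a list of scores into probabilities that sum to 1."""
    m = max(scores)                           # subtract the max for stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


two_class = softmax([2.0, 0.0])[0]  # probability of the first of 2 classes
same_as_sigmoid = sigmoid(2.0)      # sigmoid of the score difference
six_class = softmax([0.1, 2.0, -1.0, 0.5, 0.0, 1.2])  # e.g. six emotion scores
```

With six emotion classes, the softmax output is a list of six probabilities, and the predicted emotion is the position of the largest one.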
[Figure: a neuron with inputs X1 ... Xm and weights W1 ... Wm computes Sum(Wi Xi), i from 1 to m, to give the O/P value Y' (predicted by NN), which is compared with the actual value Y.]
[Figure: chart of the O/P value Y', the actual value Y and the cost function C.]
We need to compare the o/p value with the actual value to know the error. The goal is to minimize the cost function, because the lower the cost function, the closer the o/p value is to the actual value. We feed this information back into the NN, where it reaches the weights, and we update the weights. The NN that gives the minimum cost function is your final NN: the weights have been adjusted, you have found the optimal weights for the dataset you trained on, and you are ready to proceed to the testing phase or the application phase. This whole process is called back propagation, and it trains the network by adjusting all the weights simultaneously.
N.B. cost function C = ½ (Y’ – Y)^2
All the above was for just one row. If we have multiple rows, we repeat the above for each row: we get the o/p value and compare it with the actual value. Finally, we calculate the total cost function and adjust the weights.
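As a worked example of the cost function over several rows, the sketch below uses a single neuron with no activation function. The weights and the two data rows are made-up numbers chosen only to keep the arithmetic easy to follow.

```python
def neuron_output(inputs, weights):
    """Weighted sum of the inputs: Sum(Wi * Xi)."""
    return sum(w * x for w, x in zip(weights, inputs))


def cost(y_pred, y_true):
    """Cost for one row: C = 1/2 * (Y' - Y)^2."""
    return 0.5 * (y_pred - y_true) ** 2


weights = [0.4, 0.3]        # hypothetical weights W1, W2
rows = [
    ([1.0, 2.0], 1.0),      # (inputs X1, X2), actual value Y
    ([0.5, -1.0], 0.0),
]

# Row 1: Y' = 0.4*1.0 + 0.3*2.0 = 1.0, so C = 0
# Row 2: Y' = 0.4*0.5 - 0.3*1.0 = -0.1, so C = 0.5 * 0.01 = 0.005
total_cost = sum(cost(neuron_output(x, weights), y) for x, y in rows)
```

Training then means changing `weights` so that `total_cost` becomes as small as possible.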
1) Randomly initialize the weights to small numbers close to zero (but not 0).
2) Input the first observation of your dataset in the input layer, each feature in one input node. The number of nodes in the input layer is therefore the number of independent variables in our matrix of features.
3) Forward propagation: from left to right, the neurons are activated in a way that the impact of each neuron's activation is limited by the weights. Propagate the activations until getting the predicted result Y'.
4) Compare the predicted result to the actual result and measure the generated error.
5) Back propagation: from right to left, the error is back propagated. Update the weights according to how much they are responsible for the error. The learning rate decides by how much we update the weights.
6) Repeat steps 2 to 5 and update the weights after each observation (online learning using stochastic gradient descent), OR repeat steps 2 to 5 but update the weights only after a batch of observations (batch learning using batch gradient descent).
7) When the whole training set has passed through the ANN, that makes an epoch. Redo more epochs.
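The steps above can be sketched for the simplest possible network: one linear neuron trained with stochastic gradient descent. This is an illustration under stated assumptions, not the project's training code. The function name `train_neuron`, the learning rate, the epoch count and the toy data (generated from Y = 2*X1 - X2) are all hypothetical.

```python
import random


def train_neuron(rows, n_inputs, learning_rate=0.1, epochs=200, seed=0):
    """Train one linear neuron, updating the weights after each observation."""
    rng = random.Random(seed)
    # Step 1: small random weights close to zero.
    weights = [rng.uniform(-0.01, 0.01) for _ in range(n_inputs)]
    for _ in range(epochs):                  # Step 7: one pass = one epoch
        for inputs, target in rows:          # Step 2: feed one observation
            # Step 3: forward propagation.
            y_pred = sum(w * x for w, x in zip(weights, inputs))
            # Step 4: measure the error.
            error = y_pred - target
            # Step 5: for C = 1/2 (Y' - Y)^2, dC/dWi = (Y' - Y) * Xi.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, inputs)]
    return weights


# Toy dataset generated from Y = 2*X1 - 1*X2.
rows = [
    ([1.0, 0.0], 2.0),
    ([0.0, 1.0], -1.0),
    ([1.0, 1.0], 1.0),
    ([2.0, 1.0], 3.0),
]
learned = train_neuron(rows, n_inputs=2)
```

The learned weights end up very close to 2 and -1, the values used to generate the data. A real network repeats the same update for every weight in every layer.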
CNN:-
We will see that the way computers process images is extremely similar to the way we process images, once the neural network has been trained to recognize facial emotions.
The image is then converted to a matrix, as shown:
CNN intuition:-
Step 1- convolution operation
Convolution is used to extract features from the image using a feature detector (you might also hear it called a kernel or a filter), creating different convolved feature maps.
What have we created?
We have reduced the size of the image. With a stride of 1 the image is reduced a bit, but with a stride of 2 it is reduced more. This is a very important function of the feature detector: it brings the important features forward and gets rid of all the unnecessary things.
We create many feature maps to obtain our convolution layer. Each feature map uses a different filter, so we use different feature detectors to get different feature maps from the same image.
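The effect of the stride on the feature map size can be seen in a small pure-Python sketch. The function name `convolve2d`, the 5x5 binary image and the 3x3 feature detector are made up for the example. As in most CNN libraries, the operation slides the detector over the image without flipping it.

```python
def convolve2d(image, kernel, stride=1):
    """Slide the kernel over the image (no padding) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - kh) // stride + 1
    out_w = (len(image[0]) - kw) // stride + 1
    return [
        [
            sum(image[i * stride + a][j * stride + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


image = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
]
feature_detector = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 1],
]

map_stride1 = convolve2d(image, feature_detector, stride=1)  # 3 x 3 map
map_stride2 = convolve2d(image, feature_detector, stride=2)  # 2 x 2 map
```

A value of 3 in the feature map means that all three cells of the detector matched the image at that position, which is the highest possible match for this detector.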
Relu layer:
We use the ReLU activation function to make sure that we don't have any negative pixel values in our feature maps. Depending on the parameters that we use for our convolution operation, we can get negative pixels in the feature map, and we need to remove these negative pixels in order to have non-linearity in our CNN.
Step 2- pooling:
The max numbers in your feature map represent where you found the closest similarity to your feature. By pooling (with a 2x2 window) we get rid of 75% of the information that is not the feature and is not important, which helps in terms of processing. Another advantage of pooling is that it reduces the number of parameters going into the final layer of the NN, and therefore helps prevent overfitting.
We apply max pooling to reduce the number of nodes we get in the next step.
If you look at the convolved image and the pooled image, you will see the same features but with less information.
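The ReLU and max pooling steps can be sketched on a small feature map. The helper names `relu_map` and `max_pool` and the 4x4 values are hypothetical; the 2x2 window with a stride of 2 is the assumption behind the 75% figure.

```python
def relu_map(feature_map):
    """Replace every negative value in the feature map with 0."""
    return [[max(0, v) for v in row] for row in feature_map]


def max_pool(feature_map, size=2, stride=2):
    """Keep only the maximum of each size x size window."""
    out_h = (len(feature_map) - size) // stride + 1
    out_w = (len(feature_map[0]) - size) // stride + 1
    return [
        [
            max(feature_map[i * stride + a][j * stride + b]
                for a in range(size) for b in range(size))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


feature_map = [
    [1, -2, 0, 4],
    [3, 0, -1, 2],
    [-5, 1, 2, 2],
    [0, 6, -3, 1],
]
rectified = relu_map(feature_map)
pooled = max_pool(rectified)  # 4 x 4 = 16 values reduced to 2 x 2 = 4 values
kept_fraction = (len(pooled) * len(pooled[0])) / 16
```

Only 4 of the 16 values survive, so 75% of the values are dropped, yet the strongest response in each region is still present.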
Step 3- flattening:
We take the pooled feature map and flatten it into a column, because we want to input it into an ANN for further processing.
You have many pooled layers with many pooled feature maps. You flatten them and put them into one long column, sequentially one after the other, to get one huge input vector for the ANN.
If we don't reduce the size of these maps we will get too large a vector, and then too many nodes in the fully connected layers, so our model will be highly compute intensive.
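Flattening itself is a one-line operation, as the sketch below shows. The function name `flatten` and the two 2x2 pooled maps are made up for the example.

```python
def flatten(pooled_maps):
    """Put all values of all pooled feature maps into one long vector."""
    return [value for fmap in pooled_maps for row in fmap for value in row]


pooled_maps = [
    [[3, 4], [6, 2]],  # pooled feature map from the first feature detector
    [[1, 0], [5, 7]],  # pooled feature map from the second feature detector
]
vector = flatten(pooled_maps)
```

Each entry of `vector` becomes one input node of the ANN, so the vector length grows with both the map size and the number of feature detectors.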
Why don't we lose the spatial structure of the CNN after flattening?
By creating our feature maps we extracted the spatial structure information as high numbers in the feature maps, because these high numbers are associated with a specific feature in the input image. Max pooling keeps these high numbers, and the flattening step just puts all the numbers in the cells of the feature maps into one single vector, so we still keep these high numbers in that vector.
If we have different images of the same thing in different positions and we want the NN to recognize it, we have to make sure that our NN has a property called spatial invariance. This means that it doesn't care where the features are: the NN has some level of flexibility to still find the feature.
Step 4- Full connection:
In this step we add a whole ANN to our CNN. The main purpose of the ANN is to combine our features into more attributes that predict the classes even better.
All of this is then trained through forward propagation and backward propagation.
Not only are the weights trained in the ANN part, but the feature detectors are also trained and adjusted in the same gradient descent process, which allows us to come up with the best feature maps.
And now let's see...
How to build a CNN Model?!
To build a model, we will divide it into 2 parts: (CODE)
A) image preprocessing
B) Creating CNN (we need 2 modules: the Sequential module to initialize our NN and the Dense module to build the layers of our NN)
Now we will take a simple example of a CNN model that detects facial emotion expressions.
A) image preprocessing:-
We have images, so we need to do some image preprocessing to be able to input them into our CNN. This is the first step in building our model.
Step 1:
We do the image preprocessing with Keras. Keras is an amazing library for deep learning and computer vision, and it contains tools for importing images. To import the images we need to prepare a special structure for the dataset, as follows:
1) Create a folder and name it dataset.
2) Create 3 subfolders: training_set (contains 80% of the images), test_set (contains 20% of the images) and single_prediction.
3) Inside the training_set folder create 6 subfolders called (neutral, happy, sad, angry, surprise, fear) and put the images inside each subfolder.
4) Inside the test_set folder create 6 subfolders called (neutral, happy, sad, angry, surprise, fear) and put the images inside each subfolder.
5) Inside the single_prediction folder put only a single image of each emotion you want to predict.
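The folder structure above can also be created with a few lines of Python instead of by hand. This is a convenience sketch only: the function name `create_dataset_skeleton` is hypothetical, the folder names follow the paths used by the code later in this chapter, and the example builds the tree in a temporary directory so that it does not touch any real data.

```python
import tempfile
from pathlib import Path

EMOTIONS = ["neutral", "happy", "sad", "angry", "surprise", "fear"]


def create_dataset_skeleton(root):
    """Create dataset/{training_set,test_set}/<emotion> and single_prediction."""
    root = Path(root)
    for split in ("training_set", "test_set"):
        for emotion in EMOTIONS:
            (root / split / emotion).mkdir(parents=True, exist_ok=True)
    (root / "single_prediction").mkdir(parents=True, exist_ok=True)
    return root


# Build the empty structure in a temporary location for demonstration.
dataset_dir = create_dataset_skeleton(Path(tempfile.mkdtemp()) / "dataset")
```

The images still have to be copied into the emotion subfolders by hand or by a separate script, with about 80% going to training_set and 20% to test_set.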
So far we have done a large part of the data (image) preprocessing manually. We still need to do some feature scaling and image augmentation so that our deep learning model can run most efficiently.
B)Creating CNN:-
Open the Anaconda Navigator and launch Spyder to write these Python code lines:
Step 2:
# Importing the Keras libraries and packages
from keras.models import Sequential    # to initialize our NN as a sequence of layers
from keras.layers import Conv2D        # convolution step; 2D because we deal with images, not videos
from keras.layers import MaxPooling2D  # pooling step that adds our pooling layers
from keras.layers import Flatten       # flattening step
from keras.layers import Dense         # to add the fully connected layers of a classic ANN
Step 3:
# Initializing the CNN
# We create a new object of the Sequential class and call it classifier, because
# we are making a classifier that tells which emotion each image shows
classifier = Sequential()
Step 4:
# Adding the 1st convolution layer
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))
# 32 > number of feature detectors; (3, 3) > rows and columns of each detector;
# (64, 64) > pixels of the input image; 3 > 3 channels, as the images are colored.
# We start with 32 feature detectors, as is common; when we add more convolutional layers we can
# increase the number of feature detectors to 64, 128, 256. We are working on a CPU, so we start
# small.
# If you are using a GPU you can increase the input size to 128 or 256 pixels.
Step 5:
# Pooling
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# (2, 2) is the size of the window that we slide over the feature map, taking the maximum of each sub-table
Step 6:
# Flattening
classifier.add(Flatten())
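It helps to follow the sizes through Steps 4 to 6 by hand. The arithmetic below assumes the Keras defaults for the layers used above (no padding and a stride of 1 for the convolution, and a pooling stride equal to the pool size); the helper names are hypothetical.

```python
def conv_output(size, kernel=3, stride=1):
    """Side length after a convolution without padding."""
    return (size - kernel) // stride + 1


def pool_output(size, pool=2):
    """Side length after max pooling with stride equal to the pool size."""
    return size // pool


side = 64                         # input images are 64 x 64
side_after_conv = conv_output(side)               # 3 x 3 detector: 64 -> 62
side_after_pool = pool_output(side_after_conv)    # 2 x 2 pooling: 62 -> 31
n_feature_maps = 32               # one feature map per feature detector
flattened_length = side_after_pool * side_after_pool * n_feature_maps
```

Under these assumptions the flattened vector that feeds the ANN has 31 x 31 x 32 = 30752 entries, which shows why a larger input size quickly becomes expensive on a CPU.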
Step 7:
Now we build the fully connected part: the flattened vector is the input layer of a classic ANN. An ANN can be a great classifier for nonlinear problems, and since image classification is a nonlinear problem it does a very good job of classifying the images. We create a hidden layer, called a fully connected layer, and then we add the output layer.
# Full connection
classifier.add(Dense(units = 128, activation = 'relu'))
# one hidden layer with 128 nodes; the activation function is the rectifier
classifier.add(Dense(units = 6, activation = 'softmax'))
# the o/p layer has one node per emotion class (6 subfolders); softmax is used because the outcome has
# more than 2 categories
Step 8:
# Compiling the CNN
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
# the adam optimizer is a stochastic gradient descent algorithm; the loss function is categorical cross entropy
# because we have more than 2 outcomes (with 2 outcomes we would use binary cross entropy); the performance
# metric is accuracy.
Step 9:
# Part 2 - Fitting the CNN to the images
from keras.preprocessing.image import ImageDataGenerator
# import the class that generates batches of (augmented) image data
train_datagen = ImageDataGenerator(rescale = 1./255,  # rescale pixel values from 0-255 to 0-1
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'categorical')
test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'categorical')
classifier.fit_generator(training_set,
                         steps_per_epoch = 76,   # batches per epoch (set to the no of images in the training set)
                         epochs = 10,            # accuracy generally increases with more epochs
                         validation_data = test_set,
                         validation_steps = 19)  # validation batches (set to the no of images in the test set)
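In Keras 2, `steps_per_epoch` and `validation_steps` count batches, not images. The short calculation below shows how many batches make exactly one pass over the data. It assumes that 76 and 19 are the image counts of the training and test sets, as the comments in the code suggest; the helper name `steps_for_one_pass` is hypothetical.

```python
import math


def steps_for_one_pass(n_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(n_images / batch_size)


batch_size = 32
train_steps = steps_for_one_pass(76, batch_size)  # 32 + 32 + 12 images
test_steps = steps_for_one_pass(19, batch_size)   # a single batch of 19 images
```

With the values used in the code above, each epoch therefore cycles through the training images many times, which mainly makes each epoch longer.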
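Step 10:
# Adding a second convolutional layer
# These two lines go after the first pooling layer (Step 5) and before flattening (Step 6);
# the model is then compiled and fitted again.
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
# The input of this layer is not the images but the pooled feature maps, so no input_shape is needed.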
Step 11:
# Part 3 - Making new predictions
import numpy as np                      # to preprocess the image
from keras.preprocessing import image   # image module from Keras
# load the image we want to classify, resized to the training input size
test_image = image.load_img('dataset/single_prediction/happy_or_sad2.jpg', target_size = (64, 64))
test_image = image.img_to_array(test_image)          # PIL image -> 3D array (64, 64, 3)
test_image = test_image / 255.                       # same rescaling as the training images
test_image = np.expand_dims(test_image, axis = 0)    # add the batch dimension -> (1, 64, 64, 3)
result = classifier.predict(test_image)              # one probability per emotion class
# class_indices maps each subfolder name to its output index, so we invert it
labels = {index: name for name, index in training_set.class_indices.items()}
prediction = labels[int(np.argmax(result[0]))]       # emotion with the highest probability
The best improvement is to add another convolutional layer, which should improve the performance on the test set and also reduce overfitting.
If you add new convolutional layers, you can increase the number of feature detectors and double it each time.
To get even better accuracy, choose a higher target size for the images of the training set and the test set, that is, the size to which all your images are resized. You then get many more pixels in the rows and columns of your input images, and therefore more information about the pixel patterns, but you need a GPU.
Now that the 2 codes for eye contact and emotion recognition are ready, let's see how to install them on the Raspberry Pi.
CH4:Software
At first our codes were implemented on Windows, as the development team was already using it and it was easier for them to work in this environment.
❖ But why did we move away from it?!
✘ Some important libraries aren't available on Windows
✘ More space consumption
✘ Smaller support community in the field of computer vision
*Simulation*
First: VMware installation
[Screenshots: Steps 1 to 7 of the VMware installation.]
After setting up the OS, begin installing the libraries used in deep learning.
Step 2: Install dependencies:
$ sudo apt-get update
$ sudo apt-get upgrade
We then need to install some developer tools, including CMake, which helps us
configure the OpenCV build process:
$ sudo apt-get install build-essential cmake pkg-config
$ sudo apt-get install libjpeg-dev libtiff5-dev libjasper-dev libpng12-dev
$ sudo apt-get install libavcodec-dev libavformat-dev libswscale-dev libv4l-dev
$ sudo apt-get install libxvidcore-dev libx264-dev
$ sudo apt-get install libgtk2.0-dev
$ sudo apt-get install libatlas-base-dev gfortran
$ sudo apt-get install python2.7-dev python3-dev
Step 3: Download the OpenCV source code
$ cd ~
$ wget -O opencv.zip https://github.com/Itseez/opencv/archive/3.1.0.zip
$ unzip opencv.zip
$ wget -O opencv_contrib.zip https://github.com/Itseez/opencv_contrib/archive/3.1.0.zip
$ unzip opencv_contrib.zip
(To compile with 4 cores instead of only one, use $ make -j4. We do not recommend this, because it caused delays in our case.)
$ make clean
$ make
$ sudo make install
$ sudo ldconfig
❖ Commands:
• sudo apt-get update
For Python 2.7
• sudo apt-get install python-pip python-dev
Next, download the wheel file from this repository and install it:
• wget https://github.com/samjabrahams/tensorflow-on-raspberry-pi/releases/download/v1.1.0/tensorflow-1.1.0-cp27-none-linux_armv7l.whl
• sudo pip install tensorflow-1.1.0-cp27-none-linux_armv7l.whl
We need to reinstall the mock library to keep it from throwing an error when we import
TensorFlow:
• sudo pip uninstall mock
• sudo pip install mock
Assuming your TensorFlow install exited without error you can now test the installation by
opening a Python shell and trying to import the tensorflow package:
$ python
>>> import tensorflow
Assuming the library installations completed without error, you can now test them by
opening a Python shell and trying to import each one:
$ python
>>> import scipy
>>> import h5py
>>> import sklearn
>>> import cv2
>>> import tensorflow
Assuming the Keras installation completed without error, you can now test it by opening a
Python shell and trying to import keras:
$ python
>>> import keras
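The manual import checks above can also be collected into one small script. The following is only a minimal sketch and not part of the original project: the helper name check_imports is hypothetical, and the module list simply repeats the libraries named in this chapter. It uses the standard importlib module, so it should run on both Python 2.7 and Python 3.

```python
import importlib


def check_imports(module_names):
    """Try to import each module and report its version or the failure reason."""
    results = {}
    for name in module_names:
        try:
            module = importlib.import_module(name)
            # Not every library exposes __version__, so fall back to a label.
            results[name] = getattr(module, "__version__", "installed")
        except Exception as error:
            # Any failure while importing is reported instead of stopping the check.
            results[name] = "MISSING: {}".format(error)
    return results


if __name__ == "__main__":
    # Libraries named in this chapter.
    libraries = ["scipy", "h5py", "sklearn", "cv2", "tensorflow", "keras"]
    for name, status in sorted(check_imports(libraries).items()):
        print("{:<12} {}".format(name, status))
```

A line starting with MISSING would point to the library whose installation step needs to be repeated.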
Ch5:Hardware
The three models (all of which we use here at Digital Diner) are the Arduino, Raspberry
Pi and BeagleBone.
We chose these three because they are all readily available, affordable, about the same size
(just larger than 2″ x 3″) and can all be used for creating wonderful digital gadgets. Before we
get to the comparison, here is a brief introduction to each one.
❖ Second: BeagleBone
The BeagleBone is perhaps the least known of these platforms, but it is an incredibly
capable board worthy of consideration for many projects. It is a powerful Linux computer
that fits inside an Altoids mint tin.
❖ Third: Raspberry Pi 3
The Raspberry Pi is the newcomer to the game. It is not really an embedded computer; it is
actually a very inexpensive full desktop computer. It is barebones, but at $35 for a real
computer it is worthy of note, and it is a great platform for many Maker projects.
And here is a comparison between these different platforms:
First, the Arduino and Raspberry Pi are very inexpensive at under $40. The BeagleBone comes
in at nearly the cost of three Arduino Unos. In addition, worthy of note is that the clock speed
on the Arduino is about 40 times slower than the other two and it has 128,000 (!) times less
RAM. Already, you can see the differences starting to come out. The Arduino and Raspberry Pi
are inexpensive and the Raspberry Pi and BeagleBone are much more powerful. Seems like the
Raspberry Pi is looking good at this point, however, it is never that simple. First, its price is not
quite as good as it seems because to run the Raspberry Pi you need to supply your own SD
Card, which will run you another $5-10 in cost.
In addition, despite the clock speed similarities, in our tests the BeagleBone ran about twice as
fast as the Raspberry Pi. And perhaps most counterintuitive, the Arduino was right in the mix
as far as performance goes as well, at least for a beginner. The reason for this is that the
Raspberry Pi and BeagleBone both run the Linux operating system. This fancy software makes
these systems into tiny computers, which are capable of running multiple programs at the
same time and being programmed in many different languages. The Arduino is very simple in
design. It can run one program at a time and it is programmed in low-level C++.
An interesting feature of the BeagleBone and the Raspberry Pi is that they run off a flash
memory card (SD Card in the case of Raspberry Pi and MicroSD Card in the case of
BeagleBone). What this means is that you can give these boards a brain transplant just by
swapping the memory card. You can have multiple configurations and setups on different
cards and when you swap cards, you will be right where you left off with that particular
project. Since both of these boards are fairly sophisticated, it even means that you can easily
change operating systems just by creating different cards to swap in.
❖ Tools:
✓ IDLE (Python 2.7 GUI): the development environment we use to write the machine learning
code.
✓ Raspberry Pi 3: a mini-computer that does the processing and runs the Python code; we
use it to make our prototype portable (it is described in detail below).
✓ Memory SD card: used to install the Raspbian operating system and to store the code.
✓ Glasses: the glasses support 2D and 3D video; they are also the final design that carries
all of the above.
The Raspberry Pi is a low cost, credit-card sized computer that plugs into a computer monitor
or TV, and uses a standard keyboard and mouse. It is a capable little device that enables
people of all ages to explore computing, and to learn how to program in languages like Scratch
and Python. It is capable of doing everything you would expect a desktop computer to do, from
browsing the internet and playing high-definition video to image processing and machine
learning work.
The Raspberry Pi is slower than a modern laptop or desktop but is still a complete Linux
computer and can provide all the expected abilities that implies, at a low-power consumption
level. There are two Raspberry Pi models, the A and the B. The A comes with 256MB of RAM
and one USB port. It is cheaper and uses less power than the B. The current model B comes
with a second USB port, an Ethernet port for connection to a network, and 512MB of RAM.
Raspberry-pi model A Raspberry-pi model B
• Memory SD card: capacities range from 4 GB to 32 GB; we used a 32 GB card because
we need to store a large amount of data.
Cards also come in different speed classes (2, 4, 6 and 10); we used class 10 to
speed up booting of the Raspberry Pi operating system.
• Power bank: the power supply for the Raspberry Pi.
• Keyboard, mouse and screen
• HDMI cable: carries audio and video in HD quality
• Camera Module:
The Camera Module is a great accessory for the Raspberry Pi, allowing users to take still
pictures and record video in full HD.
This small, high-resolution camera is designed specifically for all versions of the Raspberry Pi.
The camera sensor has a 5-megapixel resolution and can shoot video at up to
1080p and still photos at up to 2592x1944 pixels.
The camera weighs about 3 g and measures about
25x20x9 mm. It connects directly to the Raspberry Pi through the CSI
(Camera Serial Interface) port, using the small ribbon cable that comes
with the camera.
• VR Box:
Contains the lenses of the glasses and serves as the holder that carries the screen, the
Raspberry Pi and the camera module.
CH6:Mobile Application
❖ Tools:
• Android Studio SDK
• Java JDK
❖ Features:
• Determine the autism level using the CARS test questions
• Learn emotions, gestures, adjectives, verbs and daily routines
• Color and emotions games to test the child's progress
• Help the child tell others about his or her needs
❖ Code:
In the main activity, each option is implemented as a button; when a button is clicked, the
app navigates to the corresponding activity.
@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_main);

    // One button per feature on the main screen.
    Button leveltest = (Button) findViewById(R.id.autismlevel);
    Button emotest = (Button) findViewById(R.id.emotest);
    Button colortest = (Button) findViewById(R.id.colortest);
    Button learnnbtn = (Button) findViewById(R.id.learnnbtn);
    Button needs = (Button) findViewById(R.id.needs);

    // Autism level test: open the first question.
    leveltest.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
            Intent intent = new Intent(MainActivity.this, Question1.class);
            startActivity(intent);
        }
    });
    // Needs section.
    needs.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
            Intent intent = new Intent(MainActivity.this, needs.class);
            startActivity(intent);
        }
    });
    // Learning section.
    learnnbtn.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
            Intent intent = new Intent(MainActivity.this, learnoptions.class);
            startActivity(intent);
        }
    });
    // Emotions game.
    emotest.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
            Intent intent2 = new Intent(MainActivity.this, testprogress1.class);
            startActivity(intent2);
        }
    });
    // Colors game.
    colortest.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
            Intent intent2 = new Intent(MainActivity.this, color1.class);
            startActivity(intent2);
        }
    });
}
❖ Code:
Each question has four choices. Selecting the 1st choice increments the counter by 1, the
2nd choice increments it by 2, the 3rd choice increments it by 3, and the 4th choice
increments it by 4.
@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_question1);

    final Button a1 = (Button) findViewById(R.id.a1);
    final Button a2 = (Button) findViewById(R.id.a2);
    final Button a3 = (Button) findViewById(R.id.a3);
    final Button a4 = (Button) findViewById(R.id.a4);

    // Each answer adds its own weight (1 to 4) and moves on to the next question.
    a1.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            counter = counter + 1;
            Intent intent = new Intent(Question1.this, question2.class);
            startActivity(intent);
            finish();
        }
    });
    a2.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            counter = counter + 2;
            Intent intent = new Intent(Question1.this, question2.class);
            startActivity(intent);
            finish();
        }
    });
    a3.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            counter = counter + 3;
            Intent intent = new Intent(Question1.this, question2.class);
            startActivity(intent);
            finish();
        }
    });
    a4.setOnClickListener(new View.OnClickListener() {
        @Override
        public void onClick(View v) {
            counter = counter + 4;
            Intent intent = new Intent(Question1.this, question2.class);
            startActivity(intent);
            finish();
        }
    });
}

// Going back returns to the main screen and resets the score.
@Override
public void onBackPressed() {
    super.onBackPressed();
    startActivity(new Intent(Question1.this, MainActivity.class));
    counter = 0;
    finish();
}
The final result ranges between 15 and 60 when all questions are used, but we used 9 of the 14
questions, so the final result ranges between 9.6 and 38.5.
• If the final result is less than 19, that indicates no autism
• If the final result equals 19, that indicates a low level of autism
• If the final result is between 19 and 24, that indicates a moderate level of autism
• If the final result is between 25 and 38.5, that indicates a high level of autism
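The thresholds above can be sketched as a small function. This is only an illustration in Python, not the application's own Java code: the names scaled_bounds and autism_level are hypothetical, and the handling of scores between 24 and 25 (treated here as moderate) is an assumption, since the list does not cover that gap. The sketch also shows that scaling the full range by 9/14 gives about 9.64 and 38.57, which appears consistent with the bounds reported above after truncation.

```python
def scaled_bounds(full_min, full_max, used_questions, total_questions):
    """Scale a full-test score range down to a shortened test."""
    factor = used_questions / float(total_questions)
    return full_min * factor, full_max * factor


def autism_level(score):
    """Map a final score to the level described in the list above."""
    if score < 19:
        return "no autism"
    if score == 19:
        return "low"
    if score < 25:
        # Assumption: scores between 24 and 25 fall in the moderate band.
        return "moderate"
    return "high"


if __name__ == "__main__":
    low, high = scaled_bounds(15, 60, 9, 14)
    print("scaled range: %.2f to %.2f" % (low, high))
    for score in (12, 19, 22, 30):
        print("%s -> %s" % (score, autism_level(score)))
```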
❖ Code:
@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_start);
This section supports the project's idea and helps the child through a slide show of images,
with next and previous buttons, for each topic: emotions, gestures, adjectives,
verbs, and daily routines.
❖ Code:
Each learning section is implemented as a button; when the button is clicked, the app navigates
to the selected learning section and displays its slide show.
@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_learnoptions);

    // One button per learning section.
    Button emobtn = (Button) findViewById(R.id.e);
    Button gesturebtn = (Button) findViewById(R.id.gesturebtn);
    Button verb = (Button) findViewById(R.id.verb);
    Button adj = (Button) findViewById(R.id.adj);

    emobtn.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
            Intent intent = new Intent(learnoptions.this, emo2.class);
            startActivity(intent);
        }
    });
    gesturebtn.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
            Intent intent = new Intent(learnoptions.this, greatwork.class);
            startActivity(intent);
        }
    });
    adj.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
            Intent intent = new Intent(learnoptions.this, adjectives.class);
            startActivity(intent);
        }
    });
    verb.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
            Intent intent = new Intent(learnoptions.this, vb.class);
            startActivity(intent);
        }
    });
}
// Slide show (excerpt): an ImageSwitcher displays the images one at a time.
imgswitcher = (ImageSwitcher) findViewById(R.id.imgswitcher);
imgswitcher.setFactory(new ViewSwitcher.ViewFactory() {
    @Override
    public View makeView() {
        ImageView imageView = new ImageView(getApplicationContext());
        imageView.setScaleType(ImageView.ScaleType.CENTER_INSIDE);
        imageView.setLayoutParams(
                new ImageSwitcher.LayoutParams(ViewGroup.LayoutParams.MATCH_PARENT,
                        ViewGroup.LayoutParams.MATCH_PARENT));
        return imageView;
    }
});
// Previous button: step back one image unless the first image is already shown.
prvbtn.setOnClickListener(new View.OnClickListener() {
    public void onClick(View v) {
        rel.setBackgroundColor(Color.WHITE);
        imgswitcher.setInAnimation(in2);
        imgswitcher.setOutAnimation(out2);
        if (i > 0) {
            i--;
            imgswitcher.setImageResource(images[i]);
        }
    }
});
// Next button: step forward one image unless the last image is already shown.
nxtbtn.setOnClickListener(new View.OnClickListener() {
    public void onClick(View v) {
        rel.setBackgroundColor(Color.WHITE);
        imgswitcher.setInAnimation(in);
        imgswitcher.setOutAnimation(out);
        if (i < images.length - 1) {
            i++;
            imgswitcher.setImageResource(images[i]);
        }
    }
});
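The next and previous logic of the slide show keeps the index inside the bounds of the image array. The idea can be sketched independently of Android as follows; this is a minimal Python illustration under stated assumptions, not the application's code, and the class name SlideShow and the sample image names are hypothetical.

```python
class SlideShow(object):
    """Bounded navigation over a fixed list of images."""

    def __init__(self, images):
        if not images:
            raise ValueError("at least one image is required")
        self.images = list(images)
        self.index = 0  # start at the first image

    def current(self):
        return self.images[self.index]

    def next(self):
        # Move forward only if the last image is not already shown.
        if self.index < len(self.images) - 1:
            self.index += 1
        return self.current()

    def previous(self):
        # Move back only if the first image is not already shown.
        if self.index > 0:
            self.index -= 1
        return self.current()


if __name__ == "__main__":
    show = SlideShow(["happy", "sad", "angry"])  # hypothetical image names
    for step in (show.next, show.next, show.next, show.previous):
        print(step())
```

Pressing next on the last image, or previous on the first, leaves the displayed image unchanged, which matches the two if conditions in the listeners.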
Adjectives Gestures
Verbs Feelings
3. Color and Emotions Games to Test the Progress
The mobile application also helps the child test his or her progress through games, such as
one that tests emotion recognition and another that tests knowledge of colors.
Each time the child selects a choice, the application shows that the answer is correct if
it is right; otherwise, it shows that the answer is wrong.
Code:
@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_color7);

    // One circular button per color choice.
    com.cuboid.cuboidcirclebutton.CuboidButton blackbtn =
            (com.cuboid.cuboidcirclebutton.CuboidButton) findViewById(R.id.blackbtn);
    com.cuboid.cuboidcirclebutton.CuboidButton redbtn =
            (com.cuboid.cuboidcirclebutton.CuboidButton) findViewById(R.id.redbtn);
    com.cuboid.cuboidcirclebutton.CuboidButton greenbtn =
            (com.cuboid.cuboidcirclebutton.CuboidButton) findViewById(R.id.greenbtn);
    com.cuboid.cuboidcirclebutton.CuboidButton yellowbtn =
            (com.cuboid.cuboidcirclebutton.CuboidButton) findViewById(R.id.yellowbtn);
    com.cuboid.cuboidcirclebutton.CuboidButton orangebtn =
            (com.cuboid.cuboidcirclebutton.CuboidButton) findViewById(R.id.orangebtn);
    com.cuboid.cuboidcirclebutton.CuboidButton maroonbtn =
            (com.cuboid.cuboidcirclebutton.CuboidButton) findViewById(R.id.maroonbtn);
    com.cuboid.cuboidcirclebutton.CuboidButton purplebtn =
            (com.cuboid.cuboidcirclebutton.CuboidButton) findViewById(R.id.purplebtn);
    com.cuboid.cuboidcirclebutton.CuboidButton aquabtn =
            (com.cuboid.cuboidcirclebutton.CuboidButton) findViewById(R.id.aquabtn);
    com.cuboid.cuboidcirclebutton.CuboidButton bluebtn =
            (com.cuboid.cuboidcirclebutton.CuboidButton) findViewById(R.id.bluebtn);
    TextView color = (TextView) findViewById(R.id.color);

    // Correct answer for this screen: add one point and go to the next color.
    aquabtn.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
            color_score = color_score + 1;
            // "صحيح" is Arabic for "Correct".
            Toast.makeText(color7.this, "صحيح", Toast.LENGTH_LONG).show();
            Intent intent = new Intent(color7.this, color8.class);
            startActivity(intent);
            finish();
        }
    });
    blackbtn.setOnClickListener(new View.OnClickListener() {
        public void onClick(View v) {
4. Help the Child Tell about His Needs
Some autistic children whose autism is at an advanced level cannot talk or tell others about
their needs, so we added the needs section to help the child point to what he or she needs.
There is also an English version of the application.
• Learning Section
Main Activity
It has the same parts, such as adjectives, gestures, verbs, and feelings.
Color Test
Test
Test Result
Emotions Test