
SEMESTER TRAINING REPORT

“AUTONOMOUS DRIVING CAR USING


CONVOLUTIONAL NEURAL NETWORK”
Submitted in partial fulfillment of requirements for the award of the degree

Bachelor of Technology
(Information Technology)
To

IKG Punjab Technical University, Jalandhar

SUBMITTED BY:
Shariq Sajad (1802634)
Yamin Rafiq (1803304)

Submitted To:
Dr. Amanpreet Kaur

DEPARTMENT OF INFORMATION TECHNOLOGY


Chandigarh Engineering College, Landran
Mohali, Punjab - 140307
DECLARATION

I, YAMIN RAFIQ, hereby declare that the Project Report entitled "AUTONOMOUS
DRIVING CAR USING CONVOLUTIONAL NEURAL NETWORK" is an authentic
record of my own work, carried out as a requirement of the 8th semester during the period
from January 2022 to June 2022, for the award of the degree of B.Tech. (Information
Technology), under the guidance of Fahim Khan.

Certified that the above statement made by the student is correct to the best of our
knowledge and belief.

(Signature of student)
(Name of Student)
(University Roll No.)

Head of Department
(Signature and Seal)
ACKNOWLEDGMENT

I take this opportunity to express my sincere gratitude to the Director-Principal, Dr. Rajdeep
Singh, Chandigarh Engineering College, Landran, for providing the opportunity to carry out the
present work.
I am highly grateful to Dr. Sushil Kamboj, HOD-IT, Chandigarh Engineering College,
Landran (Mohali). I would also like to express my gratitude to the other faculty members of the
Information Technology department of CEC, Landran, for providing academic inputs, guidance
and encouragement throughout the training period. The help rendered by Dr. Amanpreet Kaur and
Mr. Mandeep Singh Devgan, Supervisor for Experimentation, is gratefully acknowledged. Finally, I
express my indebtedness to all who have directly or indirectly contributed to the successful
completion of my software training.

YAMIN RAFIQ
1803304
CONTENTS

Chapter 1 (Introduction)
1 Introduction 3
1.1 Challenges 3
1.2 Data Collection and Training 5
1.3 Contributions 6

Chapter 2 (Literature Review) 7
2.1 Software Requirements 8

Chapter 3 (Introduction & Details of Software Used)
3.1 Udacity Self-Driving Car Simulator 10
3.2 Artificial Intelligence 11
3.3 Machine Learning 12
3.4 Deep Learning 13
3.5 Convolutional Neural Network 14

Chapter 4 (System Design)
4.1 Receive Real-Time Data from Udacity Simulator 16
4.2 Importing Useful Python Libraries 19
4.3 Importing Raw Data Information 22
4.4 Data Cleaning 23
4.5 Visualization & Distribution of Data 24
4.6 Splitting Data into Training Data & Validation Data 26
4.7 Image Augmentation 27
4.8 Image Preprocessing 29
4.9 Training our Data through CNN Model 30

Chapter 5 (Testing)
5.1 Autonomous Driving Test in Simulator 34

Chapter 6 (Results)
6.1 Results 38
6.2 Value Loss or Accuracy 39
6.3 Generalization on Track (Drive Performance) 42

Chapter 7 (Screenshots) 45

Chapter 8 (Conclusion & Future Scope) 51
8.1 Conclusion 52
8.2 Future Scope 54

Chapter 9 (References) 63

List of Figures
S.NO Figure Page no.

1. Figure 1.1 10
2. Figure 1.2 11
3. Figure 1.3 12
4. Figure 1.4 13
5. Figure 1.5 14
6. Figure 2.1 16
7. Figure 2.2 17
8. Figure 2.3 17
9. Figure 2.4 18
10. Figure 2.5 18
11. Figure 2.6 21
12. Figure 2.7 22
13. Figure 2.8 23
14. Figure 2.9 23
15. Figure 2.10 24
16. Figure 2.11 24
17. Figure 2.12 25
18. Figure 2.13 26
19. Figure 2.14 26
20. Figure 2.15 27
21. Figure 2.16 28
22. Figure 2.17 29
23. Figure 2.18 29
24. Figure 2.19 30
25. Figure 2.20 30
26. Figure 2.21 31
27. Figure 2.22 31
28. Figure 2.23 32
29. Figure 3.1 34
30. Figure 3.2 35
31. Figure 3.3 35
32. Figure 3.4 36
33. Figure 3.5 37

34. Figure 4.1 39
35. Figure 4.2 40
36. Figure 4.3 40
37. Figure 4.4 41
38. Figure 4.5 41
39. Figure 4.6 42
40. Figure 4.7 43
41. Figure 4.8 44
42. Figure 5.1 46
43. Figure 5.2 46
44. Figure 5.3 47
45. Figure 5.4 47
46. Figure 5.5 48
47. Figure 5.6 48
48. Figure 5.7 49
49. Figure 5.8 49
50. Figure 5.9 50
51. Figure 5.10 50

CHAPTER 1
INTRODUCTION

1. Introduction:

Self-driving vehicles will provide more than a simple luxury: they will eliminate accidents
caused by tired, intoxicated, or distracted drivers. By freeing travelers of the need to also be
drivers, autonomous vehicles will bolster the productivity of commuters by allowing passengers
to work while they travel. As autonomous vehicles become ubiquitous, they will also reduce
travel times by improving the flow of traffic. Traffic jams caused by accidents, 'rubber-necking'
and congestion will be eliminated by the ability of autonomous vehicles to relay information
about road conditions and traffic patterns to each other. By giving us back the time we currently
spend driving and ensuring our safety, self-driving cars will offer significant value to humanity.
However, before trusting our lives to self-driving cars, we must ensure that they are able to
stand up to the complex challenges of real-world driving.

1.1 Challenges:

Humans are effective at driving because of our powerful intuition. The human brain is able to
absorb large amounts of data, filter out what is important and use that information to make
decisions. Driving has traditionally been a very challenging task for computers because of the
wide variety of possible scenarios and lack of intuition. Humans are easily able to identify
obstacles, lane markings and other vehicles; however, this is a very challenging task for
computers. The challenge of mimicking human perception can be solved by machine learning
algorithms. Although machine learning algorithms represent a promising approach to
autonomous driving, they come with their own set of challenges. One such challenge is the
amount of data that is required to train them. The sheer volume of data needed to create an
effective vehicle controller presents a sizable engineering challenge. In order to drive in a
particular scenario, an autonomous vehicle controller must have human-collected data
demonstrating the behavior necessitated by that scenario. For this reason, driving has always
been a human-dominated task. Humans are able to adapt to varying lighting, weather and traffic
conditions exceptionally well. Adapting to the wide variety of conditions that can be observed
on the road is a challenging task for computers and is one that is best addressed by an algorithm
that can learn and generalize in the same way that humans do. Autonomous vehicles are enabled
by a collection of machine learning and other algorithms. These algorithms rely heavily on data
collected by human drivers and require a layered approach to transform data into vehicle
controls. This data often consists of video streams from multiple on board sensors including
cameras, lidar, radar and infrared. This information can be used to tune or train algorithms,
essentially acting as experience to be learned from. Environmental changes such as new road
types or weather conditions require collection of hundreds of hours of data, processing and
labeling this data to train the algorithms, and re-training the system. System changes such as
new sensors require modifications to the algorithms and intermediate layers that translate high-
level algorithmic outputs into low-level controls.

Our work addresses the challenges of autonomous driving by training on realistic simulation
data and providing an end-to-end learning approach. The neural network that we present is
reflexive, which means that it accepts raw sensor data as input and directly produces vehicle
controls as output. Neural networks are computational representations of brains and are capable
of learning in a similar fashion. We used our realistic simulation data to train our neural network
to operate a car. We ensure the functionality of our approach through real-world experiments on
an RC car test system. To the best of our knowledge, this combination of simulated training, a
reflexive neural network, and real world verification has not been previously attempted and
provides a novel approach to improving autonomous vehicles. The neural network that we
implemented is called a convolutional neural network or ‘CNN’. The CNN is a modular
architecture consisting of small networks tailored for particular tasks. A smaller CNN is used
for image processing as that network style best suits that input, a more traditional multi-layer
perceptron network processes the steering data, and a deep multi-layer perceptron network
merges the results to provide a steering angle for the vehicle. Parallel to this network is an
additional multi-layer perceptron network focused solely on throttle and braking. By breaking
up the neural network in this manner we are able to achieve a high level of modularity. Adding
additional inputs to accommodate additional sensor readings would be trivial due to this design.
Beyond that, separating the steering control and throttle control networks reduces training time
and network complexity. Neural networks perform what is called ‘imitation learning’, which
means that they learn to replicate behavior that they observe. To train the CNN that we designed
we needed a large amount of data. Because of the prohibitive cost and potential danger of
collecting data in a real-world car we decided to use a simulation environment to gather training
data. The simulation that we chose to use is the Udacity Self-Driving Car Simulator. We chose this
as our simulation because of its realistic environment and the ease with which it can be
modified.

1.2 Data Collection and Training:

Before the neural network training process could begin we needed to collect the data with
which to train the neural network. We begin the data collection process by modifying the
environment to our liking and repositioning the camera so that it is fixed to the hood of the test
car. The simulated camera provides images in front of the vehicle as it drives on a variety of
road types and traffic conditions. A data collection script running in the background collects
information about the driver’s steering wheel angle and throttle value. Each image from the
simulation is labeled with the corresponding steering and throttle values. After the data has been
recorded it is used as an example to train the neural network. The image captured from the in-
game camera is used as input to the neural network which is then asked to produce the
corresponding steering and throttle values. By identifying the conditions present in each image
the network is able to learn to adapt to a variety of driving conditions and make generalizations
about the task of driving. To expand the set of data further, we augment the data through
cropping selected regions of each image, which provides examples of poor driving and the
correction needed in those scenarios. Not only does this increase the volume of data that we
have, but it imbues the neural network with the ability to correct its mistakes.

After collecting data and training our neural network, we verified the functionality on a
verification data set, which resulted in an average error rate of 1.9%. We then connected this
network to an RC
car system, which was chosen for safety during the development process. The RC car provides a
camera input and receives a steering angle and throttle value. We test the steering sensitivity,
reliability, performance, and obstacle avoidance. The car successfully navigates 98 out of 100
laps of a track specifically designed to challenge it with different road types and tight turns. It
also successfully avoids another car with a 90% success rate. These performance results could
certainly be improved by adding additional sensors to our system. Another front-facing camera
to enable stereoscopic 3D would improve obstacle avoidance. Additional side and rear facing
cameras could enable behavior such as backing up, 3-point turns and safe lane changes.
Although we lacked these inputs, we were still able to demonstrate the potential of end-to-
end neural networks as vehicle controllers and the potential for simulations to be used as data
sources.

1.3 Contributions:

Our contributions to this field of study can be summarized as follows:
• Training of an end-to-end convolutional neural network
• Training of a neural network with data taken from the Udacity Self-Driving Car Simulator
• Testing of a neural network trained exclusively with simulation data in the real world
• Instrumentation of a small-scale vehicle controlled remotely by a neural network
• Support of the notion that simulations are a viable training ground for autonomous vehicles

CHAPTER 2
SOFTWARE REQUIREMENT SPECIFICATIONS

Software:

The technologies used in the implementation of this project, and the motivation behind using
them, are described in this section.

TensorFlow: This is an open-source library for dataflow programming. It is widely used for
machine learning applications and also serves as a math library for large-scale computation. For
this project, Keras, a high-level API that uses TensorFlow as the backend, is used. Keras
facilitates building models easily as it is more user-friendly.

Different libraries are available in Python that help in machine learning projects, and several of
them have improved the performance of this project. A few of them are mentioned in this
section. First, NumPy provides a collection of high-level math functions to support
multi-dimensional matrices and arrays; it is used for faster computations over the weights
(gradients) in neural networks. Second, scikit-learn is a machine learning library for Python
which features different algorithms and machine learning function packages. Another is OpenCV
(Open Source Computer Vision Library), which is designed for computational efficiency with a
focus on real-time applications; in this project, OpenCV is used for image preprocessing and
augmentation techniques. The project makes use of the Miniconda environment, an open-source
distribution for Python that simplifies package management and deployment and is well suited to
large-scale data processing.

The machine on which this project was built is a personal computer with the following
configuration:
• Processor: Intel(R) Core i3-7200U @ 2.7 GHz
• RAM: 8 GB
• System: 64-bit OS, x64 processor

CHAPTER 3
INTRODUCTION AND DETAILS OF SOFTWARE USED

3.1 Udacity Self Driving car Simulator:

It is a driving simulator created by Udacity with the help of Unity. Unity is a cross-platform
game engine developed by Unity Technologies. This simulator is available for Windows, Linux
and macOS. For Udacity's Self-Driving Car Nanodegree project, they created a driving simulator
that can be used to train and test autonomous steering models. I trained and compared two
Convolutional Neural Network (CNN) models with largely varying architectures using forward-
facing images and steering angles from the vehicle in the simulator. The results suggest that the
choice of CNN architecture for this type of task is less important than the data and augmentation
techniques used. Side-by-side videos of the models in the simulator can be seen in the results.

FIGURE 1.1

3.2 Artificial Intelligence:

Artificial Intelligence is the science that empowers computers to mimic human intelligence such
as decision making, text processing, and visual perception. AI is a broad field (i.e., the big
umbrella that contains several subfields such as machine learning, robotics, and computer
vision). Artificial intelligence (AI), also known as machine intelligence, is a branch of computer
science that focuses on building and managing technology that can learn to autonomously make
decisions and carry out actions on behalf of a human being.

AI is not a single technology. It is an umbrella term that includes any type of software or
hardware component that supports machine learning, computer vision, natural language
understanding (NLU) and natural language processing (NLP).

FIGURE 1.2

3.3 Machine Learning:

Machine Learning is a subfield of Artificial Intelligence that enables machines to improve at a
given task with experience. It is important to note that all machine learning techniques are
classified as Artificial Intelligence; however, not all Artificial Intelligence counts as Machine
Learning, since basic rule-based engines can be classified as AI but do not learn from experience
and therefore do not belong to the machine learning category.

Machine learning algorithms are often categorized as supervised, unsupervised and reinforcement
learning:
- Supervised machine learning algorithms can apply what has been learned in the past to new
data, using labeled examples to predict future events. Starting from the analysis of a known
training dataset, the learning algorithm produces an inferred function to make predictions about
the output values. The system is able to provide targets for any new input after sufficient
training. The learning algorithm can also compare its output with the correct, intended output
and find errors in order to modify the model accordingly.
- In contrast, unsupervised machine learning algorithms are used when the information used to
train is neither classified nor labeled. Unsupervised learning studies how systems can infer a
function to describe a hidden structure from unlabeled data. The system does not figure out the
right output, but it explores the data and can draw inferences from datasets to describe hidden
structures in unlabeled data.
- Reinforcement machine learning is a learning method that interacts with its environment by
producing actions and discovering errors or rewards. Trial-and-error search and delayed reward
are the most relevant characteristics of reinforcement learning. This method allows machines and
software agents to automatically determine the ideal behavior within a specific context in order
to maximize their performance. Simple reward feedback is required for the agent to learn which
action is best; this is known as the reinforcement signal.

FIGURE 1.3

3.4. Deep Learning:


Deep Learning is a specialized field of Machine Learning that relies on training Deep
Artificial Neural Networks (ANNs) using a large dataset such as images or text. ANNs are
information-processing models inspired by the human brain. The human brain consists of
billions of neurons that communicate with each other using electrical and chemical signals and
enable humans to see, feel, and make decisions. ANNs work by mathematically mimicking the
human brain and connecting multiple "artificial" neurons in a multilayered fashion. The more
hidden layers are added to the network, the deeper the network gets. What differentiates deep
learning from other machine learning techniques is its ability to extract features automatically, as
illustrated in the following example:

Machine learning process:


(1) selecting the model to train,
(2) manually performing feature extraction.

Deep Learning Process:


(1) select architecture of the network,
(2) features are automatically extracted by feeding in the training data (such as images) along
with the target class (label).

FIGURE 1.4

3.5. Convolutional Neural Network (CNN):

Convolutional Neural Networks (CNNs) are the most successful Deep Learning method used to
process multiple arrays, e.g., 1D for signals, 2D for images and 3D for videos. A CNN consists of
a list of neural network layers that transform the input data into an output (class/prediction).
Specialized CNN chips (by NVIDIA, Intel, Samsung, etc.) have been developed for real-time
applications in smartphones, cameras, robots, self-driving cars, etc.

The convolutional layer is the core building block of a CNN, and it is where the majority of
computation occurs. It requires a few components: input data, a filter, and a feature map. Let's
assume that the input will be a color image, which is made up of a matrix of pixels in 3D. This
means that the input will have three dimensions (height, width, and depth), which correspond to
RGB in an image. We also have a feature detector, also known as a kernel or a filter, which will
move across the receptive fields of the image, checking if the feature is present. This process is
known as a convolution. The feature detector is a two-dimensional (2-D) array of weights, which
represents part of the image. While they can vary in size, the filter is typically a 3x3 matrix; this
also determines the size of the receptive field. The filter is applied to an area of the image, and a
dot product is calculated between the input pixels and the filter. This dot product is then fed into
an output array. Afterwards, the filter shifts by a stride, repeating the process until the kernel has
swept across the entire image. The final output from the series of dot products between the input
and the filter is known as a feature map, activation map, or convolved feature.
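To make the filter, stride, and feature-map description above concrete, the following minimal NumPy sketch (illustrative only, not part of the project code) computes a feature map by sliding a 3x3 filter over a single-channel image:

```python
import numpy as np

def convolve2d(image, kernel, stride=1):
    # "Valid" convolution: the kernel never leaves the image borders.
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # receptive field currently covered by the filter
            patch = image[i * stride:i * stride + kh, j * stride:j * stride + kw]
            feature_map[i, j] = np.sum(patch * kernel)   # dot product
    return feature_map

image = np.random.rand(6, 6)              # toy single-channel "image"
kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]])           # 3x3 vertical-edge filter
print(convolve2d(image, kernel).shape)    # -> (4, 4) feature map
```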

FIGURE 1.5

CHAPTER 4
System Design

4.1. Receive Real Time Data from Udacity Simulator:

Udacity has built a simulator for self-driving cars and made it open source for the enthusiasts,
so they can work on something close to a real-time environment. It is built on Unity, the video
game development platform. The simulator consists of a configurable resolution and controls
setting and is very user friendly.

Configuration screen
FIGURE 2.1

The graphics and input configurations can be changed according to user preference and machine
configuration. The user pushes the “Play!” button to enter the simulator user interface. You can
enter the Controls tab to explore the keyboard controls, quite similar to a racing game.

Control Configuration
FIGURE 2.2

Menu Screen.
FIGURE 2.3

The actual menu screen of the simulator and its components are discussed below. The simulator
involves two tracks. One of them can be considered simple and the other complex, as is evident
in the attached screenshots. The word "simple" here just means that it has fewer curvy sections
and is easier to drive on.

Track 1
FIGURE 2.4

Track 2
FIGURE 2.5

There are two modes for driving the car in the simulator:
(1) Training mode and (2) Autonomous mode.
The training mode gives you the option of recording your run and capturing the training dataset.
The small red sign at the top right of the screen depicts the car being driven in training mode.
The autonomous mode can be used to test the models, to see if they can drive on the track without
human intervention. Also, if you press the controls to get the car back on track, it will
immediately notify you that it has shifted to manual control.

The simulator’s feature to create your own dataset of images makes it easy to work on the
problem. Some reasons why this feature is useful are as follows:
- The simulator has built the driving features in such a way that it simulates three cameras on the
car. The three cameras are mounted at the center, right and left of the front of the car, and they
capture continuously while we record in the training mode.
- The stream of images is captured, and we can set the location on the disk for saving the data
after pushing the record button. The image sets are labeled in a sophisticated manner with a
prefix of center, left, or right indicating from which camera the image has been captured.
- Along with the image dataset, it also generates a driving_log.csv file. This file contains the
image paths with the corresponding steering angle, throttle, brakes, and speed of the car at that
instant.
- Columns 1, 2, 3: contain paths to the dataset images of the center, right and left cameras
respectively.
- Column 4: contains the steering angle. A value of 0 depicts straight, a positive value a right
turn, and a negative value a left turn.
- Column 5: contains the throttle or acceleration at that instant.
- Column 6: contains the brakes or deceleration at that instant.
- Column 7: contains the speed of the vehicle.
4.2. Import Useful Python Libraries:

- Pandas : Pandas is a library for the Python programming language. It is used to create, remove,
edit and manipulate data in DataFrames. https://pandas.pydata.org/ `pip install pandas` & `conda
install pandas`

- Numpy : NumPy is a library for the Python programming language, adding support for large,
multi-dimensional arrays and matrices, along with a large collection of high-level mathematical
functions to operate on these arrays. https://numpy.org/ `pip install numpy` & `conda install
numpy`.

- Os : Python OS module provides the facility to establish the interaction between the user and
the operating system. It offers many useful OS functions that are used to perform OS-based tasks
and get related information about operating systems. https://docs.python.org/3/library/os.html
Python 3 built-in library.

- Matplotlib : Matplotlib is a comprehensive library for creating static, animated, and
interactive visualizations in Python. Matplotlib makes easy things easy and hard things possible.
https://matplotlib.org/ `pip install matplotlib` & `conda install matplotlib`

- Scikit-learn(sklearn) : Scikit-learn is a free software machine learning library for the Python
programming language. It features various classification, regression and clustering algorithms
including support vector machines, random forests, gradient boosting, k-means and DBSCAN,
and is designed to interoperate with the Python numerical and scientific libraries NumPy and
SciPy. Scikit-learn is a NumFOCUS fiscally sponsored project. https://scikit-learn.org/stable/
`pip install scikit-learn`, `pip install sklearn` & `conda install scikit-learn`

- Imgaug : imgaug is a library for image augmentation in machine learning experiments. It
supports a wide range of augmentation techniques, allows these to be easily combined and
executed in random order or on multiple CPU cores, has a simple yet powerful stochastic
interface, and can augment not only images but also keypoints/landmarks, bounding boxes,
heatmaps and segmentation maps. https://imgaug.readthedocs.io/en/latest/ `pip install imgaug` &
`conda install imgaug`

- OpenCV(cv2) : OpenCV is a library of programming functions mainly aimed at real-time
computer vision. Originally developed by Intel, it was later supported by Willow Garage then
Itseez (which was later acquired by Intel). The library is cross-platform and free for use under
the open-source Apache 2 License. Starting in 2011, OpenCV features GPU acceleration for
real-time operations. https://opencv.org/ `pip install opencv-python` & `conda install opencv`

- Tensorflow : TensorFlow is an end-to-end open source platform for machine learning. It has a
comprehensive, flexible ecosystem of tools, libraries and community resources that lets
researchers push the state-of-the-art in ML and developers easily build and deploy ML powered
applications. https://www.tensorflow.org/ `pip install tensorflow` & `pip install tensorflow-gpu`

- Flask : Flask is a micro web framework written in Python. It is classified as a microframework
because it does not require particular tools or libraries. It has no database abstraction layer, form
validation, or any other components where pre-existing third-party libraries provide common
functions. However, Flask supports extensions that can add application features as if they were
implemented in Flask itself. Extensions exist for object-relational mappers, form validation,
upload handling, various open authentication technologies and several common framework
related tools. https://flask.palletsprojects.com/en/2.0.x/ `pip install flask` & `conda install flask`

- Eventlet : Eventlet is a concurrent networking library for Python. https://eventlet.net/ `pip
install eventlet`

- Socketio : Bidirectional and low-latency communication for every platform.
https://github.com/miguelgrinberg/python-socketio `pip install python-socketio`
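Figure 2.6 shows the import cell of the training script. A representative import block covering the libraries listed above might look like the following sketch (the exact aliases and submodules are assumptions, not copied from the screenshot):

```python
import os
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import cv2
from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle
from imgaug import augmenters as iaa
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Flatten, Dense
from tensorflow.keras.optimizers import Adam
```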

FIGURE 2.6

4.3. Importing Raw Data Information:


After receiving the raw data from the simulator, we find a folder where it stores all the images
and a CSV (comma-separated values) file named driving_log.csv. We then import that CSV file
in our code using the pandas and os libraries and define the column names.
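A minimal sketch of this step is shown below; the data folder name myData is an assumption, and the column names follow the driving_log.csv layout described in Section 4.1:

```python
import os
import pandas as pd

# driving_log.csv has no header row, so we supply the column names ourselves
columns = ['Center', 'Left', 'Right', 'Steering', 'Throttle', 'Brake', 'Speed']
data_dir = 'myData'                                    # assumed folder name
data = pd.read_csv(os.path.join(data_dir, 'driving_log.csv'), names=columns)
print('Total images imported:', data.shape[0])
```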


FIGURE 2.7

4.4. Data Cleaning:

(i) Trim Path:


We trim out the part of the path which is not necessary with the help of the split function applied to the Center column.
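A minimal sketch of this trimming step (the helper name getName is hypothetical):

```python
import ntpath

def getName(file_path):
    # keep only the file name, dropping the machine-specific directory prefix
    return ntpath.split(file_path)[-1]

data['Center'] = data['Center'].apply(getName)
print(data['Center'].head())
```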

FIGURE 2.8

(ii) Take Certain and Important Columns:
In our whole project we focus only on the Center camera image and the steering angle.
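A sketch of this selection, using the column names defined earlier:

```python
# keep only the centre-camera image path (input) and the steering angle (label)
data = data[['Center', 'Steering']]
print(data.head())
```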

FIGURE 2.9

4.5. Visualization and Distribution of Data:

In this process we focus on the distribution of the steering values in our data. We therefore divide
the data into a set of bins (31 bins) and find the middle (center) bin.

FIGURE 2.10

Now we narrow these values down and find the one closest to zero, or exactly zero.

FIGURE 2.11

Now we visualize our steering data.

FIGURE 2.12

We cut off the extra center values using a cutoff value, as mentioned in our earlier code, and
obtain a more balanced steering graph. A large amount of data remains at the center with respect
to the other steering angles, because most of the time the car will move in the forward direction.
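A minimal sketch of this balancing step, assuming 31 bins and a hypothetical cap of 1000 samples per bin (the actual cutoff value used in the project is the one shown in the figures):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.utils import shuffle

n_bins = 31          # odd, so one bin is centred on a steering angle of zero
cutoff = 1000        # assumed maximum number of samples kept per bin

hist, bins = np.histogram(data['Steering'], n_bins)
remove_index = []
for j in range(n_bins):
    # positions of all samples falling into bin j
    bin_members = [i for i, s in enumerate(data['Steering'])
                   if bins[j] <= s <= bins[j + 1]]
    bin_members = shuffle(bin_members)
    remove_index.extend(bin_members[cutoff:])   # everything above the cap

data.drop(data.index[list(set(remove_index))], inplace=True)
plt.hist(data['Steering'], bins=n_bins)         # visualise the balanced data
plt.show()
```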

FIGURE 2.13

4.6. Splitting data into Training data and Validation Data:

We have to split our data into training data and validation data for the learning process of
creating the model. Here we split the original data into 80% training data and 20% validation
data.
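A sketch of this 80/20 split using scikit-learn's train_test_split (the random_state value is an arbitrary choice):

```python
from sklearn.model_selection import train_test_split

# image paths are the inputs, steering angles are the labels
image_paths = data['Center'].values
steerings = data['Steering'].values.astype(float)

x_train, x_val, y_train, y_val = train_test_split(
    image_paths, steerings, test_size=0.2, random_state=5)
print('Training samples:', len(x_train), 'Validation samples:', len(x_val))
```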

FIGURE 2.14

4.7. Image Augmentation:

(i) PAN:
Shifting the image toward left or right and up or down.

(ii) ZOOM:
Zooming in or out of the image.

(iii) BRIGHTNESS:
Randomly varying the brightness of the image (brightness is the perception elicited by the
luminance of a visual target).

(iv) FLIP:
A flipped (mirror-reversed) image is generated by reflecting the original across a vertical axis.
When flipping an image we also have to reverse the sign of the steering angle.

FIGURE 2.15

Here we apply each image augmentation operation randomly, with a probability of 50%, as shown
below:

FIGURE 2.16
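A minimal sketch of these augmentations using imgaug and OpenCV, each applied with 50% probability; the parameter ranges are illustrative assumptions, not values copied from the project code:

```python
import random
import cv2
from imgaug import augmenters as iaa

def augment_image(img_path, steering):
    img = cv2.imread(img_path)                        # load image from disk
    if random.random() < 0.5:                         # PAN: shift in x and y
        img = iaa.Affine(translate_percent={'x': (-0.1, 0.1),
                                            'y': (-0.1, 0.1)}).augment_image(img)
    if random.random() < 0.5:                         # ZOOM
        img = iaa.Affine(scale=(1.0, 1.2)).augment_image(img)
    if random.random() < 0.5:                         # BRIGHTNESS
        img = iaa.Multiply((0.4, 1.2)).augment_image(img)
    if random.random() < 0.5:                         # FLIP: mirror image and steering
        img = cv2.flip(img, 1)
        steering = -steering
    return img, steering
```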

4.8. Image Preprocessing:

Before training we need to apply some preprocessing to the images, such as cropping, converting
from RGB to YUV (as proposed by NVIDIA), blurring, resizing and normalization.
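A minimal sketch of such a preprocessing function; the crop rows and the 200x66 target size are typical values for this NVIDIA-style pipeline, offered as assumptions rather than the project's exact numbers:

```python
import cv2

def preprocess(img):
    img = img[60:135, :, :]                      # crop away sky and car bonnet
    img = cv2.cvtColor(img, cv2.COLOR_RGB2YUV)   # YUV colour space, per NVIDIA
    img = cv2.GaussianBlur(img, (3, 3), 0)       # light blur
    img = cv2.resize(img, (200, 66))             # network input size (width, height)
    return img / 255.0                           # normalise pixel values to [0, 1]
```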

FIGURE 2.17

FIGURE 2.18

FIGURE 2.19

4.9. Training our data through CNN model:

We discussed earlier what a CNN is and how it works. Here we use it to learn to predict the
steering angle from the corresponding front-camera image. First, we group the data into batches
using the BatchGen() function. Then we create the CNN model by adding the layers proposed by
NVIDIA, with the help of the TensorFlow module created by Google.
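A sketch of this layer stack in Keras is shown below. The ELU activation is an assumption based on common implementations of the NVIDIA pipeline; with a 66x200x3 input this stack yields the 252,219 parameters reported further down:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Flatten, Dense

def create_model():
    model = Sequential([
        Conv2D(24, (5, 5), strides=(2, 2), activation='elu',
               input_shape=(66, 200, 3)),
        Conv2D(36, (5, 5), strides=(2, 2), activation='elu'),
        Conv2D(48, (5, 5), strides=(2, 2), activation='elu'),
        Conv2D(64, (3, 3), activation='elu'),
        Conv2D(64, (3, 3), activation='elu'),
        Flatten(),
        Dense(100, activation='elu'),
        Dense(50, activation='elu'),
        Dense(10, activation='elu'),
        Dense(1)                      # single output: the steering angle
    ])
    return model

model = create_model()
model.summary()                       # prints layer shapes and parameter count
```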

FIGURE 2.20

FIGURE 2.21

FIGURE 2.22

Here we see that a total of 10 epochs (300 steps per epoch) are run with a learning rate of 0.0001,
taking the loss as MSE (mean squared error). After completing 10 epochs we get only 0.43%
training loss and 0.25% validation loss, which is great. Training takes approximately 23 minutes,
depending on the PC/laptop configuration. This model has a total of 252,219 parameters to learn.
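A sketch of the corresponding compile-and-fit call; BatchGen() is the batch generator mentioned above, and the batch size and validation_steps values are assumptions, while the optimizer, learning rate, loss, epochs and steps per epoch follow the text:

```python
from tensorflow.keras.optimizers import Adam

model.compile(optimizer=Adam(learning_rate=1e-4), loss='mse')   # MSE, lr = 0.0001

history = model.fit(
    BatchGen(x_train, y_train, batch_size=100, train_flag=True),   # assumed signature
    steps_per_epoch=300,
    epochs=10,
    validation_data=BatchGen(x_val, y_val, batch_size=100, train_flag=False),
    validation_steps=200)                                          # assumed value

model.save('model.h5')    # the saved model is used later by the test/drive script
```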

FIGURE 2.23

CHAPTER 5
TESTING

5.1 Autonomous Driving Test in Simulator:

After running trainingSim.py we get model.h5. We then load that model and perform several
operations such as image preprocessing, because during autonomous mode the simulator sends
real-time images captured from the front camera of the car; after preprocessing each image, the
model predicts the steering angle. To connect to the Udacity simulator we have to use a specific
port, 4567.
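A minimal sketch of such a test script using Flask, python-socketio and eventlet on port 4567. The telemetry/steer event names follow the Udacity simulator's protocol; MAX_SPEED and the throttle rule are illustrative assumptions, preprocess() is the function from Section 4.8, and Pillow is assumed for decoding the incoming image:

```python
import base64
from io import BytesIO

import eventlet
import numpy as np
import socketio
from flask import Flask
from PIL import Image
from tensorflow.keras.models import load_model

sio = socketio.Server()
app = Flask(__name__)
model = load_model('model.h5')
MAX_SPEED = 10                                        # assumed speed limit

@sio.on('telemetry')
def telemetry(sid, data):
    speed = float(data['speed'])
    image = Image.open(BytesIO(base64.b64decode(data['image'])))
    image = preprocess(np.asarray(image))             # same preprocessing as training
    steering = float(model.predict(np.array([image]))[0][0])
    throttle = 1.0 - speed / MAX_SPEED                # simple speed limiter
    sio.emit('steer', data={'steering_angle': str(steering),
                            'throttle': str(throttle)})

if __name__ == '__main__':
    app = socketio.WSGIApp(sio, app)                  # wrap the Flask app
    eventlet.wsgi.server(eventlet.listen(('', 4567)), app)   # simulator port 4567
```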

FIGURE 3.1

FIGURE 3.2

FIGURE 3.3

FIGURE 3.4

FIGURE 3.5

CHAPTER 6
RESULTS

6.1 RESULTS:

The following results were observed for each of the previously described
architectures. For a comparison between them, I had to come up with two different
performance metrics.
1. Value loss or Accuracy (computed during training phase)
2. Generalization on Track_2 (drive performance)

6.2 Value loss or Accuracy:

The first evaluation parameter considered here is the "Loss" over each epoch of the training run.
To calculate the validation loss over each epoch, Keras provides "val_loss", which is the average
loss after that epoch. The loss observed during the initial epochs at the beginning of the training
phase is high, but it falls gradually, as is evident from the screenshots below.

FIGURE 4.1

FIGURE 4.2

FIGURE 4.3

FIGURE 4.4

FIGURE 4.5

The loss over epochs must be plotted for comparison. I came up with a graph depicting the loss
for each of the three architectures. The graph, plotted between 0 and 0.1 (loss values), shows a
clearer comparison between the different architecture results.
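A sketch of how such a loss-versus-epoch plot can be produced from the Keras training history, assuming history is the value returned by model.fit():

```python
import matplotlib.pyplot as plt

plt.plot(history.history['loss'], label='training loss')
plt.plot(history.history['val_loss'], label='validation loss')
plt.ylim(0, 0.1)              # zoom in on the 0-0.1 range for a clearer comparison
plt.xlabel('epoch')
plt.ylabel('loss (MSE)')
plt.legend()
plt.show()
```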

FIGURE 4.6

Thus, from the graph of Loss over epochs, it can be concluded that Architecture_2 performed the
best at the end of 50 epochs with the least loss of 0.0077. This gave the best results for Track_1
training and testing.

6.3 Generalization on Track_2 (Drive Performance) :

The second metric that was used to evaluate the results is generalization. This can be defined as
how well the models predict the values needed to drive over a different track they were not
trained on. Here, the values are the predicted steering angle, brakes, and throttle. It is not
something that can be plotted, but the evaluation can be done in terms of how far the car drives
on the second track without toppling over. Several factors affect this: the speed, the turns, and
track conditions like elevations, shadows, and so on. As discussed, the models are only trained on
Track_1 data, which is simpler, but tested on Track_2. Thus, dealing with Track_2 was
significantly challenging. A few observations noted while experimenting support this claim.

• Though the models performed well on Track_1 and gave the best accuracy during training (loss
over epochs), most architectures were not able to take even the first turn on Track_2.
• The reason could be overfitting to the Track_1 dataset. Overfitting refers to a situation in which
the program models the training data too well; the model learns the details and even the noise of
the data it has passed through, which in turn negatively impacts its generalizing ability.
• Another reason could be that Track_2 starts with a road running parallel to the one the car starts
on, with a barrier in between. Figure 4.7 shows a screenshot of this scenario. This is one of the
scenes that the model never encounters while training on Track_1.

FIGURE 4.7
• To work around this problem, the image preprocessing and augmentation techniques are used.
• Though the accuracy was highest for Architecture_2 (i.e. val_loss was the least), it could not
perform that well on Track_2. In fact, Architecture_3 gave better results while driving on
Track_2.

FIGURE 4.8

• Thus, a generalization rating graph has been plotted (Figure 4.8) according to the relative
distance the car travelled autonomously on Track_2. A rating of 1 indicates inferior performance,
5 mediocre, and 10 the best. Note that these ratings are purely qualitative and not precise
measurements. There were instances where the car could not even drive a small distance, tried
turning onto the other parallel track right away, and got stuck in the barrier right at the beginning.
This scenario is treated as a rating of 1.
• Increasing the number of convolution and max-pooling layers may give better results. Thus,
some architectures performed better than others, but larger networks take more time to compute
(train). Having said that, it is not true that increasing the number of convolution layers will
improve the results every time; when the number was increased further, very similar results were
obtained, sometimes even with worse performance.

CHAPTER 7
SCREENSHOTS

FIGURE 5.1

FIGURE 5.2

FIGURE 5.3

FIGURE 5.4

FIGURE 5.5

FIGURE 5.6

FIGURE 5.7

FIGURE 5.8

FIGURE 5.9

FIGURE 5.10

CHAPTER 8
CONCLUSION
AND
FUTURE SCOPE

8.1 CONCLUSION
This project started with training the models and tweaking parameters to get the best performance
on the tracks, and then trying to generalize the same performance on different tracks. The models
that performed best on one track did poorly on Track_2, hence there was a need to use image
augmentation and processing to achieve real-time generalization. The use of a CNN for the spatial
features and an RNN for the temporal features in the image dataset makes this combination a
great fit for building fast neural networks that require less computation. Substituting recurrent
layers for pooling layers might reduce the loss of information and would be worth exploring in
future projects. It would be interesting to use combinations of real-world datasets and simulator
data to train these models; then we could see how well a model trained in the simulator
generalizes to the real world, or vice versa. There are many experimental implementations being
carried out in the field of self-driving cars, and this project contributes towards a significant part
of that effort.

The Udacity car simulator is specially made for testing autonomous driving cars and is built
using Unity. Unity is software for game development. There are a lot of car simulators on the
market which are specially made for this purpose, such as CARLA, SVL (created by LG),
Donkey Car Simulator, NVIDIA DRIVE Sim, etc. We can also create our own car simulator
using game development software like Unity, Unreal Engine 4, CryEngine, etc., but the Udacity
car simulator is very simple and easy to use; it is basically made for beginners. Why are these
simulators so important? Because by using them we can easily make prototypes; otherwise,
companies or organizations have to build prototypes with real cars, which is dangerous. So
companies like NVIDIA, Tesla, Google, Mercedes-Benz, Tata, Mahindra, Waymo and Zoox first
test their models in simulators and then implement them in real cars. Some companies, such as
Tesla, Nissan and Waymo, have launched their cars in beta versions for public use, but they are
still not perfect. In other words, we can say that the technology has come a long way, but there is
still a lot of work to be done.

In this project we collect data from the Udacity simulator, which is stored in CSV format along
with the front, left and right camera images. We focus on the front camera images as the
independent variable and the steering angle as the dependent variable, meaning that we train our
CNN model on those front-camera images with respect to the steering angles. During testing, the
model takes images from the simulator car and predicts the steering angle. Before training, image
augmentation and image preprocessing are necessary; they support further analysis and make it
easier to predict the steering angle on other tracks or roads. While training our CNN model, we
grouped the data into batches and created CNN layers with different filters. We trained for 10
epochs (300 steps per epoch) at a learning rate of 0.0001. After successfully completing this, our
model has only 0.43% training loss and 0.25% validation loss, which is excellent. At the end of
this project, with my team members and guidance, we learned a lot of new things. Self-driving
and autonomous cars are truly a hot topic in this era.

8.2 Future Scope:

While in this project we only consider cameras, there are several other sensors which help detect
different obstacles, such as LiDAR, radar, GPS, motion sensors and infrared sensors, and which
make an autonomous car far more advanced. Models that use these kinds of sensors collect a
massive amount of data. Such sensors are used by large-scale companies such as Tesla, Waymo,
NVIDIA, Google and many more. These sensors and HD cameras also collect data on traffic
signals, road signs and pedestrians, as shown in some of the videos in the Bibliography section.
As we said before, the technology has come a long way, but there is still a lot of work to be done.
The motive of autonomous driving is to reduce road accidents and to let humans travel
comfortably from one place to another.
There are many improvements which could be made to this project to create a functional
steering robot. First, the data should be collected in a smoother fashion. The Turtlebot velocity
command input is very crude and does not replicate the smoothness of a steering wheel in a car,
but improvements could be made to the programs used to translate Xbox controller commands
into the steering commands received by the Turtlebot. Second, the track could be altered to
accommodate the lack of sensitivity of the steering. To achieve this, the track would need to be
much larger and the turns much wider. It would still require a very steady hand to operate, and
the data would need to be checked before being assumed to be smooth enough. Third, a
smoothing algorithm could be applied to the data, averaging the values over a given number of
time frames to create a more accurate representation of the necessary steering commands.
Fourth, full-scale tests could be performed on cars. If one can gain access to the steering angle of
the vehicle, it would be significantly easier and more substantial to use this data instead of that of
a Turtlebot, which bears little resemblance to real-life scenarios.

CHAPTER 9
REFERENCES

REFERENCES

• https://interfacelearnings.com
