

Autonomous car driving - one possible implementation using machine learning algorithm
Igor Ciganović, Aleksandar Pluškoski, and Miloš D Jovanović

Abstract— Different approaches to developing AI systems for self-driving vehicles exist, and almost all of them are very complex, with very high hardware requirements. The solution presented in this paper proposes a machine learning based system that is as simple as possible, with very low hardware requirements. A simple three-layer, fully connected neural network was trained to map input images from a front-facing QVGA camera to steering commands. Based on an input image, the neural network chooses one of the four available commands (forward, left, right or stop). With a minimum of training data (250 images), the system learned to follow the road ahead and stay in its lane. The system automatically learns the necessary road features with only the steering angle as the input from the human driver. It was never explicitly trained to detect lines on the road. Compared to much more complex solutions, such as explicit decomposition of the problem into lane detection and control, or convolutional neural networks like the end-to-end learning proposed by NVIDIA, this system proved to be surprisingly robust and efficient. We try to show that this approach leads to better performance and lower hardware requirements, thus making the development of self-driving vehicles simpler and more cost-effective. A simple artificial neural network, like the one presented in this paper, is enough for a relatively complex process like lane keeping.

Index Terms— Artificial intelligence; AI; neural network; self-driving; autonomous vehicle; object detection; computer vision; Haar cascade; robotics; robot

Igor Ciganović is with the School of Computing, Union University, Knez Mihailova 6/VI, 11000 Belgrade, Serbia (e-mail: igor.ciganovic@gmail.com).
Aleksandar Pluškoski is with the School of Computing, Union University, Knez Mihailova 6/VI, 11000 Belgrade, Serbia (e-mail: aleksandar.plu@gmail.com).
Miloš D Jovanović is with the Mihailo Pupin Institute, University of Belgrade, Volgina 15, 11000 Belgrade, Serbia (e-mail: milos.jovanovic@pupin.rs).

I. INTRODUCTION

Each year, in the US alone, around 37,000 people are killed in car accidents. That is a 5.6% increase from 2015. Human errors cause up to 90% of car accidents. [1] Autonomous vehicles may help reduce this huge number of fatalities. One of the first, most popular and most useful technologies is line detection and lane keeping. Its development started in the 1980s, and to this day it is still being improved. The desire to increase the safety of vehicles on the road has led to the development of different systems that are implemented into vehicles. Each new system requires an extra mathematical representation of the data in order to make decisions correctly. Adding new systems into the vehicles exponentially complicates the mathematical model, so the solution is some kind of system that, according to human standards, is intelligent, i.e. simulates driving as if it were a real driver. That is how the idea of using artificial intelligence was developed. As artificial intelligence progresses, so do its sub-branches, and one of the most significant is autonomous vehicles, which is inseparable from computer vision. The motivation behind this technology is to design a system which can do the steering, braking and accelerating all by itself. In this task, computer vision can help the system detect and identify objects, while other algorithms do the decision making. [2, 3]

II. RELATED WORK

There are plenty of proposed solutions, among them the mathematical approach, neural networks, reinforcement learning, convolutional neural networks and Q-learning. The two most significant are the mathematical approach and CNNs. Both methods have their pros and cons. This paper presents a simple, fast and effective approach to the problem.

The mathematical approach requires explicit decomposition of the problem. This technique can use monocular or stereo vision. Most of these techniques focus on detection of lane markers on the highway, which are relatively easy to detect. Generally, it requires the baseline to be horizontal, i.e. the horizon in the image to be parallel to the x axis. Lines need to be thick enough and in the shape of a rectangle (or approximately so). After the line boundaries are detected, the position of the vehicle can be calculated using pre-calculated calibration data for the camera. [3, 4, 5] With the vehicle position known, the required steering angle can be calculated through complex matrix and trigonometry calculations. Other similar techniques require a flat road and a known focal length, optical center, pitch angle, yaw angle and height above ground before performing the mathematical transformations. [6]

Another approach is to use convolutional neural networks (CNNs). For this approach, the important thing is to collect valid training data. This is done by saving the commands given by a human driver together with images from onboard cameras. CNNs have revolutionized pattern recognition. [7] They are capable of learning features automatically from training data. Due to their high level of complexity, CNNs require high-end hardware to run. Most solutions of this type use multiple graphics processing units (GPUs) or dedicated hardware like NVIDIA DRIVE PX, both of which tremendously accelerate training as well as the execution of a trained network. All these solutions come with common flaws: big, unwieldy, complex, expensive and not very power-efficient hardware. [8] One novel solution using CNNs is the so-called end-to-end learning developed by NVIDIA. With a relatively small amount of training data from a human driver, the system learned how to drive a car [7], similarly to the solution presented in this paper. One big flaw of this approach is the use of the NVIDIA DevBox and DRIVE PX, with a combined cost in the tens of thousands of dollars.

The solution proposed in this paper aims to provide a more efficient implementation of an autonomous driving AI algorithm. Using this approach, it is possible to build a self-driving vehicle without explicit high-level mathematical modeling and analysis of the problem. It also does not require power-hungry, powerful and expensive hardware for execution, and thanks to its simplicity it can be trained using a very small amount of training data.

III. SYSTEM OVERVIEW

The robot used in this experiment was custom made, controlled by Raspberry Pi and Arduino computers. The artificial intelligence that controlled the robot was executed on a separate PC. Communication between the two was done over a TCP/IP network.

The robot was driven by two powered wheels and a single passive front wheel. It had five IR proximity sensors, three facing forward and two on the sides, and a front-facing camera. The IR sensors were used for obstacle detection, while the camera was used for sign and traffic light detection. The robot is shown in Fig. 1.

Fig. 1. Overview of the robot.

The role of the Arduino and Raspberry Pi computers was to control the electronic devices and the USB camera. They were in communication with the higher module. The overview of the system is shown in Fig. 2. The higher module was executed on the PC and was responsible for calculating the AI algorithm.

Fig. 2. Connections of the electrical components of the system.

The software architecture is designed to be modular. Low-level hardware control algorithms run on the device itself and are responsible for communication with the motors, sensors, encoders and camera on the vehicle. This consists of two modules, Arduino and Raspberry Pi, which are connected to the higher-level control algorithm through the TCP/IP network. Higher-level control is done using AI algorithms through machine learning. It is responsible for vehicle control during operation. This module is designed as a multiple state machine: depending on the state, a different machine learning algorithm is calculated. It is responsible for the behavior of the vehicle on the road while the autopilot is in use. The highest level of control is the human user. A user interacts with the system through the web interface, which enables them to manually control the vehicle on the road or to monitor the AI behavior. The system supports three modes of operation for each vehicle: manual, semi-autonomous (collision detection system) and fully autonomous (autopilot).

All modules communicate with each other using well-defined protocols. This enables easy upgrading and scaling when needed. The system is designed to handle multiple vehicles, multiple users and multiple AI agents. This is achieved by designing the system to be causal and memoryless, allowing vehicles to be controlled by multiple instances of the AI agent concurrently, as well as enabling the use of a load balancer between the instances of the AI agent for more practical scalability. The software architecture is shown in Fig. 3.

Fig. 3. Software architecture diagram.
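The paper does not specify the wire format of these protocols. As a purely illustrative sketch, a higher-level module could issue drive commands to the vehicle over TCP/IP as follows; the address, port and single-character command encoding are assumptions, not the authors' actual protocol.

```python
# Hypothetical sketch: the higher-level module sends one-character steering
# commands (F/L/R/S) to the vehicle over TCP/IP. Address, port and encoding
# are illustrative assumptions.
import socket

VEHICLE_ADDR = ("192.168.0.10", 8000)  # assumed Raspberry Pi address and port

def send_command(command: str) -> None:
    """Send one steering command ('F', 'L', 'R' or 'S') to the vehicle."""
    with socket.create_connection(VEHICLE_ADDR, timeout=1.0) as conn:
        conn.sendall(command.encode("ascii"))

if __name__ == "__main__":
    send_command("F")  # drive forward
```

Because the system is designed to be causal and memoryless, any AI agent instance can open such a connection and issue commands without shared state.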
IV. DATA COLLECTION

Training data was collected by manually driving the vehicle on a model road. Besides the expected road configuration, where the road (asphalt) is black with white lines, the vehicle was also trained and tested on a brownish road with white lines. Training conditions varied from natural light and sunlight to artificial light and spotlight, with and without shadows and reflections on the road surface. This produced better and more realistic training and test data.

Training data was collected in manual mode using images from the front-facing camera only. Collected data is stored as a pair of values. The first parameter is the command that was sent to the vehicle (direction of movement) and the second is the video frame at the moment the exact command was given. Other frames from the video feed, while the vehicle was not receiving a new command, were not recorded. This enables more efficient collection of training data with no duplicate frames and a reduced size of the data set. This also speeds up the process of training the neural network.
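As an illustration of this recording scheme, the sketch below stores a frame only at the moment the driver issues a new command and pairs it with a one-hot label (the format later shown in Table I). The camera index, file name and `get_user_command` callback are hypothetical.

```python
# Sketch of the data collection loop: a frame is stored only when the driver
# sends a new command, paired with that command's one-hot label.
import cv2
import numpy as np

ONE_HOT = {"left":    [1, 0, 0, 0],
           "right":   [0, 1, 0, 0],
           "forward": [0, 0, 1, 0],
           "stop":    [0, 0, 0, 1]}

def collect(get_user_command, num_samples=250):
    cap = cv2.VideoCapture(0)          # front-facing USB camera (assumed index)
    frames, labels = [], []
    while len(frames) < num_samples:
        command = get_user_command()   # blocks until the driver issues a command
        ok, frame = cap.read()         # grab the frame at that exact moment
        if ok and command in ONE_HOT:
            frames.append(frame)
            labels.append(ONE_HOT[command])
    cap.release()
    np.savez("training_data.npz",
             frames=np.array(frames), labels=np.array(labels, dtype=np.float32))
```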

V. ARTIFICIAL INTELLIGENCE SYSTEM

Artificial intelligence represents the high-level control module. The AI is responsible for issuing commands to the lower-level module (the vehicle). It is designed as a multiple state machine. Depending on the current state, a different machine learning algorithm is calculated. This makes future upgrades and changes of particular algorithms easier to implement. One of the states calculates an artificial neural network (ANN) which is responsible for steering the vehicle. The network is three layers deep: the input layer (image from the camera), one hidden layer and the output layer (steering command). It is a simple fully connected network. The output from the network is one of four possible decisions (three directions: forward, forward-left and forward-right, plus a stop command). The input for the ANN is the current frame from the front-facing camera. The AI agent also takes into account other sensors for better awareness of the environment and therefore better decision making. The advantage of using a simple ANN like this one is that, once trained, it requires only a small amount of data (network parameters) to be loaded for execution. This means that it is very efficient to run: it runs fast and with a small memory footprint, which enables it to run on very simple hardware.

A. Multiple state machine

The state machine, or finite automaton, has a set of states, and it moves from state to state in response to external input. There are two distinct classes of state machines, differing in whether the control is deterministic or nondeterministic. In this implementation it was decided to use a deterministic state machine. [9] The system obviously has a finite number of states, all the state transitions are strictly defined and caused by an input, and there are no invalid inputs.

The implementation described in this paper has three different modes of operation, as shown in Fig. 4. The vehicle starts in the "Manual control" mode. From there it can be switched into the other modes based on user input. The "Semi-autonomous" and "Autonomous" modes have two separate implementations of multiple state machines, shown in Fig. 5. and Fig. 6. respectively.

Fig. 4. Autonomous vehicle operation modes.

In "Semi-autonomous" mode (Fig. 5.) the state machine is very simple. There are only two possible states, thus |Q| equals 2, and there are two inputs, implying two directions of movement. Depending on the front IR sensor, the vehicle will stop or continue operation (based on user input) if the obstacle is detected or removed, respectively.

Fig. 5. Semi-autonomous mode finite state machine.

The most interesting is the state machine in "Autonomous" mode, i.e. the autopilot, which is shown in Fig. 6. It has four states. Again, "Free drive" is the start and finish state. In this state the ANN has control of the vehicle. Object recognition is calculated on the input frames and the sensors are constantly checked for obstacles. From this state it is possible to transition to any other state, depending on the detected object or the data from the IR sensors.

If an obstacle is detected, the state is changed to "Obstacle". In this state, depending on the sensor that is detecting the obstacle, a different predefined routine is executed. Only the IR sensors are responsible for obstacle detection. If the obstacle is detected by the front IR sensor and is closer than 20 cm, the vehicle stops and waits for the obstacle to be removed. If the obstacle is detected by the front-left or front-right IR sensor, the vehicle moves around the object, but only if the opposite side of the vehicle has no obstacle.

If the object recognition algorithm detects a stop sign or a red light and the distance is less than 20 cm, the state machine transitions into the "Stop sign / red light" state. The image from the front-facing camera is used to detect objects. Here, depending on whether the sign or the traffic light was detected, one of two predefined routines is executed. In both cases the vehicle is stopped. If the sign is detected, the vehicle stops and then checks for incoming traffic at the intersection. After that, the vehicle transitions back into the "Free drive" state. If the traffic light is detected, the vehicle waits for the light to turn green, after which it again transitions into the "Free drive" state.

However, if a direction sign is detected, the state machine transitions into the "Direction sign" state. In this state, the vehicle first must check if there is incoming traffic on the intersecting road. If the crossroad is clear, the vehicle executes the command, i.e. turns in the desired direction. After this procedure, the state machine transitions back into the "Free drive" mode and control is given back to the ANN.

Fig. 6. Autonomous mode finite state machine.
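The automaton just described can be summarized in a short sketch. The transition rules follow the text (20 cm thresholds, predefined routines that hand control back to the ANN); the input dictionary and its helper flags are illustrative assumptions.

```python
# Sketch of the deterministic "Autonomous" mode state machine (Fig. 6).
FREE_DRIVE, OBSTACLE, STOP_RED_LIGHT, DIRECTION_SIGN = range(4)

def next_state(state, inputs):
    """One deterministic transition; 'inputs' abstracts sensors and detections."""
    if state == FREE_DRIVE:
        if inputs["ir_distance_cm"] is not None and inputs["ir_distance_cm"] < 20:
            return OBSTACLE
        if inputs["detected"] in ("stop_sign", "red_light") and inputs["object_distance_cm"] < 20:
            return STOP_RED_LIGHT
        if inputs["detected"] == "direction_sign" and inputs["object_distance_cm"] < 20:
            return DIRECTION_SIGN
        return FREE_DRIVE          # the ANN keeps steering the vehicle
    # Every predefined routine ends by returning control to the ANN.
    if state == OBSTACLE:
        return FREE_DRIVE if inputs["obstacle_cleared"] else OBSTACLE
    if state == STOP_RED_LIGHT:
        return FREE_DRIVE if inputs["intersection_clear"] else STOP_RED_LIGHT
    if state == DIRECTION_SIGN:
        return FREE_DRIVE if inputs["turn_completed"] else DIRECTION_SIGN
    raise ValueError("unknown state")
```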

VI. NEURAL NETWORK

The development of neural networks was motivated by the recognition that the human brain works in a completely different way than the traditional digital computer. The human brain is an extremely parallel and complex information-processing system. Neural networks represent the attempt to emulate the human brain on a classical computer. [10]

In this particular case, a simple fully connected feed-forward neural network was used. It consists of three layers. The output layer has four neurons, each representing one of the decisions the network can make (forward, forward-left, forward-right and stop); the softmax function was used to select the neuron with the highest output as the result. The hidden layer has 32 neurons and the input layer 38,400. The number of nodes in the hidden layer was chosen fairly arbitrarily. All the neurons used the sigmoid activation function.

Fig. 7. Neural network illustration.

The number of input neurons was determined by the size of the image used as the input for the network. Video was recorded at the QVGA (320x240) resolution. Individual frames were then converted into grayscale format and the top half was cut off. The resulting image had 320 by 120 pixels. This was done to optimize the algorithm, because the color information and anything above the horizon are not important for the lane keeping algorithm. Frames were then converted into NumPy arrays, which were used as the input for the network. This process is demonstrated in Figure 8.

Fig. 8. Data manipulation.
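The preprocessing just described reduces to a few OpenCV/NumPy operations; the sketch below follows the text and is not the authors' original script.

```python
# Sketch: QVGA frame -> grayscale -> discard top half -> 38,400-element vector.
import cv2
import numpy as np

def preprocess(frame):
    """Turn a 320x240 BGR frame into a 1x38400 float32 input row."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)  # color is irrelevant here
    lower_half = gray[120:240, :]                   # keep the 320x120 road region
    return lower_half.reshape(1, 38400).astype(np.float32)
```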
During the data collection process these arrays were paired with labels. The labels were the actual user commands, recorded in one-hot format as shown in Table I.

TABLE I
LABEL VALUES IN ONE-HOT FORMAT

Left     [1, 0, 0, 0]
Right    [0, 1, 0, 0]
Forward  [0, 0, 1, 0]
Stop     [0, 0, 0, 1]

The data paired with the labels was then saved into an ".npz" file and later used for training the network. The network was trained in the OpenCV library using the back propagation method. After the training, the weights were saved into an ".xml" file. To generate predictions, the same neural network is reconstructed and loaded from the ".xml" file saved during the training process.
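A sketch of this training and reload cycle, using OpenCV's ANN_MLP module, is shown below. The paper does not state the OpenCV version or training parameters, so the modern cv2.ml calls and file names here are assumptions.

```python
# Sketch: train the 38400-32-4 network with back propagation in OpenCV,
# save it to .xml and reconstruct it for prediction.
import cv2
import numpy as np

data = np.load("training_data.npz")
frames = data["frames"]                         # raw 320x240 BGR frames
gray = np.array([cv2.cvtColor(f, cv2.COLOR_BGR2GRAY)[120:240] for f in frames])
samples = gray.reshape(len(gray), 38400).astype(np.float32)
responses = data["labels"].astype(np.float32)   # (N, 4), one-hot labels

mlp = cv2.ml.ANN_MLP_create()
mlp.setLayerSizes(np.array([38400, 32, 4], dtype=np.int32))
mlp.setActivationFunction(cv2.ml.ANN_MLP_SIGMOID_SYM)  # sigmoid neurons
mlp.setTrainMethod(cv2.ml.ANN_MLP_BACKPROP)            # back propagation
mlp.train(samples, cv2.ml.ROW_SAMPLE, responses)
mlp.save("mlp_weights.xml")

# Prediction time: the same network is rebuilt from the saved weights.
loaded = cv2.ml.ANN_MLP_load("mlp_weights.xml")
_, outputs = loaded.predict(samples[:1])
command = int(np.argmax(outputs))   # 0=left, 1=right, 2=forward, 3=stop
```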
VII. SIGN DETECTION

Traffic sign detection and classification is obviously essential for self-driving vehicles. The Haar feature-based cascade classifier was implemented for the object detection. The OpenCV library provides the tools needed for this, in the form of a trainer and a detector. It was decided to implement classifiers only for the stop sign, traffic lights and direction signs (left, right, forward and turn back). Since, using this approach, every object needs its own classifier, and because scaling is only a question of acquiring the appropriate training data, this was deemed enough for a proof of concept.

In the case of a traffic light, first the object (traffic light) is detected using the trained object classifier. The bounding box containing the detected object is considered to be the region of interest, and the rest of the image is discarded. A Gaussian blur is then applied to this cropped region to reduce the noise. After that, the brightest spot is calculated based on the intensities of the pixels. Finally, the state of the traffic light is determined by the position of the brightest spot inside the region of interest.
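This routine maps directly onto a few OpenCV calls. The sketch below assumes a pre-trained cascade file and the usual vertical lamp layout (red above green); both are assumptions beyond the description above.

```python
# Sketch: detect a traffic light with a Haar cascade, blur the region of
# interest, locate the brightest spot and classify the light by its position.
import cv2

cascade = cv2.CascadeClassifier("traffic_light_cascade.xml")  # assumed file

def traffic_light_state(gray_frame):
    lights = cascade.detectMultiScale(gray_frame, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in lights:
        roi = cv2.GaussianBlur(gray_frame[y:y + h, x:x + w], (5, 5), 0)
        _, _, _, max_loc = cv2.minMaxLoc(roi)   # brightest pixel in the ROI
        # The red lamp sits in the upper part of the housing, green in the lower.
        return "red" if max_loc[1] < h / 2 else "green"
    return None  # no traffic light detected
```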
The behavior of the vehicle when a sign is detected is hard-coded. For all the signs and the traffic light, the distance to the object is measured. Only if the distance is below the threshold is the preprogrammed behavior executed. If the stop sign is detected, the vehicle stops for 5 seconds and checks if there is any traffic on the intersecting road. When the traffic light is detected and the red light is recognized, the vehicle stops until the green light turns on. If the green light is showing, the traffic light is ignored. If a direction sign is detected, the appropriate behavior is executed depending on the sign.

The distance to the objects is measured using a geometric model for detecting the distance to an object with monocular vision, as proposed by Chu, Ji, Guo, Li and Wang (2004) [11]. It was decided to use a single USB camera and monocular vision to reduce the complexity and the hardware requirements of the calculations.
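The paper does not reproduce the formula from [11], but a common ground-plane form of such a monocular model estimates distance from the image row where the object meets the road, given a calibrated focal length, camera height and horizon row. The sketch below uses that generic form with illustrative calibration values; the exact formulation in [11] may differ.

```python
# Generic pinhole ground-plane distance sketch (calibration values assumed).
def ground_distance_cm(y_bottom, focal_px=600.0, cam_height_cm=10.0, horizon_row=60.0):
    """Distance to an object whose lowest image row is y_bottom."""
    if y_bottom <= horizon_row:
        return float("inf")   # at or above the horizon: no ground intersection
    return focal_px * cam_height_cm / (y_bottom - horizon_row)
```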

VIII. EXPERIMENT

Experimental trials were carried out on a model road, in controlled conditions. The roadmap of the model was changed, and the weather and lighting conditions were varied during the experiments to simulate real-world conditions. The experiments consisted of four phases: designing a roadmap, gathering the training data on the road model, training the ANN, and finally the experimental drive (i.e. testing).

Making the model road was a time-consuming and very creative part of the experiment, because the model needed to simulate real-world conditions as closely as possible while also allowing easy and fast changes. An example of the road model can be seen in Figure 9. The model had to have streets, traffic lights, crossroads, obstacles on the road (static and dynamic) and traffic signs. The driving conditions, including the wetness of the road, visibility and lighting (artificial and natural sunlight), also needed to allow quick and easy alteration. From one training run to another, the model was changed, as well as during the testing. All this was done to get more realistic data and results.

Fig. 9. Robot autonomously driving on the model road.

While collecting the data, the user manually operated the car, as a driver would in a real situation, using a custom user interface, see Figure 10. The user was obliged to follow the markings on the road. Otherwise (due to an error), the training data was considered invalid and it was necessary to repeat the run. During this process, the driving conditions were changed: from natural to artificial light, with or without shadows on the road, from ideal visibility to fog (a see-through plastic bag was used to simulate reduced visibility). The type of the road was also changed, from standard black asphalt with white lines to a brownish background with white lines. This allowed the gathering of more diverse data and enabled autonomous driving even in conditions where the markings on the road are not clearly visible, due to fading or reflection.

Fig. 10. User interface for the vehicle control during the data collection.

After the data was collected, it was saved into an ".npz" file. The training of the ANN was done by executing the training script, which takes the collected training data and calculates the coefficients of the neural network (assuming that the training data is valid). The neural network parameters are then saved in an ".xml" file, which is used to generate the same ANN in the testing phase. The training was done on an Intel i3 processor with 8 GB of RAM and on average it took only a couple of minutes. This was possible because of the simple architecture of the used ANN. After the training, the accuracy was calculated, showing more than 85% matching.

Testing was done by driving the vehicle autonomously on the model road. The expected outcome was that the autonomous vehicle should, by itself, drive on the model road. The behavior of the vehicle was observed and recorded. Between the testing runs the model road was altered. This included changing the roadmap as well as the conditions on the road, by adding barriers and traffic signs and changing the intensity of the light and the type of illumination. The test conditions were incrementally worsened until an unexpected behavior was detected.
Fig. 11. Test drive including the other traffic participants.

All the phases were repeated for several iterations until satisfactory results were obtained. This included that the autonomous vehicle could drive with the other participants in the traffic, follow the markings on the road, comply with the road regulations, such as the street signs and the traffic lights, and be resistant to the changing road conditions.

IX. RESULTS

The first results were above expectations and very encouraging, showing that the development was moving in the right direction. The first successful build was tested using only partial training data: a drive through only one street, going in only one direction, was recorded. This was done to save time, because the first build was not expected to work, but it did. The robot successfully drove (kept its lane) in the direction that was recorded, but in the opposite direction it could not stay in its lane. Initially the results varied from test to test and showed some problems. Sometimes the robot would unexpectedly turn and drive off the road, or it would follow the lane but not stay in the center. By collecting better data and fine-tuning the hyperparameters of the ANN, all of this was solved. Multiple runs were recorded on every route and only the recordings without any user errors were kept for training. Also, the number of neurons in the hidden layer was varied until it was determined that 32 was the optimal number.

In order for the result to be considered satisfactory, the autonomous vehicle had to meet the following conditions: to recognize and comply with the traffic signalization and regulations, and to maintain the intended path on the road. Over time, both conditions were increased in difficulty by adding traffic lights, other vehicles, pedestrians and obstacles, a more complex roadmap and different road surfaces.

The final results confirm the hypothesis that this type of neural network, a simple three-layer fully connected network with very low hardware requirements, can be applied under realistic conditions to solve traffic navigation. It was shown that in this setup, an autonomous vehicle can drive at any time of the day, under natural or street light, comply with all the traffic signs and regulations and follow the assigned route, without any unexpected behavior.

X. CONCLUSION

It was demonstrated that a simple ANN is able to learn the not-so-simple task of driving a car. That includes lane following, collision detection and avoidance, and following traffic regulations in an urban traffic setting. All of this was achieved using a minimum of data and very modest hardware. That was enough to train the system to drive in diverse conditions, from day to night, from urban streets to highways, regardless of the number of other participants in traffic, and to comply with all the traffic regulations.

One caveat of any machine learning system is unpredictable behavior. But in this experiment it was demonstrated that the development of these types of systems can be simple, especially for noncritical systems. For real-world application, additional testing would be required, as well as some additional safety protocols. But all of this should be achievable with the system that was demonstrated.

ACKNOWLEDGMENT

The research in this paper is funded by the Serbian Ministry of Education, Science and Technological Development under grants TR-35003 and III-44008.

REFERENCES

[1] U.S. Department of Transportation's National Highway Traffic Safety Administration. (2016). Retrieved from United States Department of Transportation: crashstats.nhtsa.dot.gov/Api/Public/Publication/812456
[2] Tan, B., Xu, N., & Kong, B. (2018). Autonomous Driving in Reality with Reinforcement Learning and Image Translation. arXiv preprint arXiv:1801.05299. Retrieved from https://arxiv.org/pdf/1801.05299.pdf
[3] Bertozzi, M., & Broggi, A. (1996). Real-time lane and obstacle detection on the GOLD system. Intelligent Vehicles Symposium, 1996, Proceedings of the 1996 IEEE, 213-218.
[4] Bertozzi, M., Broggi, A., Conte, G., & Fascioli, A. (1997). Obstacle and lane detection on ARGO. Intelligent Transportation System, 1997. ITSC'97., IEEE Conference on. IEEE, 1010-1015.
[5] Wang, H., & Chen, Q. (2006). Real-time lane detection in various conditions and night cases. Intelligent Transportation Systems Conference, ITSC '06. IEEE, 1226-1231.
[6] Aly, M. (2008). Real time detection of lane markers in urban streets. Intelligent Vehicles Symposium, IEEE, 7-12.
[7] Bojarski, M., Del Testa, D., Dworakowski, D., Firner, B., Flepp, B., Goyal, P., Jackel, L.D., Monfort, M., Muller, U., Zhang, J. and Zhang, X. (2016). End to end learning for self-driving cars. arXiv preprint arXiv:1604.07316.
[8] Krizhevsky, A., Sutskever, I. and Hinton, G.E. (2012). ImageNet Classification with Deep Convolutional Neural Networks. Advances in neural information processing systems, 1097-1105.
[9] Hopcroft, J. E., Motwani, R., & Ullman, J. D. (2007). Introduction to Automata Theory, Languages, and Computation. Pearson Education, 37-47.
[10] Simon Haykin, "What is a neural network?" in Neural Networks: A Comprehensive Foundation, Delhi, India: Pearson Education, 2005, ch. 1, sec. 1, pp. 23-24.
[11] Jiangwei, C., Lisheng, J., & Lie, G. (2004). Study on method of detecting preceding vehicle based on monocular camera. Intelligent Vehicles Symposium, IEEE.
[12] [Igor Ciganović]. (2017, October 26). Self-Driving Car AI [Video File]. Retrieved from https://www.youtube.com/watch?v=W2hugeCLAKI
