
Article

ROS-Based Human Detection and Tracking from a Wireless Controlled Mobile Robot Using Kinect
Sidagam Sankar 1,† and Chi-Yi Tsai 2, *,†
1 Department of Electronics and Communication Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D
Institute of Science and Technology, No. 42, Veltech Road, Avadi, Chennai 600062, Tamil Nadu, India;
sidagamsankar@gmail.com
2 Department of Electrical and Computer Engineering, Tamkang University, No.151, Yingzhuan Road,
Tamsui District, New Taipei City 25137, Taiwan
* Correspondence: chiyi_tsai@mail.tku.edu.tw
† These authors contributed equally to this work.

Received: 30 December 2018; Accepted: 24 January 2019; Published: 29 January 2019
Appl. Syst. Innov. 2019, 2, 5; doi:10.3390/asi2010005

Abstract: Human detection and tracking is an important task in intelligent robotic systems, which usually require a robust target detector that works in a variety of circumstances. In complex environments where a camera is installed on a wheeled mobile robot, this task becomes much more difficult. In this paper, we present a real-time remote-control system for human detection, tracking, security, and verification in such challenging environments. Such a system is useful for security monitoring, data collection, and experimental purposes. In the proposed system, a Kinect RGB-D camera from Microsoft is used as the visual sensing device for the design of human detection and tracking. We also implemented a remote-control system on a four-wheel mobile platform with the Robot Operating System (ROS). By combining these two designs, a wirelessly controlled mobile platform with real-time human monitoring can be realized to handle this task efficiently. Experimental results validate the performance of the proposed system in wirelessly controlling the mobile robot to track humans in a real-world environment.

Keywords: robot operating system (ROS); human detection; human tracking; wheeled mobile
platform; RGB-D camera

1. Introduction
Automated surveillance for security monitoring is increasingly deployed in venues such as airports, casinos, museums, and other public places. In these places, the government usually requires a surveillance system installed with intelligent software to monitor security and to detect suspicious behavior of people using cameras. In a remote-control center, human operators can search archived videos to find abnormal activities of people in public places such as college campuses, transport stations, and parking lots. Having this type of surveillance system vastly increases the productivity of the human operator and expands the coverage of the security monitoring service. However, the problem of human detection and tracking in video sequences becomes much harder when the camera is mounted on a mobile platform moving in a crowded workspace. If the object of interest is deformable, like a human, detection and tracking become even more challenging.
A modern mobile robot is usually equipped with an onboard digital camera and computer vision software for human detection and tracking, which can be applied in several real-world applications. In this paper, a real-time computer vision system for human detection and tracking on a wheeled mobile platform, such as a car-like robot, is presented. The contribution of this paper is mainly focused on the system design and implementation aspects. In other words, our design aims to
achieve two main goals: human detection and tracking, and wireless remote control of the mobile platform. Human detection and tracking are achieved through the integration of point cloud-based human detection, position-based human tracking, and motion analysis in the Robot Operating System (ROS) framework [1]. On the other hand, the wireless remote control of the mobile platform is achieved by the HyperText Transfer Protocol (HTTP) [2]. Developing such an intelligent wireless motion platform is very useful for many practical applications, such as remote monitoring, data collection, and experimental purposes. Experimental results validate the performance of the proposed system in detecting and tracking multiple humans in an indoor environment.
The rest of this paper is organized as follows. Section 2 reviews the literature related to this study. Section 3 explains our system design in detail. In Section 4, we explain the implementation details of human detection, human tracking, and remote control of the mobile platform. Section 5 presents the experimental results of the proposed system with some discussion. Section 6 concludes the contribution of this paper, followed by the future scope of this study.

2. Literature Survey
In this section, we review the literature related to the design of human detection and tracking systems. Mohamed Hussein et al. [3] proposed a human detection module that is responsible for invoking the detection algorithm. Ideally, the detection algorithm would run on each input frame; however, this would prevent the system from meeting its real-time requirements. Instead, the detection algorithm in their implementation is invoked every two seconds, and the locations of the human targets at the remaining times are determined by tracking the detected humans with the tracking algorithm. To further speed up the process, the detection algorithm does not look for humans in the entire frame; instead, it looks only in regions determined to be foreground. The tracking module processes frames and detections received from the human detection module to retain information about all the existing tracks. When a new frame is received, the current tracks are extended by locating the new bounding box for each track in the new frame. If the received frame is accompanied by new detections, the new detections are compared to the current tracks: if a new detection significantly overlaps with one of the existing tracks, it is ignored; otherwise, a new track is created for it. A track is terminated if the tracking algorithm fails to extend it in a new input frame.
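For illustration, this detection-gated track-management policy can be sketched in C++ as follows; the Box type, the intersection-over-union (IoU) test, and the 0.5 overlap threshold are our own illustrative assumptions, not details taken from reference [3].

// Schematic C++ sketch of the track-management policy described in [3]: a new
// detection is ignored when it overlaps an existing track; otherwise it spawns
// a new track. The Box type and the 0.5 IoU threshold are illustrative only.
#include <algorithm>
#include <vector>

struct Box { float x, y, w, h; };

// Intersection over union of two axis-aligned boxes.
float iou(const Box& a, const Box& b) {
    float ix = std::max(0.0f, std::min(a.x + a.w, b.x + b.w) - std::max(a.x, b.x));
    float iy = std::max(0.0f, std::min(a.y + a.h, b.y + b.h) - std::max(a.y, b.y));
    float inter = ix * iy;
    return inter / (a.w * a.h + b.w * b.h - inter);
}

// Merge the detections of the current frame into the existing track list.
void mergeDetections(std::vector<Box>& tracks, const std::vector<Box>& detections) {
    for (const Box& d : detections) {
        bool overlapsExisting = false;
        for (const Box& t : tracks)
            if (iou(d, t) > 0.5f) { overlapsExisting = true; break; }
        if (!overlapsExisting)
            tracks.push_back(d);   // a non-overlapping detection starts a new track
    }
    // Tracks that the tracker fails to extend in a new frame are terminated elsewhere.
}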
Juyeon Lee et al. [4] proposed a human detection and tracking system consisting of seven modules: a data collector (DC) module, shape detector (SD) module, location decision (LD) module, shape standardization (SS) module, feature collector (FC) module, motion decision (MD) module, and a location recognition (LR) module. The DC module acquires color images from the digital network cameras every three seconds. The SD module analyzes the human's volume from four images acquired from the DC module. They used a moving window to extract the human's coordinates in each image. The LD module analyzes the user's location through the coordinates offered by the SD module. The LD module also determines which home appliance the human is near by comparing the absolute coordinates of the human with those of the appliances in the home. The SS module converts the image into a standard image for recognizing the motion of multiple users. The FC module extracts features from the image produced by the SS module, and the MD module predicts the motion of the detected human. The SD module receives raw images from the DC module to distinguish the human's image through subtraction between the second image and the third image, then calculates the human's absolute coordinates in the home and the human's relative coordinates with reference to the furniture and home appliances. The SD module also decides which furniture the human is near. If there is a human in an important place defined for the human tracking agent, the SD module calculates the relative coordinates of the moving window that includes the human and transmits them to the LD module. The LR module can judge a user's location without conversion of the acquired image. Unlike the human's location, the human's motion must be recognized separately for each person because people's appearances differ. The SS module takes charge of motion recognition in the multi-user situation. To recognize the human's motion, they defined feature sets for six standardized motions that are recognized during human tracking.
Ramesh K. Kulkarni et al. [5] proposed a human tracking system based on video cameras. There are three steps in their system. The first step is a Gaussian-based background subtraction process [6] to remove the background of the input image. The second step is the detection of a targeted human using a matching process between the foreground and the target image based on the Principal Component Analysis (PCA) algorithm [7], which transforms a number of correlated variables into a smaller number of uncorrelated variables based on eigen-decomposition. The final step uses a Kalman filtering algorithm [8] to track the detected human.
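To make this three-step pipeline concrete, a skeleton of it can be written with OpenCV in C++ as follows. The structure (background subtraction, then target localization in the foreground, then Kalman filtering) follows the description above, but the parameter values are our own assumptions, and the PCA-based matching of reference [5] is replaced here by a simple largest-blob centroid for brevity.

// Skeleton of the three-step pipeline with OpenCV (C++): Gaussian background
// subtraction, target localization in the foreground, and Kalman tracking.
// Parameter values are our own assumptions; the PCA-based matching of [5] is
// replaced by a largest-blob centroid for brevity.
#include <algorithm>
#include <vector>
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cap(0);                        // camera input
    auto bg = cv::createBackgroundSubtractorMOG2(); // step 1: Gaussian background model

    cv::KalmanFilter kf(4, 2);                      // step 3: constant-velocity model (x, y, vx, vy)
    kf.transitionMatrix = (cv::Mat_<float>(4, 4) <<
        1, 0, 1, 0,
        0, 1, 0, 1,
        0, 0, 1, 0,
        0, 0, 0, 1);
    cv::setIdentity(kf.measurementMatrix);
    cv::setIdentity(kf.processNoiseCov, cv::Scalar::all(1e-2));
    cv::setIdentity(kf.measurementNoiseCov, cv::Scalar::all(1e-1));

    cv::Mat frame, fgMask;
    while (cap.read(frame)) {
        bg->apply(frame, fgMask);                                   // foreground mask
        cv::threshold(fgMask, fgMask, 200, 255, cv::THRESH_BINARY); // drop shadow pixels

        // Step 2 (simplified): take the centroid of the largest foreground blob
        // as the target measurement instead of the PCA matching used in [5].
        std::vector<std::vector<cv::Point>> contours;
        cv::findContours(fgMask.clone(), contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

        kf.predict();                                               // propagate the track
        if (!contours.empty()) {
            auto largest = *std::max_element(contours.begin(), contours.end(),
                [](const std::vector<cv::Point>& a, const std::vector<cv::Point>& b) {
                    return cv::contourArea(a) < cv::contourArea(b);
                });
            cv::Moments m = cv::moments(largest);
            if (m.m00 > 0) {
                cv::Mat meas = (cv::Mat_<float>(2, 1) << m.m10 / m.m00, m.m01 / m.m00);
                kf.correct(meas);                                   // fuse the measurement
            }
        }
    }
    return 0;
}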
In reference [9], Sales et al. proposed a real-time robotic system capable of mapping indoor,
cluttered environments, detecting people, and localizing them in the map using an RGB-D camera
mounted on top of a mobile robot. The proposed system integrates a grid-based simultaneous
localization and mapping approach with a point cloud-based people detection method to localize both
the robot and the detected people in the same map coordinate system.
In reference [10], Jafari et al. proposed a real-time RGB-D based multi-person detection and
tracking system suitable for mobile robots and head-worn cameras. The proposed vision system
combines two different detectors for different distance ranges. For the close range, the system uses a
fast depth-based upper-body detector, which allows real-time performance on a single central processing
unit (CPU) core. For the farther distance ranges, the system switches to an appearance-based full-body
Histogram-of-Oriented-Gradients (HOG) detector running on the graphics processing unit (GPU).
By combining these two detectors, the robustness of the vision system using an RGB-D camera can be
improved for indoor settings.
Revathy Nair M et al. [11] proposed a stereo vision-based human detection and tracking system built with the help of image processing. To identify the presence of human beings, an HSV-based skin color detection algorithm is used. When the system identifies the presence of human beings, the human location is then identified by applying a stereo vision-based depth estimation approach. If the intruder is moving, the centroid position of the human with the orientation angle of motion is obtained. In the tracking part, the human tracking system is a separate hardware system connected to two servo motors for targeting the intruder. The human detection and human tracking systems are coupled through a serial port. If several people are present in the video or image, the centroids of all the regions are estimated and their midpoint gives the required result.
In reference [12], Priyandoko proposed an autonomous human following robot, which comprises
face detection, leg detection, color detection, and person blob detection methods based on RGB-D
and laser sensors. The authors used the Robot Operating System (ROS) to perform the four detection
methods, and the result showed that the robot could track and follow the target person based on the
person’s movement.
As to the wireless control of mobile platforms, Shayban Nasif et al. [13] proposed a smart wheelchair, which can provide useful assistance to physically disabled persons in completing their daily needs. Earlier versions of this smart wheelchair system were developed using a personal computer or laptop mounted on the wheelchair. Because recent developments in embedded systems can be combined with the design of robotics, artificial intelligence, and automation control, the authors further integrated a Radio Frequency (RF) communication module into the smart wheelchair system to implement a wireless control function in the wheelchair. Moreover, the authors used an acceleration sensor to implement a head gesture recognition module, which allows the user to steer the wheelchair in the required direction using changes of head gesture only.
In reference [14], Patil et al. proposed the hardware architecture for a mobile heterogeneous robot
swarm system, in which each robot has a distinct type of hardware compared to the other robots.
One of the most important design goals for the heterogeneous robot swarm system is providing low-cost wireless communication for indoor as well as outdoor applications. To achieve this goal, the authors adopted the X-Bee module, Bluetooth Bee module, and PmodWiFi module to create an ad hoc communication network, which can efficiently accomplish the wireless communication for decentralized robot swarms.
3. System Design

Our design aims to achieve two main goals: human detection and wireless remote control of a wheeled mobile platform. Human detection is achieved through the integration of 3D point cloud processes for human detection, tracking, and motion analysis in one framework. Wireless remote control of the mobile platform is achieved by using the HTTP protocol.

Figure 1 shows the architecture of the proposed system design. We use a Kinect device to capture 640 × 480 24-bit full-color and 320 × 240 16-bit depth images, which are used as the input of the proposed system. A laptop is used as the main image processing unit in this system to receive the captured visual sensing data through a USB 2.0 port, which provides a high frame rate of up to 30 frames per second (fps) for transmitting both full-color and depth images. The human detection and tracking function is implemented on the laptop to detect the live motion of humans in front of the mobile platform, which makes the proposed system suitable for real-time applications. The output of live human detection and tracking can be observed on the display device connected to the laptop.

Figure 1. System architecture.

As for the mobile platform, the NodeMCU ESP8266 Wi-Fi module plays a major role. This module integrates a microcontroller IC and a Wi-Fi shield on a single board. The IP address of the Wi-Fi module is browsed on the cell phone, which then loads a page with a control interface. Four motors are used in the mobile platform. Each motor receives its driving signal from the motor driver, which is connected to the Wi-Fi module so that the user can control the mobile platform wirelessly. Note that the Wi-Fi module of the mobile platform must connect to the same network as the cell phone.
4. Implementation of Human Detection and Wireless Mobile Platform Control

The human detection and tracking function used in the proposed system is implemented based on ROS, which is a robotics middleware widely used on robotics platforms. It is a collection of software functions for the development of a robot software framework. ROS is not an operating system; it provides services specially designed for a heterogeneous computer cluster, such as the implementation of commonly used functionality, package management, hardware abstraction, and message-passing between processes. There are twelve ROS distributions released online, of which we installed the tenth distribution, named Kinetic Kame, on the Ubuntu 16.04 operating system.

4.1. Human Detection and Tracking

Human detection in a heavily trafficked environment is complicated, and performing this task from a mobile platform increases its complexity further. In the implementation of this human detection function, we use the Kinect Xbox 360 device to capture the 3D point cloud data of the scene in front of the mobile platform through the RGB and depth images. The Kinect device is interfaced through Open Natural Interaction (OpenNI), a type of middleware that includes the launch files used for accessing the devices. OpenNI is compatible with several RGB-D cameras, such as the Microsoft Kinect and Asus XtionPro. These products are suitable for visualization and processing by creating a nodelet graph, which transforms the raw data into digital images and video outputs by using the device drivers.
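For reference, a minimal ROS node that consumes the point cloud published by such a driver might look as follows; the topic name /camera/depth_registered/points is a conventional default of the ROS OpenNI launch files rather than a value taken from this work, and the callback is left as a placeholder for the detection pipeline described next.

// Minimal ROS (C++) node: subscribe to the RGB-D point cloud produced by an
// OpenNI-based camera driver. The topic name is a conventional default of the
// ROS OpenNI launch files; adjust it to the actual launch configuration.
#include <ros/ros.h>
#include <sensor_msgs/PointCloud2.h>

void cloudCallback(const sensor_msgs::PointCloud2ConstPtr& cloud) {
    // The detection pipeline (normalization, PCA, SVM) would be invoked here.
    ROS_INFO("Received cloud: %u x %u points", cloud->width, cloud->height);
}

int main(int argc, char** argv) {
    ros::init(argc, argv, "human_detection_node");
    ros::NodeHandle nh;
    ros::Subscriber sub =
        nh.subscribe("/camera/depth_registered/points", 1, cloudCallback);
    ros::spin();  // process incoming clouds until shutdown
    return 0;
}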
Figure 2 shows the block diagram of the human detection process. In Figure 2, the point cloud data from the environment are first normalized based on the point cloud positions in three coordinates by using the following formulas:

xdiff = xmax − xmin, (1)

ydiff = ymax − ymin, (2)

zdiff = zmax − zmin, (3)

where (xmin, xmax), (ymin, ymax), and (zmin, zmax) denote the minimum and maximum point cloud positions on the x-axis, y-axis, and z-axis, respectively. In this way, a normalization process is applied to the point cloud data by

nx = px/scale, (4)

ny = py/scale, (5)

nz = pz/scale, (6)

where (nx, ny, nz) denotes the normalized point cloud data and (px, py, pz) represents the input point cloud data on the x-axis, y-axis, and z-axis, respectively. The scaling factor is determined by the following formula:

scale = max(xdiff, ydiff, zdiff)/2, (7)

where max(x, y, z) returns the maximum value among x, y, and z. The normalized point cloud data are then used to form a normalized histogram through an orthogonal transformation constructed from the eigenvectors of the normalized point cloud data. Next, the PCA algorithm is applied to the normalized histogram to reduce its dimension. Finally, a support vector machine (SVM) predictor [15] is applied to the low-dimensional normalized histogram to classify human and non-human classes with respect to the input point cloud data.

Figure 2. Block diagram of the human detection process.
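As a concrete reading of Equations (1)–(7), the following C++ sketch computes the per-axis extents, derives the scaling factor, and normalizes each point. The Point type and function name are our own illustrative choices, and only the normalization stage is shown, not the histogram, PCA, or SVM steps.

// Illustrative C++ reading of Equations (1)-(7): compute the per-axis extents,
// derive the scaling factor, and divide every point by it. The Point type and
// function name are our own choices; only the normalization stage is shown.
#include <algorithm>
#include <vector>

struct Point { float x, y, z; };

std::vector<Point> normalizeCloud(const std::vector<Point>& cloud) {
    if (cloud.empty()) return {};
    Point lo = cloud.front(), hi = cloud.front();
    for (const Point& p : cloud) {                 // per-axis minima and maxima
        lo.x = std::min(lo.x, p.x); hi.x = std::max(hi.x, p.x);
        lo.y = std::min(lo.y, p.y); hi.y = std::max(hi.y, p.y);
        lo.z = std::min(lo.z, p.z); hi.z = std::max(hi.z, p.z);
    }
    float xdiff = hi.x - lo.x;                     // Equation (1)
    float ydiff = hi.y - lo.y;                     // Equation (2)
    float zdiff = hi.z - lo.z;                     // Equation (3)
    float scale = std::max({xdiff, ydiff, zdiff}) / 2.0f;  // Equation (7)
    std::vector<Point> out;
    out.reserve(cloud.size());
    for (const Point& p : cloud)                   // Equations (4)-(6)
        out.push_back({p.x / scale, p.y / scale, p.z / scale});
    return out;
}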
Figure 3 shows the block diagram of the human tracking process. As to the human tracking function, the output data of human detection are used as the input data to the nearest neighbor search (NNS) algorithm [16], which searches for the nearest detected human position in the previous frame to match the current detection result and then updates the detected human position as the output tracking result. The updated human positions are fed back as the previous human positions for the NNS algorithm to continue the human tracking process.

Figure 3. Block diagram of the human tracking process.
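To make the data flow of Figure 3 concrete, the following C++ sketch performs one update step of such a position-based tracker: each current detection is matched to the nearest previous position, and the updated positions are returned to be fed back for the next frame. The types, the distance threshold, and the simple linear search are our own illustrative assumptions rather than the exact implementation.

// Illustrative one-step update of position-based tracking via nearest neighbor
// search (NNS): each current detection is matched to the closest previous
// position. Types, threshold, and linear search are our own simplifications.
#include <limits>
#include <vector>

struct Position { float x, y, z; int id; };

float dist2(const Position& a, const Position& b) {   // squared Euclidean distance
    return (a.x - b.x) * (a.x - b.x) + (a.y - b.y) * (a.y - b.y) +
           (a.z - b.z) * (a.z - b.z);
}

// prev: tracked positions from the previous frame; dets: current detections.
// The returned list is fed back as "prev" to continue tracking on the next frame.
std::vector<Position> updateTracks(const std::vector<Position>& prev,
                                   std::vector<Position> dets,
                                   float maxDist = 0.5f) {
    static int nextId = 0;                             // simple track id source
    std::vector<Position> updated;
    for (Position& d : dets) {
        float best = std::numeric_limits<float>::max();
        const Position* match = nullptr;
        for (const Position& p : prev) {               // linear NNS over previous tracks
            float d2 = dist2(d, p);
            if (d2 < best) { best = d2; match = &p; }
        }
        if (match && best <= maxDist * maxDist)
            d.id = match->id;                          // continue the existing track
        else
            d.id = nextId++;                           // start a new track
        updated.push_back(d);
    }
    return updated;
}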

4.2. Wireless Controlled Mobile Platform

The NodeMCU ESP8266 microcontroller plays a crucial part in the implementation of the wheeled mobile platform, which uses four DC motors interfaced with an LM298N motor driver. The motor direction control signal given from the Wi-Fi module is converted into transistor-transistor logic (TTL) levels by the motor driver for driving the motors in the required direction. Here, we use the Arduino IDE for programming the module.

To remotely control the platform, the cell phone and the Wi-Fi module must connect to the same network. Browsing the IP address of the Wi-Fi module opens an HTML page. Figure 4 shows the HTML page with a control interface for the user to remotely control the mobile platform in the required direction. The user is able to control the mobile platform wirelessly through a server-and-client-based HTTP protocol, in which the cell phone acts as a client and the Wi-Fi module acts as a server.

Figure 4. Remote control interface shown in the HTML page.



Figure 5 shows the flowchart of the remote control based on the control interface shown in Figure 4. The service set identifier (SSID) and password are the details of the access point created by the user. Once the module is connected to the Wi-Fi network, it checks the server status and reports the IP address of the module, which is used by the client device to access the server-and-client-based HTTP protocol. This workflow continuously reads changes in the user command. If the user inputs a new command from the control interface (Figure 4), the controller checks all the conditional statements; if a command string equals the received command, the controller starts to move the mobile platform accordingly until the next user command.

Figure 5. Remote control flowchart of the mobile platform.
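A minimal NodeMCU sketch of this server-side workflow might look as follows; it uses the standard ESP8266WiFi and ESP8266WebServer Arduino libraries, while the pin assignments, credentials, and route names are our own illustrative placeholders rather than the exact firmware used in this work.

// Minimal ESP8266 (NodeMCU) sketch of the control flow in Figure 5: join the
// Wi-Fi network, print the IP address, serve the control routes, and switch
// the motor driver pins on each command. Pins, credentials, and routes are
// illustrative placeholders, not the exact firmware of this work.
#include <ESP8266WiFi.h>
#include <ESP8266WebServer.h>

const char* SSID = "your-ssid";          // access point created by the user
const char* PASSWORD = "your-password";
const int IN1 = D1, IN2 = D2, IN3 = D3, IN4 = D4;  // LM298N direction inputs

ESP8266WebServer server(80);

void drive(int a, int b, int c, int d) { // set the four direction pins at once
    digitalWrite(IN1, a); digitalWrite(IN2, b);
    digitalWrite(IN3, c); digitalWrite(IN4, d);
}

void setup() {
    Serial.begin(115200);
    pinMode(IN1, OUTPUT); pinMode(IN2, OUTPUT);
    pinMode(IN3, OUTPUT); pinMode(IN4, OUTPUT);

    WiFi.begin(SSID, PASSWORD);                     // read SSID and password
    while (WiFi.status() != WL_CONNECTED) delay(500);
    Serial.println(WiFi.localIP());                 // print IP address for the client

    // One route per command string, mirroring the conditional checks in Figure 5.
    server.on("/forward",  []() { drive(HIGH, LOW,  HIGH, LOW);  server.send(200, "text/plain", "forward");  });
    server.on("/backward", []() { drive(LOW,  HIGH, LOW,  HIGH); server.send(200, "text/plain", "backward"); });
    server.on("/left",     []() { drive(LOW,  HIGH, HIGH, LOW);  server.send(200, "text/plain", "left");     });
    server.on("/right",    []() { drive(HIGH, LOW,  LOW,  HIGH); server.send(200, "text/plain", "right");    });
    server.on("/stop",     []() { drive(LOW,  LOW,  LOW,  LOW);  server.send(200, "text/plain", "stop");     });
    server.begin();
}

void loop() {
    server.handleClient();  // continuously read and dispatch user commands
}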
5. Experimental Results

We implemented the ROS-based human detection and tracking system on a laptop equipped with a 2 GHz Intel® Core i3-5005U 64-bit CPU and 2 GB of system memory running the Ubuntu 16.04 operating system. The human detection and tracking result is displayed on the laptop screen through the ROS visualizer. Once the user launches the system, ROS triggers all the algorithms to start detecting and tracking humans. Two runtime scenarios were captured at two different locations in an indoor room to verify the performance of the proposed system. The Kinect device is mounted on the four-wheeled mobile platform to capture images for detecting and tracking humans in real time. Figure 6 shows the configuration of the four-wheeled mobile platform, which is a robotic vehicle chassis with four 12-volt DC motors produced by Nevonsolutions Pvt Ltd [17]. We further installed a Wi-Fi module on this vehicle chassis to control the mobile platform wirelessly. Moreover, each DC motor is connected to an LM298N motor driver that drives the motors in the required direction according to the control input received through the Wi-Fi module.

Figure 6. Configuration of the four-wheeled mobile platform.
Figure 7 shows the experimental setting used in the experiments. The four-wheeled mobile platform is equipped with a Kinect device and is controlled wirelessly by a cell phone with a control interface. The Kinect mounted on the mobile platform captures live video streams to detect and track humans, who can be monitored on the display of the laptop. The mobile platform is controlled through the cell phone via an HTML page for following the detected and tracked human.

Figure 7. Experimental setting: The four-wheeled mobile platform equipped with a Kinect device and controlled wirelessly by a cell phone with a control interface.

The first runtime scenario is an experiment of interaction between one detected human and the mobile robot through wireless remote control. Figure 8 shows photos of the motion sequence of interaction between the human and the remote-controlled mobile platform during this experiment. Figure 8a,c,e shows the input images taken from the Kinect on the mobile platform. The sky-blue colored polygon box in Figure 8b is the human detection result of the proposed system, and the mobile robot started moving forward on the control command given through the cell phone by a remote user. Figure 8d,f shows the mobile platform in motion while tracking the human.

Figure 8. Recorded image sequence of interaction between one human and the mobile robot through wireless remote control.

The second runtime scenario is an experiment of detecting and tracking two humans in the room. Figure 9 shows the image sequence of detecting and tracking two humans using the proposed system during the second experiment. Figure 9a,c shows the input data captured from the Kinect, and Figure 9b,d shows that the two humans are detected and tracked individually with differently colored polygon boxes in a single frame. Therefore, the above experimental results validate the performance of the proposed human detection and tracking system combined with the wireless remote-controlled mobile platform. A video clip of these experimental results is available at the website of reference [18].

Figure 9. Recorded image sequence of detecting and tracking two humans in the room.

6. Conclusions and Future Work

In this paper, we have implemented a real-time human detection and tracking function on a wireless controlled mobile platform, which can be easily adapted as an intelligent surveillance system for security monitoring in venues such as railway stations, airports, bus stations, and museums. Experimental results show that the user can easily use a cell phone to remotely control the mobile platform for the detection and tracking of multiple humans in a real-world environment. This design can improve the security level of many practical applications based on a mobile platform equipped with a Kinect device. Moreover, it is also helpful for applications of remote monitoring, data collection, and experimental purposes.

In future work, we will try to implement the human detection and tracking on an ARM-based embedded platform, e.g., Raspberry Pi 3 [19] or BeagleBone [20]. Running on an embedded platform helps in decreasing the system complexity and the wireless transmission load required for good communication. Moreover, we can add more RGB-D cameras to the proposed wireless controlled mobile platform for 360-degree video capture of the environment.

Author Contributions: Methodology, S.S.; Project administration, C.-Y.T.; Resources, C.-Y.T.; Writing—Original
Draft, S.S.; Writing—Review & Editing, C.-Y.T.
Funding: This study was funded by the international short-term research internship under TEEP@AsiaPlus
program of the Ministry of Education of Taiwan and by the Ministry of Science and Technology of Taiwan under
Grant MOST 107-2221-E-032-049.
Acknowledgments: The authors would like to thank Narathorn Dontricharoen of Kasetsart University and
Hua-Ting Wei of Tamkang University for their participation in the experiments.
Conflicts of Interest: The authors declare no conflict of interest.

References
1. Quigley, M.; Gerkey, B.; Smart, W.D. Programming Robots with ROS, 1st ed.; O'Reilly Media: Sebastopol, CA, USA, 2015.
2. Zalewski, M. The Tangled Web; No Starch Press: San Francisco, CA, USA, 2011; Chapter 3.
3. Hussein, M.; Abd-Almageed, W.; Ran, Y.; Davis, L. Real-time human detection, tracking, and verification
in uncontrolled camera motion environments. In Proceedings of the 4th IEEE International Conference on
Computer Vision Systems, New York, NY, USA, 4–7 January 2006; p. 41.
4. Lee, J.; Choi, J.; Shin, D.; Shin, D. Multi-user human tracking agent for the smart home. In Proceedings of the Pacific Rim International Workshop on Multi-Agents, Guilin, China, 2006; pp. 502–507.
5. Kulkarni, R.K.; Wagh, K.V. Human tracking system. In Proceedings of the International Conference on
Electronics and Communication Systems, Coimbatore, India, 26–27 February 2014; pp. 1–6.
6. Suhr, J.K.; Jung, H.G.; Li, G.; Kim, J. Mixture of Gaussians-Based Background Subtraction for Bayer-Pattern
Image Sequences. IEEE Trans. Circuits Syst. Video Technol. 2011, 21, 365–370. [CrossRef]
7. Jolliffe, I.T.; Cadima, J. Principal component analysis: A review and recent developments. Philos. Trans. R. Soc. A Math. Phys. Eng. Sci. 2016, 374, 20150202. [CrossRef] [PubMed]
8. Welch, G.; Bishop, G. An Introduction to the Kalman Filter; University of North Carolina: Chapel Hill, NC,
USA, 2001.
9. Sales, F.F.; Portugal, D.; Rocha, R.P. Real-time people detection and mapping system for a mobile robot using
a RGB-D sensor. In Proceedings of the 11th International Conference on Informatics in Control, Automation
and Robotics, Vienna, Austria, 2–4 September 2014; pp. 467–474.
10. Jafari, O.H.; Mitzel, D.; Leibe, B. Real-time RGB-D based people detection and tracking for mobile robots
and head-worn cameras. In Proceedings of the IEEE International Conference on Robotics and Automation,
Hong Kong, China, 31 May–7 June 2014; pp. 5636–5643.
11. Nair, M.R.; Deepa, D.; Aneesh, R.P. Real time human tracking system for defense application. In Proceedings
of the International Conference on Next Generation Intelligent Systems, Kottayam, India, 1–3 September
2016; pp. 1–6.
12. Priyandoko, G.; Wei, C.K.; Achmad, M.S.H. Human following on ROS framework for a mobile robot.
SINERGI 2017, 22, 77–82. [CrossRef]
13. Nasif, S.; Khan, M.A.G. Wireless head gesture controlled wheel chair for disable persons. In Proceedings of
the IEEE Region 10 Humanitarian Technology Conference (R10-HTC), Dhaka, Bangladesh, 21–23 December
2017; pp. 156–161.
14. Patil, M.; Abukhalil, T.; Patel, S.; Sobh, T. UB swarm: Hardware implementation of heterogeneous swarm
robot with fault detection and power management. Int. J. Comput. 2015, 14, 1–14.
15. Zhang, Y. Support vector machine classification algorithm and its application. In Proceedings of the
International Conference on Information Computing and Applications, Chengde, China, 14–16 September
2012; Volume 308, pp. 179–186.
16. Clarkson, K.L. Nearest-Neighbor Searching and Metric Space Dimensions; MIT Press: Cambridge, MA, USA, 2005; Chapter 5.
17. Nevonsolutions Pvt Ltd. Four-Wheel Robotic Vehicle Chassis. Available online: https://nevonexpress.com/
4-Wheel-Robotic-Vehicle-Chassis.php (accessed on 29 January 2019).
18. Experimental Results of the Proposed System. Available online: https://www.youtube.com/watch?v=
AL9yPiNswDw (accessed on 29 January 2019).

19. Andrews, R. Raspberry Pi for Beginners; Imagine Publishing: Bournemouth, UK, 2014.
20. Molloy, D. Exploring BeagleBone; Wiley: Indianapolis, IN, USA, 2015.

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
