You are on page 1of 17

Multi-Robot Patrolling Problem

AN INDUSTRIAL INTERNSHIP REPORT

submitted by

Pranav Warrier

19BLC1023

in partial fulfillment for the award of the degree of

BACHELOR OF TECHNOLOGY

in

ELECTRONICS AND COMPUTER ENGINEERING

Under the guidance of

Dr Moon Kumar Chetry

1
School of Electronics Engineering

DECLARATION BY THE CANDIDATE

I hereby declare that the Industrial Internship Report entitled “Multi-Robotic


Patrolling Problem” submitted by me to VIT University, Chennai in partial
fulfillment of the requirement for the award of the degree of Bachelor of
Technology in Computer Science and Engineering is a record of bonafide
industrial training undertaken by me under the supervision of Dr Moon Kumar
Chetry. I further declare that the work reported in this report has not been
submitted and will not be submitted, either in part or in full, for the award of any
other degree or diploma in this institute or any other institute or university.

Place: Bengaluru Pranav Warrier


Date: 12/09/2020

2
School of Electronics Engineering

BONAFIDE CERTIFICATE

This is to certify that the Industrial Internship Report entitled “Multi-Robotic


Patrolling Problem” submitted by Pranav Warrier (19BLC1023) to VIT
University, Chennai in partial fulfillment of the requirement for the award of the
degree of Bachelor of Technology in Electronics and Computer Engineering is
a record of bonafide internship undertaken by him/her fulfills the requirements as
per the regulations of this institute and in my opinion meets the necessary
standards for submission. The contents of this report have not been submitted and
will not be submitted either in part or in full, for the award of any other degree or
diploma in this institute or any other institute or university.

Dr Moon Kumar Chetry External Examiner


Center for Artificial Intelligence and Robotics
Defense Research and Development Organization

3
ACKNOWLEDGEMENT

I am extremely thankful to CAIR, DRDO for giving me the wonderful opportunity


to work as a training intern in their organisation. I would like to express my
heartfelt gratitude to my mentor Dr. Moon K Chetry, who has provided me
immense support and guidance throughout my internship tenure. I would like to
thank my teachers, mentors and guide for enabling to take on this project and
complete it to the best of my abilities.

Lastly, I would like to take a moment and thank my family and friends for their
constant support that kept me motivated in the current challenging situations. It is
due to their confidence and persistent belief in me that I have been able to give my
best to this project even during the pandemic.

Pranav Warrier
19BLC1023
School of Electronics Engineering
Training Intern, CAIR, DRDO

4
TABLE OF CONTENTS

Chapter Title Page

Cover Page 1
Declaration 2
Bonafide Certificate 3
Acknowledgement 4
Table of contents 5

List of Figures 6

1 Introduction 6
1.1 Problem Addressing 6

2 Literature Review 8
3.1 Game Theory 8
3.2 Swarm Intelligence 9
3.3 Ant Colony Optimization 10

3 Methodology 11
3.1 Path Planning 12
3.2 Path Optimization 13
3.3 Testing 15

4 Conclusion & Future Work 16

5 References 17

5
INTRODUCTION

Problem Addressed and Motivation: It is important to assess the effectiveness of


the full range of surveillance measures that are used to prevent crime in public
places. It is needed to prevent vulnerability of an area, be it a historical monument
or a public area. Surveillance includes regular patrolling of the environment to
keep a continuous monitoring of the protected system. It can be visual or physical
form. The focus here is on physical surveillance by means of robotic patrolling.
The main requirement of a multi-Agent patrolling system is to avoid any idle time
for an attack as well as prevent any pattern of patrolling formed by the robots.

The objective of this project is to optimize the multi robot patrolling problem. To
achieve this, we have considered all the possible ways to approach the problem,
and conducted a comparative analysis to reach a right fit to the solution. With the
addition of specific sensors and more information about the terrain, the algorithms
can be automatically modified whenever necessary, and applied to give the best or
the most optimal solution.

Problem Statement: Given a set of robots and a set of points of interest, the
patrolling (or repeated surveillance) task consists of constantly visiting these points
of interest at irregular time intervals for security purposes. The robots must
navigate through the environment of different locations that are scattered in the
operational space, and they have to coordinate their actions in order to optimize the
time spent to cover all the desired points of interest. One of the key issues of a
surveillance mission for a multi-robot system is how to coordinate their behaviors
in order to optimize the global performance. We need to implement a good
surveillance strategy to reduce the idle time of each site.

Work Description and Methodology: The robots must navigate through the
environment of different locations that are scattered in the operational space, and
they have to coordinate their actions in order to optimize the time spent to cover all
the desired points of interest. One of the key issues of a surveillance mission for a
multi-robot system is how to coordinate their behaviors in order to optimize the
global performance. We need to implement a good surveillance strategy to reduce
the idle time of each site.

The patrolling problem can be viewed as a surveillance model where each robot
visits all possible points(vertices) in the least possible interval without having a
cyclic pattern to make the surveillance less predictable.

This problem is further converted to a distributed graph coverage problem as


G = (V, E), navigation graph

6
V = {vi}, set of points of interest (with different valuation or importance I(vi))
E ={eij}, edge (or connectivity) between two locations vi and vj.

Instantaneous idleness of a vertex vi ε V in time step tk:

Ivi(tk)=tk – tk-1;
tk-1 is the last time instant the vertex visited by any robot.

Average Idleness of a Vertex can be calculated as:

,
where tk-1 is the last time instant the vertex visited by any robot.
Average idleness of the Graph G (V, E):

The average idleness of the graph is an important criterion to evaluate the


performance of different patrol algorithms. Another important criterion is the
maximum average idleness of all vertices in the graph, which is defined as

7
LITERATURE REVIEW

Models
The problem can be approached using 2 models:
1. An n-player non cooperative game theoretical approach
2. Swarm Intelligence Approach

Game Theory
An n-players non-cooperative game can be defined as a tuple Ʃ =(N, A, U) where,
N={1,2,…,n}, finite set of agents (robots)
A=A1 x A2 x … x An , all possible actions taken by all the agents, where
Ai = {ai1 , ai2 , …, aim } is the set of actions of agent i

U: A=A1 x A2 x … x An → R
Ui :A=A1 x A2 x … x An → R,
where a=(a1 , a2 , …, an ) is the global utility ( or reward ) for the whole system.

A utility function is created that keeps in account factors such as idleness time of
each adjacent node, Importance of node, travelling time for robot to reach adjacent
node from previous node, travelling time for other robots to reach adjacent node
from previous node that are within communication distance.

The patrolling problem can be formulated as a dynamic game Ʃ =(N, A, U). At a


time step t, every robot i are positioned in some state s it (which is a node vi ε V),
(state of the environment st = (v1t, v2t, …, vnt). The actions set for each robot i,
Nt(vi) ={vj ε V | eij ε E}, i.e., all the adjacent nodes of the current node v i at time t.
Now in designing a good patrolling strategy for each robot i, the utility or reward
function should be able to capture the following three parameters:
 Idleness time of each adjacent nodes vj.
 Importance I(vi) of node vi
 Cost or travel time Cikj for robot i, in reaching the adjacent node vj from vk,
and
 Distance or cost (travel time) of reaching the node v j for other robots, which
are within the communication range of robot i.

At any given point, a robot will communicate with all possible robots its current
state (Vertex) and its next move (with cost). The remaining robots respond to the
message with their cost of reaching the same node.

Compute utility function for every action ak ε Nt(vi),


Uit (ak , sit ) = I(ak) x Iakavg (t) – Ciik / Ʃ1/ Cjjk ,
8
where Cjjk is the cost of reaching from node vj (robot in state sjt) to ak for all other
robot j.

One of the more intuitive approaches would be to choose an action where the robot
achieves maximum utility. This can also be considered as the best response
solution as well.
ak ε argmaxak Uit (ak , sit)

The problem with this approach is that since the robots always go for the vertex
with highest utility, it ends with the movements being predictable. To include a
randomness to the approach we introduce a probabilistic function of utility.

Pr[ak=ak] = EXP( 1/T Uit (ak , sit)) / Ʃ ak (EXP( 1/T Uit (ak , sit))).

Pr is the probability of the robot to choose a particular action ak . This probability


of selecting an action is exponentially proportional to the utility garnered from that
action. This randomization approach introduced allows the robots to choose sub-
optimal actions with low probability. This is enabled by the introduction of a noise
parameter T.

Swarm Intelligence
Swarm Intelligence is a problem-solving behavior in self-organized systems
through local communication. When the principles of swarm intelligence are
applied to the multi-agent systems to accomplish a task, they are referred to as
swarm robots.

Swarm Robotics is implemented by making use of various swarm strategies to


effectively improve the utility of each robot.A robot computes the utilities when it
gets data about the targets. Utilities depend on the target's costs and qualities.

Cost is determined by the Euclidean distance between the agent or robot and the
target in a plane field. This cost is then used to calculate the visibility of the target
which is the reciprocal of the distance.

The complexity or importance of a particular target is the quality of that particular


target. The quality of the concerned region is an assessed value that occurs in real-
life situations as a consequence of sensor readings or formerly obtained
information.

9
The utility is determined by both cost and quality of the selected target of a robot.

,α and β are used as control parameters.

The utility of a target is defined as a probability that the robot n is assigned to the
target.
The aforementioned strategy is known as the distributed bee’s algorithm.

Ant Colony Optimization


Ants communicate indirectly through environmental stimuli; this form of
communication is named stigmergy. The ants deposit a volatile hormone
(pheromone) to warn other ants that this position should be reached again. The
pheromone thus stands for a signaling system, acting as a means of indirect
communication, and leading to cooperative behaviors such as trail following.

In each cycle of the algorithm, ants are located on different graph nodes. Then, for
each time step, ants choose probabilistically which node to move to next,
according to both the pheromone concentration and length of the links they are
considering crossing: links with more pheromone and shorter distances are
preferred.

The pheromone intensity is updated as follows:


τij (T + 1) = ρ τij (T) + ∆τij
where ρ is the evaporation coefficient, τij (T +1) and τij (T) are the pheromone
intensities on edge (i, j) at cycle T + 1 and cycle T, respectively. ∆τij is the
pheromone quantity deposited on the edge (i, j) by all the ants of all colonies in
this cycle.

10
METHODOLOGY

Among Multi-Robotic Systems, the robot team autonomous patrolling problem is


very challenging: the robots must navigate through the environment so different
locations that are scattered in the operational space, and they have to coordinate
their actions in order to optimize the time spent to cover all the desired points of
interest (POI). One of the critical issues of a surveillance mission for a MRS is
how to coordinate their behaviors in order to optimize the global performance. For
example, monitoring an area of interest requires that the robots move repeatedly
through the environment, and the difficulty is to decide on the paths while
optimizing some performance criteria. Furthermore, because surveillance entails
increasing the number of trips to each node in a particular environment, an
efficient surveillance approach should shorten the duration between visits to the
same place.

Algorithm 1 Patrol mission for Agenti


1: while True do
2: if status == NotBusy then
3: report current position
4: read messages
5: assign infinity to current position cost -f∗ (sti, sti) = ∞
6: compute the utility vector σt
7: find and select the maximum utility strategy
8: report destination
9: assign Busy to its current status
10: start navigation
11: else
12: if position == destination then
13: assign NotBusy to its current status
14: else
15: continue navigation
16: end if
17: end if
18: end while

Each robot in the multi-robot team is modelled as a finite-state machine (FSM).


The team can then be considered as a collection of equivalent and independent
FSMs that communicate via messages and are synchronised through the use of a
specific message known as the token. The FSM model of each robot has five
states, viz. Sense Neighbouring Cells, Wait For Token, Critical Section, Move
Towards Target, and Handle Timeout. The robot fires its sensors and assesses the
occupancy of its adjacent cells (if the occupancy is not already known to the robot)
in the Sense Neighboring Cells stage before transitioning to the Wait For Token

11
state. In this condition, the robot waits for the token to arrive. A timeout is defined
as when the token is not received by the stated deadline.The robot responds to the
timeout by switching to the Handle Timeout state, then returning to the Sense
Neighboring Cells state. If the timeout does not occur, the robot switches to the
Wait For Token state to the Critical Section state. In this state, the robot waits until
it receives the token. If the token is not received until a specified deadline, a
timeout is said to have occurred. The robot handles this timeout by transiting to the
Handle Timeout state, after which it returns to the Sense Neighbouring Cells state.
In case the timeout does not occur, the robot transits from the Wait For Token state
to the Critical Section state. At any particular time instant, only one robot from the
team is allowed to be in this state.

The following assumptions are made by the robot model:

1) There is a safety radius around the robot.


2) The robot can control its acceleration within a certain range.
3) The robot's emergency stop has a maximum deceleration.
4) The robot can only move at a certain speed.
5) The vehicle's velocity in the direction it is travelling on a straight line.

Path Planning
Rapidly-exploring Random Tree (RRT) is an efficient algorithm and data structure
to explore and make path planning in a non-convex and high dimensional space
without colliding with obstacles. The main advantage of RRT algorithm is that it
can find a valid path in a complex domain in most of the times. However, the
found path does not guarantee to be the optimal one. Starting from a single initial
point, the RRT tree will quickly expand the whole configuration space and find a
valid path towards the goal position.

Algorithm 2 RRT
1: startNode ← new MyNode
2: myTree ← new RRT Tree
3: myTree.addNode(startNode, NULL)
4: while not found do
5: randomNode ← getRandomNode(myTree, env)
6: nearestNode ← getNearestNode(myTree, random Node)
7: if validSegment(nearestNode, randomNode) do
8: newNode ← createNewNode(nearestNode, randomNode, maxLen)
9: myTree.addNode(newNode, nearestNode)
10: nearestNode ← newNode
11: if hit Target do
12: found ← true
13: return path

12
14: end
15: end

RRT Algorithm

Typically, in surveillance tasks with small drones, the map is split into a grid of
small cells For each of these N × N grids, the random strategies are simulated with
k robots, 1 ≤ k < N2. For every grid and each value k, 10 repetitions of the
experiment are carried out, choosing the starting position of each robot uniformly
at random from the synchronized positions. Each experiment on an N × N grid ran
for 4N2 units of time. In the following, E(i) N,k denotes the i-th repetition of an
experiment on an N × N grid using k robots.

Path Optimization
A spanning tree construction algorithm, Create Tree. This algorithm creates
spanning trees while considering the initial location of all robots in the team and
the objective described above, i.e., it tries to minimize the maximal distance
between any two adjacent robots on the tree. The general algorithm, is composed
of two stages. In the first stage, a subtree is created gradually for each robot
starting from the initial position of the robot, such that in each cycle either one or
two cells are added to each subtree. In the second stage of the algorithm, after the k
subtrees have already been generated they only have to be connected.
_
Algorithm 3 Create Tree
1: Build k subtrees as follows.
2: for every robot Ri , 1 ≤ i ≤ k do
3: for each possible next cell (up, down, right, left) do
4: Compute the Manhattan distance from the current location of all other
robots.
5: if more than one possible next move exists then
6: pick the one whose minimal distance to any other robot is maximized.
7: if there is no next possible move then
8: Perform Hilling Procedure from the last main branch.
13
9: if failed to find an unoccupied cell in Hilling then
10: Branch-out in a BFS manner from the main branch
11: Find all possible bridges between the k trees.
12: for i = 0 to max{k α , N} do
13: At random, find a valid set of bridges Bi between trees such that they create
one tree with all N vertices.
14: Compute the set Si of distances between every two consecutive robots on the
tree.
15: Best Result is initialized with S0.
16: if the maximal value in Si is lower than the maximal value in Best Result
then
17: Best Result ← Si.
18: Return the tree associated with Best Result.

Hilling

When each robot has finished covering a cell, it will communicate the new graph
information to the rest of the team. The robot will synchronise its own graph with
the new information whenever it receives the new graph information. As a result,
everyone in the team has a shared global perspective on the situation of the globe.
They are able to coordinate their efforts in order to accomplish the coverage in the
lowest amount of time. In the event of attrition, the remaining crew can continue to
cover the area without interruption. However, if communication is lost, coverage
will be repeated.

14
Testing

Single Robot Coverage without obstacles and with obstacles( eg. a wall).

Two Robots Coverage

CONCLUSION & FUTURE WORK


15
Multi-robot patrolling is a problem that requires solutions integrating both
efficiency and unpredictability. Literature works suggest that cycles and
partitioning principles allow developing patrolling models that perform better than
patrolling models based on other principles to solve this problem.

While game theory focused on covering the problem as a distributed graph with
vertices as targets, swarm intelligence approach takes in quality factors of the
targets to be visited and then chooses high utility targets. Although the utility
calculation remains same for both, the factors affecting utility changes. Game
theoretical approach also introduced a randomization factor also known as log
linear learning, while ant colony optimization approaches the paths with
pheromone intensity.

The implementation of the problem was carried out with a general algorithm for
maximising utility of each agent and then followed by optimization of the path
with the use of Random Tree algorithms and the reducing the total distance
between 2 point of the spanning tree using hilling and BFS.

Among others, future work can focus on overcoming weaknesses such as: the
absence of studies on scalability, flexibility, resource utilization, interference,
communication load or workload among robots when performing the patrolling
task; the lack of experimental work using teams of robots in real scenarios;
simplifications and infeasibility of simulation approaches towards real life
experiments; lack of comparisons of the actual time spent between different
strategies in their patrolling cycles; lack of diversity and classification of the
environments tested and the deterministic nature of many centralized approaches.

Further optimization in the system can also be achieved by narrowing down the
application and the terrain. This project offers a general solution to all the
patrolling scenarios and is therefore suited to a large set of problems. By
specifying the specific constraints, it is possible to make this dynamic patrolling
solution more optimal for the selected purpose.

The work has been shown in the form of images, programs and algorithms that
were generated, tested and utilized for the development of this project. This project
was completed under the guidance of Dr Moon K Chetry, scientist at CAIR,
DRDO.

REFERENCES

16
1. Khan, M. E. 2007. Game theory models for pursuit evasion games. Technical re-
port, University of British Columbia, Vancouver.
2. Erik Hernández, Jaime del Cerro and Antonio Barrientos: Game Theory Models
for Multi-Robot Patrolling of Infrastructures in International Journal of Ad-
vanced Robotic Systems, 14 June 2012.
3. Aguirre, O., Llausas, R., Lucero, C., Taboada, H., Espiritu, J. and Kiekintveld, C.
(2011). A multi‐ objective evolutionary algorithm for intelligent patrolling, In
Proceedings of the 41st International Conference on Computers & Industrial
Engineering.
4. Yehuda Elmaliach, Noa Agmon and Gal A. Kaminka for Multi-Robot Area Patrol
under Frequency Constraints, 2007 IEEE International Conference on Robot-
ics and Automation Roma, Italy, 10-14 April 2007.
5. Avinash Gautam, Bhargav Jha, Gourav Kumar, J. Krishna Murthy, SP Arjun Ram,
Sudeept Mohan 6: Synchronous Frontier Allocation for Scalable Online Multi-
Robot Terrain Coverage in Springer Science+Business Media Dordrecht 2016
6. Zhi Yan, Nicolas Jouandeau and Arab Ali Cherif: A Survey and Analysis of Multi-
Robot Coordination for International Journal of Advanced Robotic Systems.

17

You might also like