Professional Documents
Culture Documents
submitted by
Pranav Warrier
19BLC1023
BACHELOR OF TECHNOLOGY
in
1
School of Electronics Engineering
2
School of Electronics Engineering
BONAFIDE CERTIFICATE
3
ACKNOWLEDGEMENT
Lastly, I would like to take a moment and thank my family and friends for their
constant support that kept me motivated in the current challenging situations. It is
due to their confidence and persistent belief in me that I have been able to give my
best to this project even during the pandemic.
Pranav Warrier
19BLC1023
School of Electronics Engineering
Training Intern, CAIR, DRDO
4
TABLE OF CONTENTS
Cover Page 1
Declaration 2
Bonafide Certificate 3
Acknowledgement 4
Table of contents 5
List of Figures 6
1 Introduction 6
1.1 Problem Addressing 6
2 Literature Review 8
3.1 Game Theory 8
3.2 Swarm Intelligence 9
3.3 Ant Colony Optimization 10
3 Methodology 11
3.1 Path Planning 12
3.2 Path Optimization 13
3.3 Testing 15
5 References 17
5
INTRODUCTION
The objective of this project is to optimize the multi robot patrolling problem. To
achieve this, we have considered all the possible ways to approach the problem,
and conducted a comparative analysis to reach a right fit to the solution. With the
addition of specific sensors and more information about the terrain, the algorithms
can be automatically modified whenever necessary, and applied to give the best or
the most optimal solution.
Problem Statement: Given a set of robots and a set of points of interest, the
patrolling (or repeated surveillance) task consists of constantly visiting these points
of interest at irregular time intervals for security purposes. The robots must
navigate through the environment of different locations that are scattered in the
operational space, and they have to coordinate their actions in order to optimize the
time spent to cover all the desired points of interest. One of the key issues of a
surveillance mission for a multi-robot system is how to coordinate their behaviors
in order to optimize the global performance. We need to implement a good
surveillance strategy to reduce the idle time of each site.
Work Description and Methodology: The robots must navigate through the
environment of different locations that are scattered in the operational space, and
they have to coordinate their actions in order to optimize the time spent to cover all
the desired points of interest. One of the key issues of a surveillance mission for a
multi-robot system is how to coordinate their behaviors in order to optimize the
global performance. We need to implement a good surveillance strategy to reduce
the idle time of each site.
The patrolling problem can be viewed as a surveillance model where each robot
visits all possible points(vertices) in the least possible interval without having a
cyclic pattern to make the surveillance less predictable.
6
V = {vi}, set of points of interest (with different valuation or importance I(vi))
E ={eij}, edge (or connectivity) between two locations vi and vj.
Ivi(tk)=tk – tk-1;
tk-1 is the last time instant the vertex visited by any robot.
,
where tk-1 is the last time instant the vertex visited by any robot.
Average idleness of the Graph G (V, E):
7
LITERATURE REVIEW
Models
The problem can be approached using 2 models:
1. An n-player non cooperative game theoretical approach
2. Swarm Intelligence Approach
Game Theory
An n-players non-cooperative game can be defined as a tuple Ʃ =(N, A, U) where,
N={1,2,…,n}, finite set of agents (robots)
A=A1 x A2 x … x An , all possible actions taken by all the agents, where
Ai = {ai1 , ai2 , …, aim } is the set of actions of agent i
U: A=A1 x A2 x … x An → R
Ui :A=A1 x A2 x … x An → R,
where a=(a1 , a2 , …, an ) is the global utility ( or reward ) for the whole system.
A utility function is created that keeps in account factors such as idleness time of
each adjacent node, Importance of node, travelling time for robot to reach adjacent
node from previous node, travelling time for other robots to reach adjacent node
from previous node that are within communication distance.
At any given point, a robot will communicate with all possible robots its current
state (Vertex) and its next move (with cost). The remaining robots respond to the
message with their cost of reaching the same node.
One of the more intuitive approaches would be to choose an action where the robot
achieves maximum utility. This can also be considered as the best response
solution as well.
ak ε argmaxak Uit (ak , sit)
The problem with this approach is that since the robots always go for the vertex
with highest utility, it ends with the movements being predictable. To include a
randomness to the approach we introduce a probabilistic function of utility.
Pr[ak=ak] = EXP( 1/T Uit (ak , sit)) / Ʃ ak (EXP( 1/T Uit (ak , sit))).
Swarm Intelligence
Swarm Intelligence is a problem-solving behavior in self-organized systems
through local communication. When the principles of swarm intelligence are
applied to the multi-agent systems to accomplish a task, they are referred to as
swarm robots.
Cost is determined by the Euclidean distance between the agent or robot and the
target in a plane field. This cost is then used to calculate the visibility of the target
which is the reciprocal of the distance.
9
The utility is determined by both cost and quality of the selected target of a robot.
The utility of a target is defined as a probability that the robot n is assigned to the
target.
The aforementioned strategy is known as the distributed bee’s algorithm.
In each cycle of the algorithm, ants are located on different graph nodes. Then, for
each time step, ants choose probabilistically which node to move to next,
according to both the pheromone concentration and length of the links they are
considering crossing: links with more pheromone and shorter distances are
preferred.
10
METHODOLOGY
11
state. In this condition, the robot waits for the token to arrive. A timeout is defined
as when the token is not received by the stated deadline.The robot responds to the
timeout by switching to the Handle Timeout state, then returning to the Sense
Neighboring Cells state. If the timeout does not occur, the robot switches to the
Wait For Token state to the Critical Section state. In this state, the robot waits until
it receives the token. If the token is not received until a specified deadline, a
timeout is said to have occurred. The robot handles this timeout by transiting to the
Handle Timeout state, after which it returns to the Sense Neighbouring Cells state.
In case the timeout does not occur, the robot transits from the Wait For Token state
to the Critical Section state. At any particular time instant, only one robot from the
team is allowed to be in this state.
Path Planning
Rapidly-exploring Random Tree (RRT) is an efficient algorithm and data structure
to explore and make path planning in a non-convex and high dimensional space
without colliding with obstacles. The main advantage of RRT algorithm is that it
can find a valid path in a complex domain in most of the times. However, the
found path does not guarantee to be the optimal one. Starting from a single initial
point, the RRT tree will quickly expand the whole configuration space and find a
valid path towards the goal position.
Algorithm 2 RRT
1: startNode ← new MyNode
2: myTree ← new RRT Tree
3: myTree.addNode(startNode, NULL)
4: while not found do
5: randomNode ← getRandomNode(myTree, env)
6: nearestNode ← getNearestNode(myTree, random Node)
7: if validSegment(nearestNode, randomNode) do
8: newNode ← createNewNode(nearestNode, randomNode, maxLen)
9: myTree.addNode(newNode, nearestNode)
10: nearestNode ← newNode
11: if hit Target do
12: found ← true
13: return path
12
14: end
15: end
RRT Algorithm
Typically, in surveillance tasks with small drones, the map is split into a grid of
small cells For each of these N × N grids, the random strategies are simulated with
k robots, 1 ≤ k < N2. For every grid and each value k, 10 repetitions of the
experiment are carried out, choosing the starting position of each robot uniformly
at random from the synchronized positions. Each experiment on an N × N grid ran
for 4N2 units of time. In the following, E(i) N,k denotes the i-th repetition of an
experiment on an N × N grid using k robots.
Path Optimization
A spanning tree construction algorithm, Create Tree. This algorithm creates
spanning trees while considering the initial location of all robots in the team and
the objective described above, i.e., it tries to minimize the maximal distance
between any two adjacent robots on the tree. The general algorithm, is composed
of two stages. In the first stage, a subtree is created gradually for each robot
starting from the initial position of the robot, such that in each cycle either one or
two cells are added to each subtree. In the second stage of the algorithm, after the k
subtrees have already been generated they only have to be connected.
_
Algorithm 3 Create Tree
1: Build k subtrees as follows.
2: for every robot Ri , 1 ≤ i ≤ k do
3: for each possible next cell (up, down, right, left) do
4: Compute the Manhattan distance from the current location of all other
robots.
5: if more than one possible next move exists then
6: pick the one whose minimal distance to any other robot is maximized.
7: if there is no next possible move then
8: Perform Hilling Procedure from the last main branch.
13
9: if failed to find an unoccupied cell in Hilling then
10: Branch-out in a BFS manner from the main branch
11: Find all possible bridges between the k trees.
12: for i = 0 to max{k α , N} do
13: At random, find a valid set of bridges Bi between trees such that they create
one tree with all N vertices.
14: Compute the set Si of distances between every two consecutive robots on the
tree.
15: Best Result is initialized with S0.
16: if the maximal value in Si is lower than the maximal value in Best Result
then
17: Best Result ← Si.
18: Return the tree associated with Best Result.
Hilling
When each robot has finished covering a cell, it will communicate the new graph
information to the rest of the team. The robot will synchronise its own graph with
the new information whenever it receives the new graph information. As a result,
everyone in the team has a shared global perspective on the situation of the globe.
They are able to coordinate their efforts in order to accomplish the coverage in the
lowest amount of time. In the event of attrition, the remaining crew can continue to
cover the area without interruption. However, if communication is lost, coverage
will be repeated.
14
Testing
Single Robot Coverage without obstacles and with obstacles( eg. a wall).
While game theory focused on covering the problem as a distributed graph with
vertices as targets, swarm intelligence approach takes in quality factors of the
targets to be visited and then chooses high utility targets. Although the utility
calculation remains same for both, the factors affecting utility changes. Game
theoretical approach also introduced a randomization factor also known as log
linear learning, while ant colony optimization approaches the paths with
pheromone intensity.
The implementation of the problem was carried out with a general algorithm for
maximising utility of each agent and then followed by optimization of the path
with the use of Random Tree algorithms and the reducing the total distance
between 2 point of the spanning tree using hilling and BFS.
Among others, future work can focus on overcoming weaknesses such as: the
absence of studies on scalability, flexibility, resource utilization, interference,
communication load or workload among robots when performing the patrolling
task; the lack of experimental work using teams of robots in real scenarios;
simplifications and infeasibility of simulation approaches towards real life
experiments; lack of comparisons of the actual time spent between different
strategies in their patrolling cycles; lack of diversity and classification of the
environments tested and the deterministic nature of many centralized approaches.
Further optimization in the system can also be achieved by narrowing down the
application and the terrain. This project offers a general solution to all the
patrolling scenarios and is therefore suited to a large set of problems. By
specifying the specific constraints, it is possible to make this dynamic patrolling
solution more optimal for the selected purpose.
The work has been shown in the form of images, programs and algorithms that
were generated, tested and utilized for the development of this project. This project
was completed under the guidance of Dr Moon K Chetry, scientist at CAIR,
DRDO.
REFERENCES
16
1. Khan, M. E. 2007. Game theory models for pursuit evasion games. Technical re-
port, University of British Columbia, Vancouver.
2. Erik Hernández, Jaime del Cerro and Antonio Barrientos: Game Theory Models
for Multi-Robot Patrolling of Infrastructures in International Journal of Ad-
vanced Robotic Systems, 14 June 2012.
3. Aguirre, O., Llausas, R., Lucero, C., Taboada, H., Espiritu, J. and Kiekintveld, C.
(2011). A multi‐ objective evolutionary algorithm for intelligent patrolling, In
Proceedings of the 41st International Conference on Computers & Industrial
Engineering.
4. Yehuda Elmaliach, Noa Agmon and Gal A. Kaminka for Multi-Robot Area Patrol
under Frequency Constraints, 2007 IEEE International Conference on Robot-
ics and Automation Roma, Italy, 10-14 April 2007.
5. Avinash Gautam, Bhargav Jha, Gourav Kumar, J. Krishna Murthy, SP Arjun Ram,
Sudeept Mohan 6: Synchronous Frontier Allocation for Scalable Online Multi-
Robot Terrain Coverage in Springer Science+Business Media Dordrecht 2016
6. Zhi Yan, Nicolas Jouandeau and Arab Ali Cherif: A Survey and Analysis of Multi-
Robot Coordination for International Journal of Advanced Robotic Systems.
17