A Three-Layer Approach to Testing of Multi-Agent Systems
Tomas Salamon
Department of Information Technologies, University of Economics, Prague, Czech Republic. salamont@vse.cz
1. Introduction
On the first layer we test the individual agents. Coelho et al. [5] proposed a method based on Mock Agents, an adaptation of the Mock Object concept of Mackinnon et al. [11] from object-oriented programming. A Mock Agent is a dummy agent without inner functionality, but with a proper interface for communicating with the tested agent. It can send messages to and receive messages from the tested agent, so a programmer can evaluate whether the tested agent's reaction is correct. During testing we can also observe the internal states of the tested agent using tools such as an agent inspector or logging, if the agent framework provides them.
The actual testing process is as follows: the testing environment creates the tested agent, the mock agents (Coelho et al. [5] recommend creating one mock agent for every role of the tested agent) and all other components of the system, and starts a test case in which the tests to perform are defined.
The mock agents then send messages to the tested agent according to the test plan and receive its responses. The tested agent is monitored by an agent monitor (if such a tool is present in the system), which logs its activity for the test case. The test case compares the records with the plan to find out whether the agent's behavior is correct.
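The process above can be sketched as follows. The sketch is purely illustrative: the agent classes, the `on_message` interface and the price-quoting scenario are our assumptions, not the API of any particular agent framework.

```python
class MockAgent:
    """Dummy agent without inner functionality: it delivers messages to
    the tested agent and records every exchange so the test case can
    evaluate the reactions afterwards."""
    def __init__(self):
        self.log = []  # list of (message sent, reply received)

    def send(self, agent, message):
        reply = agent.on_message(message)  # deliver and capture the reaction
        self.log.append((message, reply))
        return reply

class SellerAgent:
    """A trivial agent under test: it quotes a fixed price on request."""
    def __init__(self, price):
        self.price = price

    def on_message(self, message):
        if message == "request-quote":
            return f"quote:{self.price}"
        return "not-understood"

def test_seller_quotes_price():
    mock = MockAgent()              # one mock per role of the tested agent
    seller = SellerAgent(price=42)  # the test case creates the tested agent
    assert mock.send(seller, "request-quote") == "quote:42"
    assert mock.send(seller, "gibberish") == "not-understood"

test_seller_quotes_price()
print("mock-agent test case passed")
```

The mock's log allows the test case to compare the recorded conversation with the test plan after the run.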
Although it is seldom feasible to test all possible situations, because the agent's state space is too large, all relevant types of situations, both desirable and undesirable, should be tested.
A drawback of this approach is that it is heavily shaped by its object-oriented ancestor. In an object-oriented world, there is no uncertainty about the result of an operation (a method call). Apart from errors and program states covered by exceptions, the behavior of an object is fully predictable: an object cannot reject a request to perform a required action. An agent can; it may refuse, or simply ignore, an action it is asked to perform. Although many multi-agent systems are based on simple, reactive agents that possess no artificial intelligence at all and merely react to information from their surroundings, agents are often defined as entities with their own intelligence and autonomous decision-making ability [21]. Their behavior therefore need not be fully predictable, and we should not test such agents in a traditional, discrete way. A stochastic approach should be adopted instead.
One could object that although the agent's behavior as a black box may look ambiguous and unpredictable from the outside, there would be no uncertainty if we could monitor the agent "inside" and watch its internal states (so-called glass-box testing).
Nonetheless, agent behavior can still be inherently random in some cases. A good example is agent-based simulation in the social sciences, where agents' internal states are often represented by random variables drawn from various probability distributions. Another example of inherent randomness in agents' behavior arises in certain methods of deadlock resolution. Consider a traffic-management multi-agent simulation in which three car-agents arrive at an intersection of three roads from three directions at once, with no priority sign. This is a classic deadlock, and one of the solutions is for one of the cars simply to drive away after a random time period. If the agents' decision-making were perfectly deterministic, all three cars would either stay there forever or always start driving at once, and their collision would be inevitable.
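A minimal sketch of such randomized deadlock resolution follows; the function and the car names are ours and purely illustrative:

```python
import random

def resolve_intersection(cars, max_wait=10, seed=None):
    """Each car-agent draws a random waiting time; the car with the
    shortest wait drives away first, breaking the symmetry that a
    deterministic rule would preserve."""
    rng = random.Random(seed)
    waits = {car: rng.randint(1, max_wait) for car in cars}
    # Sort by waiting time; equal waits are broken by a second random draw.
    return sorted(cars, key=lambda car: (waits[car], rng.random()))

print(resolve_intersection(["car-A", "car-B", "car-C"], seed=1))
```

Because the departure order is random, repeated runs (with different seeds) leave the intersection in different but always conflict-free orders.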
For the reasons above, we propose using stochastic testing where suitable. The main difference between traditional and stochastic testing is that in "normal" unit testing the test case is carried out only once and is considered successful if all the tests in it pass. In stochastic testing, the test is repeated many times and its results are recorded. The test is considered successful when the empirical distribution of the test results fits the expected distribution function. The results are evaluated by statistical means (the Kolmogorov–Smirnov goodness-of-fit test is one possibility). By definition, the results of such a test cannot be 100% reliable, but neither can the results of "traditional" unit testing. The required level of significance can be controlled by adjusting the number of repetitions.
A more detailed description of the method of stochastic agent unit testing is as follows:
Step 1: A testing platform and all its components are created in a test suite and a test case is launched.
Step 2: The test case creates the mock agent(s) and the tested agent.
Step 3: The mock agents start to communicate with the tested agent and the resulting states are recorded. Step 3 is repeated multiple times (the number of repetitions depends on the required level of significance).
The product of such testing is an empirical distribution of results. The test environment performs a goodness-of-fit test to determine whether the expected theoretical distribution matches the empirical distribution of the test results.
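The procedure can be sketched as follows, with the one-sample Kolmogorov–Smirnov statistic computed directly and compared with its asymptotic critical value. The tested agent is stood in for by a plain random generator, and all names are illustrative:

```python
import math
import random

def ks_statistic(samples, cdf):
    """One-sample Kolmogorov-Smirnov statistic: the largest gap between
    the empirical distribution of the results and the expected CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        fx = cdf(x)
        d = max(d, abs((i + 1) / n - fx), abs(fx - i / n))
    return d

def stochastic_test(run_once, cdf, repetitions=1000, alpha=0.05):
    """Repeat the test case many times; accept if the empirical
    distribution fits the expected one at significance level alpha
    (asymptotic critical value of the KS statistic)."""
    results = [run_once() for _ in range(repetitions)]
    critical = math.sqrt(-0.5 * math.log(alpha / 2)) / math.sqrt(repetitions)
    return ks_statistic(results, cdf) <= critical

# Hypothetical agent whose observed decision variable should be Uniform(0, 1):
agent_behavior = random.Random(42).random
print("stochastic test passed:", stochastic_test(agent_behavior, cdf=lambda x: x))
```

In practice `run_once` would execute one full conversation between the mock agents and the tested agent and return the resulting state.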
We see two possible drawbacks in this approach. First, it can sometimes be difficult to devise the theoretical distribution function of the test results against which the empirical one is compared. Second, the speed of such testing is substantially lower than that of the traditional, discrete approach.
The testing approach based solely on mock agents is suitable for multi-agent systems in which the agents' only means of information gathering is their mutual communication. This assumption holds for most agent-based simulations in the social sciences, where there is no outer environment for the agent to perceive and the only information it can gain comes from the other agents. In some other types of multi-agent systems this is not the case. For example, agents in multi-agent systems for traffic-management simulation [7] not only communicate with one another but also gather percepts from their surroundings.
Consider a traffic-management simulation containing various vehicles, roads, buildings and other entities. There are two ways of designing such a model. One possibility is to represent all entities in the simulation by agents. Another is to embody only the entities of interest (e.g., cars and other vehicles) as agents and make the rest part of the environment. In the second case, agents gain information not only through mutual communication but also by perceiving the environment. Besides their messaging ability, they therefore need some perception function, which must be taken into account in the test environment. In our hypothetical traffic-management simulation, the perception function should, for example, allow a simulated car-agent to watch the road and nearby buildings.
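A minimal sketch of such a perception function, with a scripted environment standing in for the real one; the `Environment` and `CarAgent` classes and the grid-based `perceive` query are our illustrative assumptions:

```python
class Environment:
    """A scripted fragment of the environment that answers an agent's
    perception queries; positions map to object names."""
    def __init__(self, objects):
        self.objects = objects  # e.g. {(0, 1): "building", (5, 5): "road"}

    def perceive(self, position, radius=1):
        """Return every object within `radius` (Chebyshev distance)."""
        x, y = position
        return {pos: obj for pos, obj in self.objects.items()
                if max(abs(pos[0] - x), abs(pos[1] - y)) <= radius}

class CarAgent:
    """Agent under test: besides messaging, it perceives its surroundings."""
    def __init__(self, position):
        self.position = position

    def look_around(self, env):
        return env.perceive(self.position)

env = Environment({(0, 1): "building", (5, 5): "road"})
car = CarAgent(position=(0, 0))
print(car.look_around(env))  # only the adjacent building is within range
```

In a test, the environment is scripted exactly like a mock agent's replies, so the car-agent's reaction to each percept can be checked deterministically.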
Even if we were able to test the individual agents perfectly, the whole multi-agent system could still be erroneous. A multi-agent system is not just the aggregate of its parts; it also possesses emergent properties, which begin to appear as soon as two or more agents interoperate. For this reason, failures can show up even in a system composed of 100% error-free agents. Typical examples of errors of this kind are deadlocks, livelocks and various other kinds of conflicts.
A deadlock is a state of the system in which two or more concurrent actions sharing a common resource each wait for the other to finish, which never happens. It was first discussed in the context of multitasking operating systems [6], but it is a pervasive phenomenon in most types of distributed systems, including multi-agent systems. A livelock is a similar state in which the actions involved hand activity to one another forever without making any progress.
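The classic two-resource deadlock can be reproduced in a few lines. The sketch below uses plain threads and locks in place of agents; a timeout and a barrier are added only so the program terminates and the circular wait can be observed deterministically:

```python
import threading

lock_a, lock_b = threading.Lock(), threading.Lock()
both_ready = threading.Barrier(2)   # makes the circular wait reproducible
progress = []

def agent(own, other):
    """Acquire one resource, then wait for the resource the other agent
    holds. Without the timeout, both agents would block forever."""
    with own:
        both_ready.wait()                     # both agents now hold one lock
        got_other = other.acquire(timeout=0.5)
        if got_other:
            other.release()
        progress.append(got_other)

threads = [threading.Thread(target=agent, args=(lock_a, lock_b)),
           threading.Thread(target=agent, args=(lock_b, lock_a))]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("all agents made progress:", all(progress))  # False: circular wait
```

In a multi-agent system, the "locks" are typically pending requests in a negotiation, which is what makes deadlocks so much harder to spot than in this toy example.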
Testing for these situations is a challenging task because of their poor predictability. In the literature we can find a few works dealing with this specific problem in multi-agent systems. Burkhard [4] discusses the problem of deadlocks from a theoretical standpoint. Poutakidis et al. [15] suggest a method of testing agent interaction protocols by transforming them into Petri nets. Unfortunately, this does not solve the particular problem of deadlocks and similar situations.
We suggest another method of detecting problems on the second layer. This method is likewise based on a stochastic concept, so its results cannot be guaranteed. However, as deadlocks are unpredictable, it is still probably the most feasible, and quite a simple, solution.
It is based on watching and logging the functioning of the running multi-agent system. We create a table with one line for every type of agent interaction in the system. Every agent is able to form a relationship with at least one other agent. By an interaction we mean a single negotiation scenario, i.e., a set of requests and answers aimed at solving one type of issue with one other agent (in UML, such a scenario can be depicted with a sequence diagram). Due to the limitations of artificial intelligence, agents are still unable to negotiate and collaborate with other agents arbitrarily; they always use scenarios that are "hardcoded" into their program. There are typically more interaction types than agents, although this need not always hold.
We log every executed interaction in the running system (see Fig. 1); an interaction is defined by a source agent, a destination agent and a scenario. Every interaction has a score. When an interaction occurs, we examine the table to see whether it is already recorded. If so, we increase its score by a so-called s-value; otherwise we add it to the table with a score equal to the s-value (an s-value of 2 is used in Fig. 1). After every simulation round, we decrease the score of every interaction by 1. The s-value is an integer determining the sensitivity of the test; we have observed the best results with s-values between 2 and 4. In Fig. 1 we can see an example with communication between A1–B1, A1–A2 and A2–C1. In the case of A1–B1, there was communication in the two preceding rounds as well. In the case of B1–C1, there was communication in the previous round but none in the current round.
The interactions with the highest (and growing) scores are candidates for deadlocks. We can simply transform this table into a graph in which the agents are nodes, the interactions are edges and every edge is weighted by its score. Cycles whose edges carry high and growing scores indicate probable spots of deadlock.
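The scoring scheme can be sketched as follows; the class and method names are ours, and the reporting threshold is an illustrative parameter:

```python
from collections import defaultdict

class InteractionMonitor:
    """Score table for deadlock detection: an interaction (source,
    destination, scenario) gains s_value when logged, and every recorded
    interaction decays by 1 after each simulation round."""
    def __init__(self, s_value=2):
        self.s_value = s_value        # sensitivity; best results with 2 to 4
        self.scores = defaultdict(int)

    def log(self, source, destination, scenario):
        self.scores[(source, destination, scenario)] += self.s_value

    def end_round(self):
        for key in list(self.scores):
            self.scores[key] -= 1
            if self.scores[key] <= 0:
                del self.scores[key]  # forget interactions that went quiet

    def candidates(self, threshold=5):
        """Interactions with high (and growing) scores: deadlock suspects."""
        return [k for k, v in self.scores.items() if v >= threshold]

mon = InteractionMonitor(s_value=2)
for rnd in range(5):
    mon.log("A1", "B1", "negotiate")    # A1 and B1 ping each other every round
    mon.log("B1", "A1", "negotiate")
    if rnd == 0:
        mon.log("A2", "C1", "one-off")  # a single, harmless interaction
    mon.end_round()
print(mon.candidates())  # only the repetitious A1-B1 exchanges remain
```

With s-value 2, an interaction repeated every round gains a net +1 per round, so persistent cycles accumulate a growing score while one-off interactions decay to zero and drop out of the table.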
If deadlock candidates are found this way, developers should examine the agents involved more deeply to reveal the particular cause of the deadlocks and make the agents more resistant to them.
On the third layer we test problems concerning the entire system. Even after we have uncovered all the errors in the individual agents and in agent interactions, there can still be problems of the whole system that emerge only when the system is populated with a multitude of agents.
Although multi-agent systems are distributed systems, they often contain central points that can turn into bottlenecks. These points can be messaging services, yellow-pages services (searching for a particular agent according to its characteristics), white-pages services (mapping symbolic names to the addresses of particular agents) and others. Such services may work perfectly with individual agents, or even with a certain number of agents, but as more and more agents are added, the load on these "hot spots" can slow down the entire system.
There are three main reasons for these problems. First, the services themselves may be improperly designed. If the bottleneck is caused only by this, the solution is typically feasible and relatively undemanding: we optimize the code of the service by common means.
Second, the bottleneck can be caused by improper design of the agents. Typically this problem occurs when agents send many more requests to the service than is necessary for the purposes of the agent-based model or multi-agent system [16]. We encountered this problem in research on a particular agent-based simulation. We designed an agent-based model with agents representing people and businesses in a virtual economy, in which businesses hired employees. In the first version of the system, during recruitment, business-agents tried to send their offers to all people-agents in the simulation. Despite the relatively low numbers of businesses (1,000) and people (10,000) in the model, the number of messages after the first simulation round was about 10^7, as every business sent a message to every person. The messaging service was flooded with requests and the entire system almost stopped. The problem was indeed in the design of the agents. In fact, there is no need to send so many messages, because sending a message to every single agent in the system amounts to a state of perfect information that is not possible in reality. Individuals cannot communicate perfectly: they are limited by resources (at least by time constraints), so they can send no more than a certain number of messages in each simulation round. After implementing such constraints in our multi-agent system, the number of messages in this particular simulation plummeted to about 5,000, which was manageable without any impact on system function. Sometimes, however, the developer needs to distribute information in such a massive manner; it is then better to use another kind of service that is more suitable for mass use (multicast).
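The effect of such a constraint can be illustrated with a sketch of the recruitment example; the numbers follow the text above, while the function and its parameters are hypothetical:

```python
import random

def recruit(businesses, people, budget_per_round=5, seed=0):
    """Each business contacts at most `budget_per_round` randomly chosen
    people per round instead of messaging everyone; returns the number
    of messages sent."""
    rng = random.Random(seed)
    sent = 0
    for _business in businesses:
        targets = rng.sample(people, min(budget_per_round, len(people)))
        sent += len(targets)  # a real system would deliver offers here
    return sent

businesses = ["b%d" % i for i in range(1000)]
people = ["p%d" % i for i in range(10000)]
print("unconstrained:", len(businesses) * len(people))  # 10**7 messages
print("with budget:", recruit(businesses, people))      # 5,000 messages
```

The message count drops from O(B x P) to O(B x budget), which is what turned 10^7 messages per round into about 5,000 in our simulation.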
Finally, if the problem lies neither in the code of the bottleneck nor in the code of the agents, but the entire system is so huge that bottlenecks are simply unavoidable, the only solution is to change the architecture of the system, e.g., to adopt a more advanced distributed architecture and split the load among multiple hubs. This kind of solution is typically the most difficult.
Testing of the multi-agent system on the third layer should focus mainly on measuring the performance of the system. Possible measures include the duration of one simulation round, its variance over time, the speed of message delivery and the length of message queues.
Another issue that should be tested on the third layer is the stability of the system as a whole when a great number of agents crash at once. A multi-agent system is a regular piece of software and can fail like any other system. If the program is correctly designed, the crash of agents should not have any impact on the system as a whole. Most multi-agent environments are able to absorb the crash of one or a few agents, but when a massive collapse of a great number of agents occurs, the environment may not absorb it. The problem often lies in handling a large number of exceptions, which is typically highly resource-consuming.
5. Conclusion
On the first layer, the individual agents are tested. Where an agent's behavior is stochastic, the test must be repeated many times, and for the outcome we need to compare its empirical results with those expected on the basis of the theoretical probability distribution function of the agent.
Multi-agent systems in which agents perform some kind of perception of the environment are not fully testable solely by the use of mock agents. We therefore propose the principle of a "stage": an environment, or part of an environment, for agent testing. However, using stages is problematic because of their complexity and the fact that they can become a source of additional failures. In the future we plan to elaborate the concept of stages in greater detail.
On the second layer, we search for errors arising from agent interactions that could not be found on the first layer. The most common errors of this kind are deadlocks and similar phenomena; the detection method we propose is based on searching for suspiciously repetitious interactions among a small group of agents. Our future work focuses on ways in which agents could detect and escape deadlocks automatically.
On the third layer we test errors of the entire system that could not be uncovered on the lower layers. Despite their distributed character, multi-agent systems often contain high-load hubs and central points that can become bottlenecks and slow down the entire system. Future work should address distributed architectures of multi-agent environments.
Furthermore, we recommend stress testing of the system in order to determine its behavior and stability in the case of a mass collapse of a great number of agents. Even in such a situation, the multi-agent system should remain stable, without substantial impact on its performance or global behavior.
The individual methods of our approach were developed during several agent-based simulation projects and have been used with favorable results. Future work should deepen the testing methodology and evaluate it further. Although the approach was devised primarily for agent-based simulations, it is general and can be used in various types of multi-agent systems.
6. Acknowledgements
This paper is supported by the European Union within the framework of Unified Program Document 3 under the European Social Fund, together with the Czech Republic and the City of Prague.
References
Workshop on Intelligent Agents VII. Agent Theories Architectures and Languages, LNAI,
Springer Verlag, London, pp. 89–103.
3. Bresciani, P., Giorgini, P., Giunchiglia, F., Mylopoulos J. & Perini, A. (2002) Tropos: An
Agent-oriented software development methodology. Autonomous Agents and Multi-Agent
Systems 8, pp. 203–236.
4. Burkhard, H. (1993) Liveness and Fairness Properties in Multi-Agent Systems. In: Bajcsy, R.
(Ed.), Proceedings of the 13th International Joint Conference on Artificial Intelligence
(IJCAI 93), Morgan Kaufmann Publ., pp. 325–330.
5. Coelho, R., Kulesza, U., Staa, A. & Lucena, C. (2006) Unit Testing in Multi-agent Systems
using Mock Agents and Aspects. In: Proceedings of the 2006 international workshop on
Software engineering for large-scale multi-agent systems, ACM, New York, pp. 83–90.
6. Coffman, E.G., Elphick, M. & Shoshani, A. (1971) System Deadlocks. Computing Surveys 2, pp. 67–78.
7. Dresner, K. & Stone, P. (2006) Multiagent Traffic Management: Opportunities for Multiagent Learning. In: Tuyls, K. (ed), LAMAS 2005, LNAI, Springer Verlag, Berlin, pp. 129–138.
8. Gatti, M.A.C. & Staa, A.v. (2006) Testing & debugging multi-agent systems: a state of the
art report. Departamento de Informática, PUC-Rio, Rio de Janeiro.
9. Jonker, C.M. & Treur, J. (1998) Compositional Verification of Multi-Agent Systems: a For-
mal Analysis of Pro-activeness and Reactiveness. In: Langmaack, H., Pnueli, A. &
De Roever, W.P. (eds), Proceedings of the International Workshop on Compositionality,
COMPOS'97, LNCS, Springer Verlag, pp. 350–380.
10. Liedekerke, M. & Avouris, N. (1995) Debugging multi-agent systems. Information and Soft-
ware Technology 37(2), pp. 103–112.
11. Mackinnon, T., Freeman, S. & Craig, P. (2000) EndoTesting: Unit Testing with Mock Ob-
jects. In: eXtreme Programming and Flexible Processes in Software Engineering
- XP2000, May 2000.
12. Ndumu, D., Nwana, H., Lee, L. & Collins, J. (1999) Visualising and debugging distributed
multi-agent systems. In: Proceedings of the Third Annual Conference on Autonomous Agents,
pp. 326–333.
13. Omicini, A. (2000) SODA: Societies and Infrastructures in the Analysis and Design of
Agent-Based Systems. In: First international workshop, AOSE 2000 on Agent-oriented soft-
ware engineering, Springer-Verlag New York, pp. 185–193.
14. Padgham, L. & Winikoff, M. (2002) Prometheus: A Methodology for Developing Intelligent
Agents. In: Proceedings of the first international joint conference on Autonomous agents and
multiagent systems: part 1. July 15–19, 2002, Bologna, Italy.
15. Poutakidis, D., Padgham, L. & Winikoff, M. (2002) Debugging Multi-Agent Systems Using
Design Artifacts: The Case of Interaction Protocols. In: Proceedings of the First Interna-
tional Joint Conference on Autonomous Agents and Multi Agent Systems, AAMAS'02. July
15–19, 2002, Bologna, Italy.
16. Salamon, T. (2008) Dealing with Complexity in a Multiagent System (in Czech). In: Fischer, J. (ed), Proceedings of 12th annual workshop, University of Economics, Prague, pp. 60–68.
17. Sycara, K. (1998) Multiagent systems. AI Magazine 19(2): Summer, pp. 79–92.
18. Tesfatsion, L. (2002) Agent-Based Computational Economics: Growing Economies from the Bottom Up. ISU Economics Working Paper No. 1, Iowa State University.
19. Tiryaki, A.M., Oztuna, S., Dikenelli, O. & Erdur, R.C. (2006) SUNIT: A Unit Testing
Framework for Test Driven Development of MASs. In: L. Padgham & F. Zambonelli (eds),
AOSE 2006, LNCS, Springer Verlag, pp. 156–173.
20. Wooldridge, M., Jennings, N. & Kinny, D. (2000) The Gaia Methodology for Agent-Oriented Analysis and Design. Journal of Autonomous Agents and Multi-Agent Systems 3, pp. 285–312.
21. Wooldridge, M. (2002) An introduction to multiagent systems. John Wiley & Sons, Chichester.