
A Puzzle-Based Defense Strategy Against

Flooding Attacks Using Game Theory


Mehran S. Fallah
Abstract: In recent years, a number of puzzle-based defense mechanisms have been proposed against flooding denial-of-service (DoS) attacks in networks. Nonetheless, these mechanisms have not been designed through formal approaches, and thereby some important design issues such as effectiveness and optimality have remained unresolved. This paper utilizes game theory to propose a series of optimal puzzle-based strategies for handling increasingly sophisticated flooding attack scenarios. In doing so, the solution concept of Nash equilibrium is used in a prescriptive way, where the defender takes his part in the solution as an optimum defense against rational attackers. This study culminates in a strategy for handling distributed attacks from an unknown number of sources.

Index Terms: Client-puzzle approach, flooding DoS attack, game theory, reliability, availability, and serviceability.

1 INTRODUCTION

AVAILABILITY of services in a networked system is a security concern that has received enormous attention in recent years. Most research in this area is on designing and verifying defense mechanisms against denial-of-service (DoS) attacks. A DoS attack is characterized by a malicious behavior that prevents the legitimate users of a network service from using that service. There are two principal classes of these attacks: flooding attacks and logic attacks [1], [2], [3].
A flooding attack such as SYN flood [4], Smurf [5], or TFN2K [6] sends an overwhelming number of requests for a service offered by the victim. These requests deplete some key resources at the victim so that the legitimate users' requests for the same service are denied. A resource may be the capacity of a buffer, CPU time to process requests, the available bandwidth of a communication channel, etc. The resources exhausted by a flooding attack revive when the attack flood stops. A logic attack such as Ping-of-Death [7] or Teardrop [8] forges a fatal message that is accepted and processed by the victim's vulnerable software and leads to resource exhaustion at the victim. Unlike flooding attacks, the effects of a logic attack remain after the attack until some appropriate remedial actions are adopted. A logic attack can be thwarted by examining the contents of received messages and discarding the unhealthy ones. This is due to the fact that an attack message differs from a legitimate one in contents. In flooding attacks, on the contrary, such a distinction is not possible, which makes defense against flooding attacks an arduous task. This paper will focus solely on flooding attacks.
A large number of defenses have been devised against
flooding attacks. According to [9], a defense mechanism
may be a reactive or preventive one. A reactive mechanism
such as pushback [10], traceback [11], [12], [13], [14], [15], or
filtering [16], [17], [18] endeavors to alleviate the impact of a
flooding attack on the victim by detecting the attack and
responding to it. A preventive mechanism, on the other
hand, enables the victim to tolerate the attack without
denying the service to legitimate users. This is usually done
by enforcing restrictive policies for resource consumption
[19], [20], [21]. A method for limiting resource consumption
is the use of client puzzles [22], [23], [24], [25], [26], [27], [28].
In general, reactive mechanisms suffer from the scalability problem and the difficulty of attack traffic identification. This is not the case in the client-puzzle approach, where the defender treats incoming requests similarly and need not differentiate between attack and legitimate requests. Upon receiving a request, the defender produces a puzzle and sends it to the requester. If it is answered by a correct solution, the corresponding resources are then allocated. As solving a puzzle is resource consuming, the attacker who intends to use up the defender's resources by his repeated requests is deterred from perpetrating the attack.
Nonetheless, an attacker who knows the defender's possible actions and their corresponding costs may rationally adopt his own actions to defeat a puzzle-based defense mechanism. For example, if the defender produces difficult puzzles, the attacker responds to them at random and with incorrect solutions. In this way, he may be able to exhaust the defender's resources engaged in solution verification. If the defender produces simple puzzles, the mechanism is not effective in the sense that the attacker solves the puzzles and performs an intense attack. Moreover, even if the defender enjoys efficient low-cost techniques for producing puzzles and verifying solutions, he should deploy effective puzzles of minimum difficulty levels, i.e., the optimum puzzles, to provide the maximum quality of service for the legitimate users. Hence, the difficulty level of puzzles should be accurately adjusted in a timely manner to preserve the effectiveness and optimality of the mechanism. Although some mechanisms such as [27] and [28] have attempted to adjust the difficulty level of puzzles according to the victim's load, they are not based on a suitable formalism incorporating the above trade-offs and, therefore, the effectiveness and optimality of those mechanisms have remained unresolved.
The above issues indicate that a puzzle-based defense
mechanism involves antagonistic elements and, therefore, it
can be effectively studied using game theory. In this paper,
The author is with the Department of Computer Engineering, Amirkabir University of Technology (Tehran Polytechnic), 15914 Tehran, Iran. E-mail: msfallah@aut.ac.ir.
it is shown that the interactions between the attacker who perpetrates a flooding attack and the defender who counters the attack using a puzzle-based defense mechanism can be modeled as a two-player infinitely repeated game with discounting. The solution concept of perfect Nash equilibrium is then applied to the game. This leads to the description of the players' optimal strategies.
This paper uses the concept of Nash equilibrium not only in a descriptive way but also in a prescriptive one. In doing so, the difficulty level of puzzles, the random number generators, and the other parameters of a puzzle-based defense are adjusted so that the attacker's optimum strategy, prescribed by the Nash equilibrium, does not lead to the exhaustion of the defender's resources. If the defender takes his part in the Nash equilibrium prescription as his defense against flooding attacks, the best thing for the attacker to do is to conform to the prescription as well. In this way, the defense mechanism is effective against the attack and provides the maximum possible payoff for the defender. In other words, the defense mechanism is optimal. This notion is applied to a series of increasingly sophisticated flooding attack scenarios, culminating in a strategy for handling a distributed attack from an unknown number of sources. It is worth noting that two-player game models can also be used in the case of distributed attacks, where the attackers are modeled as a single player with the capabilities of the attack coalition.
In recent years, some researchers have utilized game-based approaches to describe, but not to design, puzzle-based defense mechanisms. Bencsath et al. have used the model of a single-shot strategic-form game to find the defender's equilibrium strategy [29]. As proposed in the relevant literature, e.g., [25], [27], and [28], the defender, at a time, may choose his action according to what he knows about the attacker's previous actions. These history-dependent strategies cannot be modeled by a single-shot strategic-form game. The problem of finding optimum puzzles is not addressed in their research either.
In [30], Mahimkar and Shmatikov have used the logic ATL and the computation model ATS to analyze the puzzle-based mechanisms proposed in [27] and [28]. In ATL and ATS, the notions of players, strategies, and actions are added to traditional branching-time temporal logic in such a way that the requirements for a general open system can be specified and verified. Such an approach is useful in verifying the desired properties of a given mechanism, but it can barely be used in designing. By ATL and ATS, it can be checked whether the defender has a winning strategy for the game or not, while the optimality of a given strategy cannot be decided.
The current paper employs the solution concepts of infinitely repeated games to find the defender's optimum history-dependent strategies against rational attackers. In this way, it resolves the deficiencies stated above.
This paper goes on as follows: Section 2 provides a model of networked systems. Section 3 identifies the games well suited to modeling the possible interactions between a defender and an attacker in a flooding attack-defense scenario. Section 4 describes the game of the client-puzzle approach in detail. Section 5 explains the technique of designing puzzle-based defense mechanisms using game-theoretic solution concepts. Section 6 discusses the defense mechanisms proposed in this paper and compares them with the earlier puzzle-based defenses. It also outlines future research in the game-theoretic study of the client-puzzle approach. Section 7 concludes the paper. The two appendices give the proofs of the theorems stated in this paper.
2 A NETWORK MODEL
This section introduces a formal model of networked
systems, which helps us identify the games reflecting
possible interactions between a defender and an attacker
in a flooding attack-defense scenario. It is also deployed in
the formal specification of games.
A network consists of two principal classes of elements,
active entities and resources. An active entity is an abstraction
of a program, process, or set of processes acting on behalf of
human beings. In this sense, an active entity is capable of
choosing different actions at different times. The interpretation of a resource may be the capacity of a temporary buffer or a long-term memory, CPU time to process requests, the throughput of the data bus in a client's hardware, the bandwidth of a communication channel, and so forth. When an active entity performs an action, it engages a number of resources.
Each resource is associated with a variable of a certain domain, which represents the currently available amount of that resource. At a time, the state of the network is characterized by the values taken by the resource variables. At a state, each active entity has a nonempty finite set of actions available to him, his action space at that state, from which he can choose one action. If there are $n$ active entities, the chosen actions can be described by an $n$-tuple, called an action profile. When an action profile is picked, a state transition takes place. An execution of a networked system begins with an initial state and proceeds by traversing a number of states according to the action profiles chosen at those states.

In the case of flooding attacks, the set of active entities comprises legitimate users, attackers, and the defender. A defense mechanism against flooding attacks should enforce those executions in which the legitimate users' requests are successfully served. The formal model of a network is given below.
Definition 1. A network is a septuple $\mathcal{N} = (E, R, \rho, Q, \Lambda, \varphi, \tau)$, where we have the following:

. $E$ is a nonempty finite set of active entities.
. $R$ is a nonempty finite set of resources. To each resource $r \in R$, a variable $x_r$ whose domain is a metric space $(X_r, d_r)$ is assigned, where $d_r$ is a metric and returns a positive real value for any pair $(x_1, x_2) \in X_r^2$.
. $\rho$ is a function $\rho : E \to \mathcal{P}(R)$, where $\mathcal{P}(R)$ is the power set of $R$. This function assigns a subset of resources to each active entity.
. $Q$ is the state space that equals the Cartesian product of all resource domains, i.e., $Q = \prod_{r \in R} X_r$. An element of $Q$ is called a state and denoted by $q$.
. $\varphi$ is a function $\varphi : E \times Q \to \mathcal{P}(\Lambda)$, which determines the actions available to an active entity at a state, where $\Lambda$ is the set of all available actions. The set $\Lambda(q) = \prod_{e \in E} \varphi(e, q)$ is the space of action profiles at $q$. The set $\Lambda^* = \bigcup_{q \in Q} \Lambda(q)$ is the overall space of action profiles.
. $\tau$ is a transition function $\tau : Q^+ \to Q$, where $Q^+ = \{(q, a) \in Q \times \Lambda^* \mid a \in \Lambda(q)\}$.
The function $\varphi$ returns a subset of $\Lambda$ as the set of actions available to an active entity at a state. For example, assume $\Lambda = \{a_1, a_2, a_3\}$, $\varphi(e_1, q) = \{a_1, a_2\}$, and $\varphi(e_2, q) = \{a_2, a_3\}$. In such a case, at the state $q$, $e_1$ and $e_2$ can choose their actions from $\{a_1, a_2\}$ and $\{a_2, a_3\}$, respectively. When an action profile $a \in \Lambda(q) = \prod_{e \in E} \varphi(e, q)$ is chosen at $q$, the system moves to $q' = \tau(q, a)$. The metric $d_r$ quantifies changes in $x_r$ such that the amount of resource $r$ spent in the transition from $q$ to $q'$ is $d_r(\langle q \rangle_r, \langle q' \rangle_r)$, where $\langle q \rangle_r$ and $\langle q' \rangle_r$ are the values of $x_r$ at $q$ and $q'$. Furthermore, $\rho(e)$ is the set of resources the active entity $e$ owns or shares with the other entities. Therefore, the cost of an action profile to an active entity is defined in a straightforward manner.
Definition 2. Let $(E \times Q \times \Lambda)^+ = \{(e, q, a) \in E \times Q \times \Lambda^* \mid a \in \Lambda(q)\}$. The cost to an active entity, when an action profile is chosen at a state, is obtained from the function $\kappa : (E \times Q \times \Lambda)^+ \to \mathbb{R}$, which is defined by

$\kappa(e, q, a) = \sum_{r \in \rho(e)} d_r(\langle \tau(q, a) \rangle_r, \langle q \rangle_r) / D^0_r,$   (1)

where $D^0_r$ is the reference distance in $(X_r, d_r)$, and $\langle q \rangle_r$ is the value of $x_r$ at $q$.
By adopting appropriate reference distances, one can compare the costs to active entities in a meaningful manner. Such a comparison is useful, especially in cases where active entities pursue different or opposing goals. This happens in flooding attack-defense scenarios, where the attackers intend to consume the defender's resources as much as possible, while the defender intends to save his resources for legitimate users. Assume the defender can process 10 requests using his main resource in a time period of length 5.0 μs, and the attacker can solve a puzzle in 2.5 μs. Now, consider the action profile in which the defender issues a puzzle and the attacker returns its correct solution. By adopting reference distances 10 for the defender's main resource and 5.0 μs for the resource the attacker engages in solving a puzzle, the costs of this action profile to the defender and to the attacker are 0.1 and 0.5, respectively. Therefore, such a puzzle is effective because 0.1 is less than 0.5.
It is assumed that the transition function is quasi-isometric under the metric spaces in the sense that, for any $q, q' \in Q$ and $a \in \Lambda(q) \cap \Lambda(q')$, we have $d_r(\langle \tau(q, a) \rangle_r, \langle q \rangle_r) = d_r(\langle \tau(q', a) \rangle_r, \langle q' \rangle_r) = D(a, r)$ for any $r \in R$. This assumption is reasonable because the amount of resources consumed by an action profile does not depend on the state at which it is chosen. By this assumption, (1) is reduced to

$\kappa(e, q, a) = \sum_{r \in \rho(e)} D(a, r) / D^0_r = \kappa(e, a).$   (2)

In other words, the cost to an active entity $e$ is the same at different states if the same action profile $a$ is chosen at those states.
3 PREVENTIVE MECHANISMS AND GAME THEORY
Preventive mechanisms against flooding attacks can be effectively studied through game theory. This is mainly owing to the several trade-offs existing in a flooding attack-defense scenario. For an attacker, there is a trade-off between the severity of his attack and the amount of resources he uses to do so; the more damage an attacker intends to cause, the more resources he should spend. For a defender, on the other hand, there is a trade-off between the effectiveness of his defense and the quality of service he provides for legitimate users; the more difficult it becomes to exhaust the defender's resources, the more workload is imposed on legitimate users, and hence, the lower the quality of service. A trade-off also exists between the effectiveness of the defense and the amount of resources a defender expends.
Another reason for using game theory in designing flooding prevention mechanisms is that the underlying assumptions of game theory hold in a network. The main assumption is that the players are rational, i.e., their planned actions at any situation and point in time must actually be optimal at that time and in that situation given their beliefs. This assumption holds in a network, where players are the active entities created and controlled by human beings. Therefore, a defense mechanism that implements the defender's strategy obtained from a game-theoretic approach assures its designer that the best possible sequence of actions is performed against a rational attacker. This would be the case if all the factors affecting the desirability of an action profile for a player were reflected by his payoff function.
In what follows, some fundamental concepts of game
theory such as history and strategy are defined in terms of
the network model in Definition 1. The game model of a
flooding attack-defense scenario is then extracted.
Definition 3. Suppose that $\mathcal{N} = (E, R, \rho, Q, \Lambda, \varphi, \tau)$ is a network, and $q^0 \in Q$ is its initial state. A history at $q \in Q$ is a sequence of action profiles $h_q = \langle a^0, a^1, \ldots, a^n \rangle$ such that there are $q^i \in Q$ for $1 \le i \le n$ with $\tau(q^{i-1}, a^{i-1}) = q^i$ for $1 \le i \le n$ and $\tau(q^n, a^n) = q$. The set of all possible histories at $q$ is called the history space at $q$ and denoted by $H_q$.
If the players, i.e., the active entities, know the past events, and hence, the histories at any state, the game among them is of perfect information, in which a player can condition his action choice at later states on the actions taken by the other players at previous states. Moreover, players may correlate their actions at a state through a public event, which is observable by all of them at that state. In such a case, a player may condition his choice on the action profiles chosen at previous states as well as the public event at the current state. Therefore, a history can be augmented by the sequence of public events occurring at $q^0, q^1, \ldots, q^n$, and $q$. An augmented history and the space of such histories at $q$ are denoted by $h_q^{pe}$ and $H_q^{pe}$, respectively.
If the players cannot perfectly observe the action profiles chosen earlier, the game among them is of imperfect information, in which a player can condition his choice on observed random variables that depend on the action profiles chosen at previous states. More precisely, a player's information is in the form of a combined history, including the histories of his own actions, public events, and private random signals he takes from the actions chosen by the other players at previous states. In such a case, a player may not know the system state at a time, but he knows the number of state transitions, which equals the number of times he has chosen an action. Each of these transitions is called a period. At the beginning of each period, a player would have his own combined history as his belief about the action profiles already chosen and, therefore, he can condition his choice on it. These periods are denoted by
$0, 1, 2, \ldots, t, \ldots$. A combined history and the space of such histories at $t$ for an active entity $e$ are denoted by $h_e^t$ and $H_e^t$, respectively. If the game is of perfect information, $H_e^t = H_q^{pe}$ for any $e \in E$, where $q$ is the state at period $t$.
Definition 4. A pure strategy for an active entity $e \in E$ is a function $s_e$ that maps any period $t$ to a function from the combined history space at that period to the actions available to $e$ therein. Indeed, $s_e(t)$ is a function $s_e(t) : H_e^t \to \varphi(e, q)$, where $q$ is the state at period $t$. The action chosen by $e$ at $t$ is $s_e(t)(h_e^t)$, also written $s_e^t(h_e^t)$, if the combined history at $t$ is $h_e^t$. A mixed strategy for an active entity $e$ at a period $t$ is a function $\sigma_e^t : H_e^t \to \Delta(\varphi(e, q))$, where $\Delta(\varphi(e, q))$ is the set of all probability distributions over $\varphi(e, q)$. The sets of all pure and mixed strategies for $e$ are denoted by $S_e$ and $\Sigma_e$, respectively.
Note that a player knows the actions available to him at any period, even if the game is of imperfect information. Therefore, in Definition 4, $\varphi(e, q)$ still represents the actions available to a player at a given period. If $\varphi(e, q) = \varphi(e, q') = A_e$ for any $q, q' \in Q$, the action profiles are chosen from the same set $A = \prod_{e \in E} A_e$ in the stage-game played at any period. In such a case, the overall game among the players is a repeated game, where the same stage-game is played repeatedly. If it is possible for the game to be continued at any period, it will be an infinitely repeated game.
Now, consider the scenario of launching a flooding attack
in the presence of a puzzle-based defense mechanism. The
attacker repeatedly sends his requests for a service offered by
the defender. Upon receiving a request, the defender chooses
a puzzle from the set of given puzzles and sends it to the
attacker. The attacker then picks one of his available actions.
Such an interaction can be modeled as a single-shot strategic-
form game. As the attacker sends a large number of requests
to the defender, and the action profiles are the same in the
corresponding strategic-form games, the overall game
between the defender and the attacker is a repeated game.
During the attack, the probability of receiving a new request from the attacker is not zero. Therefore, the overall game is an infinitely repeated one.
Moreover, the defender may receive a number of new requests before discerning the attacker's decisions at previous periods. Thus, the game is of imperfect information. Nevertheless, the defender can observe random signals containing uncertain information about the actions chosen by the attacker at previous periods. For example, if a difficult puzzle has not been answered yet, there are two possibilities: either the attacker has decided to solve the puzzle, or he has quit the protocol. Hence, the interactions between an attacker who perpetrates a flooding attack and a defender who counters the attack using a puzzle-based defense are modeled as a two-player infinitely repeated game of imperfect information.
If the game continues at each period with a probability less than unity, it is also of discounted payoffs, where the future payoffs are lowered using a discount factor. The reason for discounting is that the players are unsure about how long the game will continue. In other words, they would prefer a valuable thing in the current period rather than a promise of one in the next because they are not sure it will really come. For example, a defender who knows the game will continue with a great probability may postpone the intense punishment of a prospective attacker for the sake of a higher quality of service for legitimate users. Therefore, the players' beliefs about the likely length of the game affect their decisions at a period. In this sense, the discount factor is interpreted as the probability of the game continuing. In the case of flooding attacks, the discount factor is considered very close to unity because the attacker sends a huge number of requests to the defender as the only way he may use up the defender's resources. Thus, the game will continue with a great probability.
In game theory, a player is defined as one who can adopt different actions in different situations. In the defense mechanisms proposed in this paper, a legitimate user always solves the puzzles and returns their correct answers. Thus, he is not considered a player in the game of a flooding attack-defense scenario. Nevertheless, the defender may mistake the messages sent by a legitimate user for ones sent by the attacker, causing the defender to make wrong decisions. This is dealt with by the mechanisms proposed in this paper.
4 THE GAME OF THE CLIENT-PUZZLE APPROACH
As stated above, a flooding attack-defense scenario is modeled as a two-player infinitely repeated game. Therefore, in the stage-game played at any period $t$, the defender and the attacker, i.e., the active entities $1, 2 \in E$, choose from their action spaces $A_1$ and $A_2$ and cause the game to arrive at period $t + 1$. In the client-puzzle approach, the set of possible actions for the defender is $A_1 = \{P_1, P_2, \ldots, P_n\}$, and the one for the attacker is $A_2 = \{QT, RA, CA\}$. The action $P_i$, $1 \le i \le n$, stands for issuing a puzzle of difficulty level $i$. It is assumed that the puzzle of level $i$ is less difficult than the one of level $j$ if $i < j$. The actions $QT$, $RA$, and $CA$ stand for quitting the protocol (no answer), answering the puzzle randomly, and answering the puzzle correctly. It is also assumed that a legitimate user always solves the puzzles and returns correct answers.
At a period, the attacker knows the action chosen by the defender at that period. Thus, the stage-game is indeed an extensive-form game. In order to convert this game into its equivalent strategic-form game, it is sufficient to consider the action spaces as $A_1$ and $A_2^n$, where $A_2^n$ is the Cartesian product of $A_2$ with itself $n$ times. For example, if the defender can choose between $P_1$ and $P_2$, one of the possible actions for the attacker is $(CA, QT)$, which means choose $CA$ when the defender chooses $P_1$, and $QT$ when he chooses $P_2$. It is worth noting that a player's strategy for the repeated game is obtained from the functions $s_e$ and $\sigma_e$ stated in Definition 4, where a player chooses his action according to the history of events he knows.
The model of the stage game is completed by the players' payoff functions. The underlying notion of a puzzle-based defense is that the workload of the attacker should be higher than that of the defender [21]. In addition, the defender should care about the level of quality of service he provides for legitimate users. Therefore, an action profile is more preferable for the defender if it results in more cost to the attacker, less cost to the defender, and less cost to legitimate users. Similarly, an action profile is more desirable for the attacker if it causes more cost to the defender and less cost to the attacker. Hence, the players' stage-game payoffs are obtained from

$u_1(a) = \kappa(2, a) - \kappa(1, a) - w\,\kappa(u, a)$, and
$u_2(a) = \kappa(1, a) - \kappa(2, a),$   (3)
where $\kappa$ is the cost function defined by (2), $\kappa(u, a)$ is the cost to a legitimate user when the action profile $a$ is chosen, and $w \in [0, 1]$ is the level of quality of service the defender is willing to provide for legitimate users. As will be seen, a low quality of service is inevitable when the attacker enjoys high capabilities.
As stated in Definition 1, the function $\rho$ determines the set of resources an active entity owns or shares with the other entities. In studying a flooding attack-defense scenario, we confine our attention to those resources engaged in the scenario. According to (1), a resource that is not engaged in the scenario has no effect on the cost to the players. In the client-puzzle approach, the defender engages two types of resources, one for producing puzzles and verifying solutions, denoted by $r_p$, and the other for providing the requested service. The latter, denoted by $r_m$, is the main resource the defender wishes to protect against flooding attacks. Therefore, $\rho(1) = \{r_p, r_m\}$. Similarly, for the attacker, $\rho(2) = \{r_s\}$, where $r_s$ is the resource he uses to solve a puzzle. Finally, for a legitimate user, the active entity $u$, $\rho(u) = \{r_u\}$, in which $r_u$ is the resource he engages in solving a puzzle. Thus, by using (2), (3) is reduced to

$u_1(a) = -c_{r_p, a} - c_{r_m, a} + c_{r_s, a} - w\,c_{r_u, a}$, and
$u_2(a) = c_{r_p, a} + c_{r_m, a} - c_{r_s, a},$   (4)

where $c_{r, a} = D(a, r)/D^0_r$ for any $r \in R$ and $a \in A_1 \times A_2^n$.
As $A = A_1 \times A_2^n$, an arbitrary action profile is of the form $a = (a_1; (b_1, b_2, \ldots, b_n))$, where $a_1 \in A_1$ and $b_i \in A_2$ for $1 \le i \le n$. In this action profile, the attacker will play $b_i$ if the defender plays $P_i$. As $b_i \in \{QT, RA, CA\}$, there are only three classes of action profiles, listed in Table 1. The values $c_{r, a}$ are the same for the elements of each class. In puzzle-based approaches, the main resource is allocated when a requester returns the correct solution to the issued puzzle. The allocated amount of the main resource is released after a bounded time, say, $T$. This is one of the features distinguishing a flooding attack from a logic one. The reference distance of the main resource may be considered as the number of requests that can be served by the main resource in a time period of length $T$. The reference distance of the resource $r_p$ can also be defined as $T$. By such a choice, $c_{PP_i}$ and $c_{VP_i}$ would be the ratios of the times taken for producing a puzzle and verifying a puzzle solution of level $i$ to $T$. By adopting the same reference distance for $r_s$ and $r_u$, $c_{AP_i}$ would be the ratio of the time the attacker or a legitimate user spends on solving a puzzle of level $i$ to $T$. For any $i < j$, it is assumed that $c_{AP_i} < c_{AP_j}$, $c_{PP_i} \le c_{PP_j}$, and $c_{VP_i} \le c_{VP_j}$.
In the case of distributed attacks, a flooding attack-defense scenario can be modeled as a two-player game in which the attackers are modeled as a single player with the capabilities of the attack coalition. More precisely, if there exist $m$ machines in the attack coalition and the cost of solving a puzzle on a single machine is $c_{AP}$, the costs to the attacker and to a legitimate user in solving this puzzle are $c_{AP}/m$ and $c_{AP}$, respectively. Thus, the payoff functions in (4) can still be used if $c_{AP}$ and $w$ are replaced by $c_{AP}/m$ and $w \cdot m$.
Assume $A_1 = \{P_1, P_2\}$. In such a case, the stage-game is represented by the bimatrix shown in Fig. 1. The top element of a cell in this bimatrix is the defender's payoff, while the bottom one is the attacker's. These payoffs are obtained from (4) using the corresponding values in Table 1.
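For concreteness, these cell payoffs can be tabulated directly; the following sketch applies the payoff expressions given in the caption of Fig. 1, with the cost values that are later used for Fig. 3 (the values themselves are otherwise arbitrary):

    # Stage-game payoffs (alpha: defender, beta: attacker) for puzzle level i
    # and each attacker reply, following (4) and Table 1.
    c_m, c_PP, c_VP, w = 0.2, 0.01, 0.02, 0.5
    c_AP = {1: 0.15, 2: 0.23}
    for i in (1, 2):
        alpha = {"QT": -c_PP - w * c_AP[i],
                 "RA": -c_PP - c_VP - w * c_AP[i],
                 "CA": -c_m - c_PP - c_VP + (1 - w) * c_AP[i]}
        beta = {"QT": c_PP,
                "RA": c_PP + c_VP,
                "CA": c_m + c_PP + c_VP - c_AP[i]}
        print(f"P{i}:", alpha, beta)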
As stated in Section 3, the repeated game between the defender and the attacker is of discounted payoffs. Therefore, a discount factor $\delta \in (0, 1)$ is used as a weighting factor in the weighted sum of payoffs. More precisely, player $i$'s payoff for the repeated game, when the mixed strategy profile $\sigma = (\sigma_1; \sigma_2)$ is played, is defined by

$u_i(\sigma) = \sum_{j=0}^{\infty} \delta^j u_i(\sigma^j(h^j)) = \sum_{j=0}^{\infty} \delta^j u_i(\sigma_1^j(h_1^j); \sigma_2^j(h_2^j)).$   (5)
It is also more convenient to transform the repeated game payoffs to be on the same scale as the stage-game payoffs. This is done by multiplying the discounted payoff in (5) by $(1 - \delta)$. Thus, player $i$'s average discounted payoff for the repeated game is

$\bar{u}_i(\sigma) = (1 - \delta)\,u_i(\sigma).$   (6)
Another considerable issue in a repeated game is the feasibility of stage-game payoffs. As said, the space of pure action profiles is $A_1 \times A_2^n$. Therefore, the set $S = \{(u_1(a), u_2(a)) \mid a \in A\}$ contains all the stage-game payoff vectors supported by pure action profiles. If the defender and the attacker make their mixed stage-game strategies independently, the set of possible stage-game payoff vectors would be a subset of the convex hull of $S$, i.e., a subset of

$conv(S) = \{v \in \mathbb{R}^2 \mid \exists \mu_1, \mu_2, \ldots, \mu_K \in [0, 1] : \sum_{k=1}^{K} \mu_k = 1 \wedge v = \sum_{k=1}^{K} \mu_k v_k\},$   (7)

where $S = \{v_1, v_2, \ldots, v_K\}$. On the other hand, if the two players correlate their actions, any element of $conv(S)$ can be supported as a payoff vector. In doing so, the players can condition their actions on the value produced by a public
TABLE 1
Classes of Action Profiles $a$ and Their Corresponding Values $c_{r, a} = D(a, r)/D^0_r$ for a Resource $r$
Fig. 1. The stage-game of the client-puzzle approach, where the action spaces are $A_1 = \{P_1, P_2\}$ and $A_2^2 = \{QT, RA, CA\}^2$. The bimatrix is a $2 \times 9$ matrix, and the payoffs are obtained from $\alpha_{i1} = -c_{PP_i} - w\,c_{AP_i}$, $\alpha_{i2} = -c_{PP_i} - c_{VP_i} - w\,c_{AP_i}$, $\alpha_{i3} = -c_m - c_{PP_i} - c_{VP_i} + (1 - w)\,c_{AP_i}$, $\beta_{i1} = c_{PP_i}$, $\beta_{i2} = c_{PP_i} + c_{VP_i}$, and $\beta_{i3} = c_m + c_{PP_i} + c_{VP_i} - c_{AP_i}$ for $i = 1, 2$. Here, $c_m$ denotes the cost $c_{r_m, a}$ of serving one request using the main resource.
randomizing device. Indeed, the output of this device is a random variable $x$ with domain $[0, 1]$, and then an arbitrary payoff vector $v = \sum_{k=1}^{K} \mu_k v_k$ is supported by player $i$'s mixed strategy

$\sigma_i(x) = \sigma_i^1$ if $x \in [0, \mu_1)$, $\ldots$, $\sigma_i(x) = \sigma_i^k$ if $x \in [\sum_{l=1}^{k-1} \mu_l, \sum_{l=1}^{k} \mu_l)$, $\ldots$, $\sigma_i(x) = \sigma_i^K$ if $x \in [\sum_{l=1}^{K-1} \mu_l, 1],$   (8)

where $v_k = (u_1(a^k), u_2(a^k))$, $a^k = (a_1^k; a_2^k)$, and for any $k \in \{1, 2, \ldots, K\}$, $a_1^k \in A_1$ and $a_2^k \in A_2^n$. Hence, any payoff vector in the convex hull of $S$ is feasible under correlated actions. In the defense mechanisms proposed in this paper, the defender takes the role of the public randomizing device.
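A minimal sketch of how such a device can be realized in code; the weights and action profiles below are hypothetical:

    import random

    # Correlate actions on a public value x in [0, 1], as in (8).
    def correlated_profile(mu, profiles):
        x = random.random()       # the public value, announced to both players
        acc = 0.0
        for m, a in zip(mu, profiles):
            acc += m
            if x < acc:
                return a
        return profiles[-1]       # x falls in the last (closed) interval

    # e.g., play (P1; CA) with probability 0.2 and (P2; RA) with probability 0.8
    print(correlated_profile([0.2, 0.8], [("P1", "CA"), ("P2", "RA")]))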
5 DEFENSE STRATEGIES
This section employs the solution concepts of infinitely
repeated games with discounting to design the optimum
puzzle-based defense strategies against flooding attacks. In
general, the strategies prescribed by such solutions are
divided into two categories: history independent (open loop)
and history dependent (closed loop).
The defense strategies proposed in this section are based on the concept of Nash equilibrium. For ease of reference, this concept is repeated here. Let $(\sigma_1^*; \sigma_2^*)$ be a mixed-strategy Nash equilibrium for the two-player infinitely repeated game developed in Section 4. Then, $u_1(\sigma_1^*; \sigma_2^*) \ge u_1(\sigma_1; \sigma_2^*)$ for any $\sigma_1 \in \Sigma_1$, and $u_2(\sigma_1^*; \sigma_2^*) \ge u_2(\sigma_1^*; \sigma_2)$ for any $\sigma_2 \in \Sigma_2$, where $u_i$ is player $i$'s payoff function in (5). This means that any unilateral deviation from the strategy profile stated by the Nash equilibrium has no profit for its deviator.
The concept of Nash equilibrium is often used in a descriptive way, where it describes the players' optimum strategies in a game. In this sense, it makes predictions about the behaviors of rational players. In this section, on the contrary, the concept of Nash equilibrium is employed in a prescriptive way, in which the defender picks out a specific Nash equilibrium and takes his part in that profile. The attacker may know this, but the best thing for him to do is to be in conformity with the selected equilibrium. If he chooses another strategy, he gains less profit (the attacker's payoff function, defined in (5) and (6), reflects the attacker's profit from a flooding attack). In the defense mechanisms proposed in this section, the defender adopts the Nash equilibrium prescription that brings him the maximum possible repeated-game payoff while preventing the attack. In this way, the defense mechanism would be optimal.
5.1 Open-Loop Solutions
In an open-loop strategy, the action profiles adopted at previous periods are not involved in a player's decision at the current period. More formally, in the repeated game of the client-puzzle approach, $\sigma_i^t : H_i^t \to \Delta(A_i)$ is an open-loop strategy for player $i$ if $\forall t \in \mathbb{Z}_{\ge 0}\; \forall h_i^t, \tilde{h}_i^t \in H_i^t : \sigma_i^t(h_i^t) = \sigma_i^t(\tilde{h}_i^t)$, where $i = 1, 2$, and the action spaces are $A_1$ and $A_2^n$.
One of the open-loop solutions to an infinitely repeated game is to play any one of the stage-game Nash equilibria at a period regardless of what actually happened in the corresponding history. In other words, let $(\sigma_1; \sigma_2)$ be an open-loop strategy profile for the infinitely repeated game such that $\sigma_1^t(h_1^t) = \alpha_1^t$ and $\sigma_2^t(h_2^t) = \alpha_2^t$ for all histories $h_1^t \in H_1^t$ and $h_2^t \in H_2^t$. If $(\alpha_1^t; \alpha_2^t)$ is a stage-game Nash equilibrium for any $t$, then $(\sigma_1; \sigma_2)$ is a subgame-perfect equilibrium for the repeated game [31].
In a flooding attack-defense scenario, the defender may not perfectly know the actions taken by the attacker at previous periods. Thus, adopting an open-loop strategy, as stated above, may be the simplest way he can attain an equilibrium. The following theorem identifies the stage-game Nash equilibria for the game of the client-puzzle approach.
Theorem 1. Assume the client-puzzle approach uses $n$ puzzles $P_1, \ldots, P_n$ such that $c_{AP_1} < \ldots < c_{AP_k} < c_m < c_{AP_{k+1}} < \ldots < c_{AP_n}$. In addition, for any $1 \le i \le n$, assume $c_{PP_i} = c_{PP}$, $c_{VP_i} = c_{VP}$, and $|c_m - c_{AP_i}| > c_{VP}$. Then, for $0 < w < 1$, a stage-game Nash equilibrium is of the form 1) $(\gamma P_k \oplus (1 - \gamma) P_{k+1}; a_2)$ for $0 < \gamma < 1$, where $a_2 \in \{b \mid b \in A_2^n, b(k) = CA, b(k+1) = RA\}$ and $w = (c_m - c_{AP_k})/(c_{AP_{k+1}} - c_{AP_k})$; 2) $(P_k; a_2)$ for some $a_2 \in \{b \mid b \in A_2^n, b(k) = CA\}$, where $w > (c_m - c_{AP_k})/(c_{AP_{k+1}} - c_{AP_k})$; or 3) $(P_{k+1}; a_2)$ for some $a_2 \in \{b \mid b \in A_2^n, b(k+1) = RA\}$, where $w < (c_m - c_{AP_k})/(c_{AP_{k+1}} - c_{AP_k})$.

Proof. See Appendix A.
In Theorem 1, it is assumed that $c_{PP_i} = c_{PP}$ and $c_{VP_i} = c_{VP}$ for any $1 \le i \le n$. According to the current techniques used in producing puzzles and verifying solutions, this is a reasonable assumption [27]. It can also be shown that, for $w = 0$ and under the hypotheses in Theorem 1, $(a_1; a_2)$ is a stage-game Nash equilibrium if $a_1 \in \{P_{k+1}, \ldots, P_n\}$ and $a_2 \in \{b \mid b \in A_2^n, b(i) = RA \text{ for } k + 1 \le i \le n\}$. For $w = 1$, $(a_1; a_2)$ is a stage-game Nash equilibrium if $a_1 \in \{P_1, \ldots, P_k\}$ and $a_2 \in \{b \mid b \in A_2^n, b(i) = CA \text{ for } 1 \le i \le k\}$. Moreover, for every case in Theorem 1, $P_k$ and $P_{k+1}$ are the only puzzles involved in a stage-game Nash equilibrium. Thus, in designing a defense mechanism based on an open-loop solution, one can consider only two types of puzzles, a simple puzzle $P_1$ satisfying $c_{AP_1} < c_m$ and a difficult puzzle $P_2$ with $c_m < c_{AP_2}$.
As seen above, for any $w$ and for any set of puzzles satisfying the hypotheses in Theorem 1, stage-game Nash equilibria and, consequently, open-loop solutions to the repeated game of the client-puzzle approach always exist. It is evident that implementing the strategies prescribed by such solutions, e.g., a strategy that uses simple puzzles, does not necessarily prevent flooding attacks. Hence, in designing defense mechanisms against flooding attacks, a fairness function is considered, which determines whether a strategy preserves the defender's resources from being exhausted by the attacker's requests. Apparently, the strategy implemented by a defense mechanism should be fair. Note that adopting the set of puzzles is not part of the defender's strategy. At a period, the defender chooses the best puzzle, or the best combination of puzzles in the case of mixed strategies, from the set of given puzzles. In what follows, the fairness function is developed.
Assume that $(x; y)$ represents the class of stage-game pure strategies in which the defender chooses $x \in A_1 = \{P_1, P_2\}$, and the attacker responds to it by $y \in A_2 = \{QT, RA, CA\}$. For example, $(P_1; CA)$ represents the class of strategies $(P_1; b)$, where $b \in A_2^2$ and $b(1) = CA$. Then, in a strategy profile of the form
$\pi_{11}(P_1; QT) \oplus \pi_{12}(P_1; RA) \oplus \pi_{13}(P_1; CA) \oplus \pi_{21}(P_2; QT) \oplus \pi_{22}(P_2; RA) \oplus \pi_{23}(P_2; CA),$   (9)
$(P_i; QT)$, $(P_i; RA)$, and $(P_i; CA)$, $i = 1, 2$, are chosen with probabilities $\pi_{i1}$, $\pi_{i2}$, and $\pi_{i3}$, respectively. In the repeated game of the client-puzzle approach, the on-the-equilibrium path can be considered as an infinite sequence of strategy profiles of the form (9). Then, the average discounted on-the-equilibrium-path strategy profile is defined by $(1 - \delta) \sum_{j=0}^{\infty} \delta^j \sigma^j(h^j)$, where $\sigma^j(h^j)$ is the on-the-equilibrium-path strategy profile at period $j$ in the form of (9). On average, the players play this profile infinitely many times when they take their parts in the equilibrium.
As stated in Section 4, the reference distance of the resources $r_p$, $r_s$, and $r_u$ is considered as the maximum time $T$ the allocated amount of the main resource can be kept in use by a request. Moreover, the reference distance of the main resource is the number of requests that can be served by that resource in a period of length $T$. By adopting such reference distances, the attacker can solve $1/c_{AP}$ puzzles in a time of length $T$. The defender can either produce $1/c_{PP}$ puzzles or verify $1/c_{VP}$ puzzle solutions in this time, but he cannot do both of them in a single period, because the same resource is engaged in those actions. Finally, the defender can process $1/c_m$ requests using his main resource in such a period. By this modeling, a fair solution can be defined as follows, in which $N$ is the number of requests the attacker can produce in a time of length $T$. Note that only two puzzles are considered, $P_1$ as a simple puzzle and $P_2$ as a difficult one.
Definition 5. A strategy profile is a fair solution to the client-puzzle approach if the conditions in one of the following cases hold for its average discounted on-the-equilibrium-path strategy profile in (9).

Case 1 $\bigl(N \sum_{i=1}^{2} \pi_{i3} c_{AP_i} \le 1\bigr)$:

$N \sum_{i=1}^{2} \pi_{i3} c_m \le 1$, and

$N \Bigl( \sum_{i=1}^{2} \pi_{i1} c_{PP} + \sum_{i=1}^{2} (\pi_{i2} + \pi_{i3})(c_{PP} + c_{VP}) \Bigr) \le 1$.

Case 2 $\bigl(N \sum_{i=1}^{2} \pi_{i3} c_{AP_i} > 1\bigr)$:

$\sum_{i=1}^{2} \Bigl( \pi_{i3} \Big/ \sum_{j=1}^{2} \pi_{j3} \Bigr) (c_m / c_{AP_i}) \le 1$, and

$N \Bigl( \sum_{i=1}^{2} (\pi_{i1} + \pi_{i3}) c_{PP} + \sum_{i=1}^{2} \pi_{i2} (c_{PP} + c_{VP}) \Bigr) + \sum_{i=1}^{2} \Bigl( \pi_{i3} \Big/ \sum_{j=1}^{2} \pi_{j3} \Bigr) (c_{VP} / c_{AP_i}) \le 1$.
In Case 1, the attacker can solve all the puzzles that should be correctly answered. The number of such puzzles is $N(\pi_{13} + \pi_{23})$. In order to prevent the main resource from being exhausted by the attacker's requests, this number should be less than or equal to the number of requests that can be served by the main resource in a time of length $T$, i.e., less than $1/c_m$. This is the first condition in Case 1. Similarly, the second condition in Case 1 protects the resource $r_p$ used in producing puzzles and verifying solutions. In Case 2, the attacker cannot solve all the puzzles that should be answered by correct solutions. In this case, using the entire resource $r_s$, he can solve $(\pi_{13}/(\pi_{13} + \pi_{23})) \cdot 1/c_{AP_1}$ simple puzzles and $(\pi_{23}/(\pi_{13} + \pi_{23})) \cdot 1/c_{AP_2}$ difficult puzzles on average. Again, the first and second conditions are to preserve the main resource and the resource used in producing puzzles and verifying solutions from being exhausted by the attacker's requests.
Now, consider an open-loop solution of type 1 in Theorem 1. The average discounted on-the-equilibrium-path strategy profile for such a solution is $\bar{\sigma} = \gamma (P_1; CA) \oplus (1 - \gamma)(P_2; RA)$. Through comparing this with (9), we have $\pi_{13} = \gamma$, $\pi_{22} = 1 - \gamma$, and $\pi_{ij} = 0$ otherwise. Therefore, an open-loop solution of type 1 is fair if $N \gamma c_{AP_1} \le 1$, $N \gamma c_m \le 1$, and $N(c_{PP} + c_{VP}) \le 1$ (Case 1), or if $N \gamma c_{AP_1} > 1$, $c_m / c_{AP_1} \le 1$, and $N(c_{PP} + (1 - \gamma) c_{VP}) + c_{VP} / c_{AP_1} \le 1$ (Case 2). It is evident that the conditions in Case 2 are not satisfied (as $c_{AP_1} < c_m$), while the ones in Case 1 hold if $\gamma \le 1/(N c_m)$ and $N(c_{PP} + c_{VP}) \le 1$. Similarly, a solution of type 2 is fair if $N c_m \le 1$, a condition that can rarely be satisfied in actual cases. Finally, a solution of type 3 is fair if $N(c_{PP} + c_{VP}) \le 1$, a condition that can usually be satisfied. Intuitively, a designer prefers a fair solution that uses simple puzzles. Hence, the best open-loop fair solution to the client-puzzle approach is the one employing the strategies of type 1.
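A minimal check of these fairness conditions for the type-1 profile; $N$ and the costs below are assumed values, with $\gamma = 1/(N c_m)$ as chosen in PDM1:

    # Definition 5, Case 1, specialized to gamma(P1; CA) + (1 - gamma)(P2; RA).
    def type1_is_fair(N, gamma, c_m, c_AP1, c_PP, c_VP):
        return (N * gamma * c_AP1 <= 1       # Case 1 applies: all CA puzzles solvable
                and N * gamma * c_m <= 1     # the main resource is preserved
                and N * (c_PP + c_VP) <= 1)  # production/verification is preserved

    N, c_m = 100, 0.2
    print(type1_is_fair(N, 1 / (N * c_m), c_m, c_AP1=0.15, c_PP=0.001, c_VP=0.002))  # True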
Fig. 2 shows a puzzle-based defense mechanism (PDM1)
against flooding attacks derived from the open-loop
solution concept of discounted infinitely repeated games.
In this mechanism, the random number generator and the
puzzles are designed as follows:
Step 1. For a given desirability factor of quality of service $0 < w < 1$, choose puzzles $P_1$ and $P_2$ in such a way that

$c_{AP_1} < c_m < c_{AP_2}$, $c_{VP} < c_m - c_{AP_1}$, $c_{VP} < c_{AP_2} - c_m$, and $w = (c_m - c_{AP_1})/(c_{AP_2} - c_{AP_1})$.

Step 2. Choose $\gamma$ such that $\gamma \le 1/(N c_m)$, where $N$ is the number of requests an attacker can send in the time the defender requires to process a request using his main resource, i.e., in $T$ (it is assumed that $N(c_{PP} + c_{VP}) \le 1$).
Fig. 2. PDM1: the puzzle-based defense mechanism against flooding attacks derived from the open-loop solution concept of discounted infinitely repeated games.
Step 3. Create a random number generator that produces a random variable $x$ with $\Pr(x = 0) = \gamma$ and $\Pr(x = 1) = 1 - \gamma$.
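The dispatch logic these three steps prescribe is tiny; a sketch of the PDM1 core (the puzzle identifiers are placeholders):

    import random

    # PDM1 core: issue the simple puzzle P1 with probability gamma (a correct
    # answer is prescribed) and the difficult puzzle P2 with probability 1 - gamma.
    def pdm1_issue(gamma):
        x = 0 if random.random() < gamma else 1   # Step 3
        return "P1" if x == 0 else "P2"

    # The main resource is then allocated only upon a correct answer to the issued puzzle.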
Note that Fig. 2 only shows the core of a puzzle-based
defense mechanism, which chooses optimal difficulty
levels. The other components can be the same as the ones
in known mechanisms, e.g., [27].
There are two other noteworthy issues in PDM1. In the case of distributed attacks, if a single machine can produce $N$ requests and there exist $m$ machines in the attack coalition, the attacker is modeled as one capable of producing $mN$ requests. A designer then runs Steps 1-3 stated above using $c_{AP}/m$ and $w \cdot m$ instead of $c_{AP}$ and $w$. More precisely, for a given desirability factor of quality of service $w$, the puzzles should satisfy $c_{AP_1}/m < c_m < c_{AP_2}/m$ and $w \cdot m = (c_m - c_{AP_1}/m)/(c_{AP_2}/m - c_{AP_1}/m)$. Moreover, if the defender should be able to process $1/c_m$ legitimate requests by his main resource, the defense mechanism considers the amount of the main resource in such a way that the defender can process $2/c_m$ requests, a half for the defense and the other half for legitimate users. This extra resource may be allocated dynamically when the defender is under attack, i.e., when the number of requests is more than the one assumed for the legitimate users.
5.2 Closed-Loop Solutions
In a fair open-loop solution, the defender's maximum average payoff is $-(c_{PP} + c_{VP} + w\,c_{AP_2})$. However, there are many payoff vectors in the convex hull with greater payoffs for the defender. Thus, here, a natural question arises: Is there a better fair solution to the game, which results in a greater payoff to the defender? As proven in [32], in games of perfect information, there is a large subset of the convex hull whose payoff vectors can be supported by perfect Nash equilibria provided that suitable closed-loop strategies are adopted. This subset is denoted by $V^*$, and its elements are called strictly individually rational payoffs (SIRP). In the game of the client-puzzle approach,

$V^* = \{(v_1, v_2) \in conv(S) \mid v_1 > v_1^*,\; v_2 > v_2^*\},$

where the payoff vectors in $S$ are taken over $A = A_1 \times A_2^n$, and $(v_1^*, v_2^*)$ is the minmax point defined by

$v_1^* = \min_{\alpha_2 \in \Delta(A_2^n)} \max_{\sigma_1 \in \Delta(A_1)} u_1(\sigma_1; \alpha_2)$,
$v_2^* = \min_{\sigma_1 \in \Delta(A_1)} \max_{\sigma_2 \in \Delta(A_2^n)} u_2(\sigma_1; \sigma_2)$,

in which $\Delta(X)$ is the set of all probability distributions over $X$. Furthermore, the mixed strategy profiles resulting in $v_1^*$ and $v_2^*$ are denoted by $M^1 = (M^1_1; M^1_2)$ and $M^2 = (M^2_1; M^2_2)$, respectively. The strategy $M^2_1$ is player 1's minmax strategy against player 2. Similarly, $M^1_2$ is player 2's minmax strategy against player 1.
Fig. 3 shows the convex hull of payoff vectors for the game of the client-puzzle approach when $c_m = 0.2$, $c_{AP_1} = 0.15$, $c_{AP_2} = 0.23$, $c_{PP} = 0.01$, $c_{VP} = 0.02$, and $w = 0.5$. As seen in Fig. 3, the defender's maximum average payoff in PDM1, i.e., $-(c_{PP} + c_{VP} + w\,c_{AP_2})$, is $-0.145$, though many payoffs greater than $-0.145$ can be supported if the game is of perfect information and suitable closed-loop strategies are adopted.
The following theorem characterizes the set of payoff vectors that can be supported by perfect Nash equilibria in an infinitely repeated game of observable actions and complete information where the payoffs are discounted. This reflects those attack-defense circumstances in which the player involved in the defense mechanism knows his opponent's payoff function as well as the actions chosen by his opponent at previous periods. It is worth noting that the puzzles can be designed in such a way that the amount of resources a machine uses to solve a puzzle is independent of the machine's processing power [33]. Therefore, except for flooding attacks from an unknown number of sources, it is reasonable to assume that the defender knows the attacker's payoff function.
Theorem 2 (Fudenberg/Maskin [32]). For any payoff vector in the set of SIRP, i.e., $(v_1, v_2) \in V^*$, there is $\delta' \in (0, 1)$ such that for all $\delta \in (\delta', 1)$, there exists a subgame perfect equilibrium of the infinitely repeated game in which player $i$'s payoff is $v_i$ when the players have the discount factor $\delta$.
Fudenberg and Maskin also defined the equilibrium strategies resulting in the payoff vectors stated in Theorem 2. In the game of the client-puzzle approach, these strategies are obtained as follows: Take

$\bar{v}_i = \max_{(a_1 \in A_1,\, a_2 \in A_2^n)} u_i(a_1; a_2).$

For $(v_1, v_2) \in V^*$, choose $t'$ and $\delta'$ such that, for $i = 1, 2$,

$v_i > (1 - \delta') \bar{v}_i + \delta' \tilde{v}_i,$   (10)

where

$\tilde{v}_i = (1 - \delta'^{t'})\, u_i(M^2_1; M^1_2) + \delta'^{t'} v_i,$ with $\tilde{v}_i > v_i^*.$   (11)

There are $\delta'$ and $t'$ satisfying (10) and (11), i.e.,

$v_i^* < (1 - \delta'^{t'}) \bar{u}_i + \delta'^{t'} v_i < (v_i - \bar{v}_i (1 - \delta')) / \delta',$

where $\bar{u}_i = u_i(M^2_1; M^1_2)$. This necessitates

$\delta' > (\bar{v}_i - v_i) / (\bar{v}_i - v_i^*),$   (12)
Fig. 3. The convex hull of payoff vectors and SIRP in the game of the client-puzzle approach when $c_m = 0.2$, $c_{AP_1} = 0.15$, $c_{AP_2} = 0.23$, $c_{PP} = 0.01$, $c_{VP} = 0.02$, and $w = 0.5$.
and for $\delta'$ satisfying (12), $t'$ is obtained from

$(v_i^* - \bar{u}_i)/(v_i - \bar{u}_i) < \delta'^{t'} < (v_i - \bar{v}_i + \delta'(\bar{v}_i - \bar{u}_i)) / (\delta'(v_i - \bar{u}_i)),$   (13)

subject to $t'$ being a positive integer. Clearly, for any $\delta \ge \delta'$, there exists a corresponding $t(\delta)$ such that (12) and (13) hold for $(\delta, t(\delta))$.
Let $(\alpha_1^*; \alpha_2^*)$ be a correlated stage-game strategy profile in (8) for which $u_i(\alpha_1^*; \alpha_2^*) = v_i$ for $i = 1, 2$. Then, the following players' strategies, $i = 1, 2$, make a perfect Nash equilibrium for the repeated game:

1. Play $\alpha_i^*$ at each period as long as $(\alpha_1^*; \alpha_2^*)$ was played at the last period. After any deviation from Phase 1, we have:

2. Play $M^j_i$, $j \ne i$, $t(\delta)$ times and then start again with Phase 1. If there is any deviation while in Phase 2, then begin Phase 2 again.
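The two phases amount to a small state machine on the defender's side. A sketch follows, under the assumption that deviations are perfectly observed; the action labels are placeholders:

    # Defender side of the two-phase equilibrium strategy above.
    class TwoPhaseDefender:
        def __init__(self, t_punish):
            self.t_punish = t_punish   # t(delta) punishment periods from Theorem 2
            self.remaining = 0         # 0 means Phase 1

        def next_action(self, deviation_observed):
            if deviation_observed:
                self.remaining = self.t_punish   # (re)start Phase 2
            if self.remaining > 0:
                self.remaining -= 1
                return "M21"      # minmax the attacker (issue the difficult puzzle)
            return "alpha1"       # Phase 1: the defender's part of the target profile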
Fig. 4 shows the convex hull, minmax point, and SIRP for the game of the client-puzzle approach when $A_1 = \{P_1, P_2\}$, $c_m = 0.2$, $c_{AP_1} = 0.01$, $c_{AP_2} = 0.3$, $c_{PP} = 0.003$, $c_{VP} = 0.006$, and $w = 0.5$. As seen, $P_1$ is very simple. Moreover, $(v_1^*, v_2^*) = (-0.159, 0.009)$, $\bar{u} = v^*$, and $\bar{v} = (\bar{v}_1, \bar{v}_2) = (-0.008, 0.199)$. Therefore, (12) and (13) are reduced to

$\delta' > (-0.008 - v_1)/0.151$, $\delta' > (0.199 - v_2)/0.190$,
$0 < \delta'^{t'} < (v_1 + 0.008 + 0.151\,\delta') / (\delta'(v_1 + 0.159))$, and
$0 < \delta'^{t'} < (v_2 - 0.199 + 0.190\,\delta') / (\delta'(v_2 - 0.009))$.
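The punishment lengths reported below for Fig. 5 follow from (13) by taking the smallest positive integer $t'$ with $\delta^{t'}$ below both players' upper bounds. A sketch using the Fig. 4 values of $\bar{v}$ and $\bar{u}$:

    import math

    # v_bar and u_bar are the values computed above for the game of Fig. 4.
    def punishment_length(delta, v, v_bar=(-0.008, 0.199), u_bar=(-0.159, 0.009)):
        t = 1
        for vi, vbi, ubi in zip(v, v_bar, u_bar):
            bound = (vi - vbi + delta * (vbi - ubi)) / (delta * (vi - ubi))
            t = max(t, math.floor(math.log(bound) / math.log(delta)) + 1)
        return t

    for v in [(-0.0197, 0.0147), (-0.0216, 0.0166), (-0.0235, 0.0185)]:
        print(punishment_length(0.999, v))   # 33, 25, 20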
Fig. 5 shows the number of required punishment periods in Phase 2 as a function of the discount factor and the equilibrium payoff vector. As seen, the number of times a defender should punish an attacker increases with his payoff and decreases with the discount factor. For discount factors near unity and the payoffs $(-0.0197, 0.0147)$, $(-0.0216, 0.0166)$, and $(-0.0235, 0.0185)$, the number of required punishment periods is 33, 25, and 20, respectively.
The above results are valid when a player knows his opponent's decisions at previous periods, i.e., when the game is of perfect information. Suppose that the defender should condition his action at the current period on the answer to "Has the attacker correctly solved the difficult puzzle of the last period?" To obtain the answer, the defender should wait for a relatively long time. Thus, during this period, he receives a large number of new requests for which he cannot decide on suitable actions. As seen in Fig. 4, the strategy profiles leading to a payoff vector $(v_1, v_2)$ considered in Fig. 5 and to $u(M^2_1; M^1_2)$ are $\gamma (P_1; CA) \oplus (1 - \gamma)(P_1; RA)$ and $(P_2; RA)$, respectively. As the attacker can quickly produce a random answer or solve a very simple puzzle, the defender will know the decision taken by the attacker very soon. This helps the defender compensate for wrong decisions resulting from his imperfect information. The following theorem states a sufficient condition under which the best thing for the attacker to do in the punishment phase is to answer the puzzles randomly.
Theorem 3. In the game of the client-puzzle approach, assume the puzzles are of two difficulty levels and satisfy the conditions stated in Theorem 1. Then, $(M^2_1; M^1_2) \in (P_2; RA)$ if $w < (c_m - c_{AP_1})/(c_{AP_2} - c_{AP_1})$.

Proof. See Appendix B.
As $u_2(P_2; RA) \ge u_2(P_2; a_2)$ for any $a_2 \in A_2^2$, choosing the punishment strategy $M^2_1 = P_2$ enforces conformity by the attacker in Phase 2 even if the defender does not punish the attacker's deviations in that phase. This is a direct result of the one-stage-deviation principle [31]: the attacker cannot profit by deviating from his prescribed equilibrium strategy in Phase 2 at a single period and then returning to that strategy for the remaining periods.
Assume the defender receives at most one request before discerning the decision made by the attacker at the previous period. As $P_1$ is very simple (it can be sending one specific bit to the defender), this is a reasonable assumption. In
Fig. 4. The convex hull and the set of SIRP in the client-puzzle approach when the puzzles are of two difficulty levels, $c_m = 0.2$, $c_{AP_1} = 0.01$, $c_{AP_2} = 0.3$, $c_{PP} = 0.003$, $c_{VP} = 0.006$, and $w = 0.5$. The minmax point is $(-0.159, 0.009)$.
Fig. 5. The number of required punishment periods $t$ as a function of the discount factor $\delta$ and the equilibrium payoff vector $(v_1, v_2)$ for the game in Fig. 4.
addition, each full-length punishment phase in Theorem 2 removes the attacker's profit from a single deviation in Phase 1. Thus, the defender can adopt the following closed-loop strategy. Upon receiving the first request, he produces and sends $P_1$. When the second request is received, the defender checks whether he knows the decision made by the attacker at the first period. If the defender discerns any deviation in the first period, he runs Phase 2 of Theorem 2. Otherwise, he issues $P_1$ again. If the defense mechanism is in Phase 1 and the defender receives the third request, he checks the attacker's decisions at the previous two periods. Now, he certainly knows the attacker's decision at the first period, but he may not know the attacker's decision at the second period. If only one deviation is discerned in the previous two periods, the defender runs Phase 2. If two deviations are discerned, he runs the action prescribed by Phase 2 twice the number of times stated in Theorem 2. Otherwise, he goes on with Phase 1. If the defense mechanism is in Phase 2 and a deviation concerning Phase 1 is discerned, a full-length Phase 2 is considered at the end of the current punishment phase. When the defense mechanism finishes the punishment phase, it returns to Phase 1, and then repeats the actions stated above. In this way, the attacker gains nothing by deviating from the said closed-loop strategy. In other words, this strategy makes an equilibrium. The defense mechanism derived from the above game-theoretic approach (PDM2) is shown in Fig. 6.
Note that the random number generator and the puzzles used in PDM2 are derived as follows:

Step 1. For a given desirability factor of quality of service $0 < w < 1$, choose puzzles $P_1$ and $P_2$ in such a way that

$c_{AP_1} < 1/N < c_m < c_{AP_2}$, $c_{VP} < c_m - c_{AP_1}$, $c_{VP} < c_{AP_2} - c_m$, and $w < (c_m - c_{AP_1})/(c_{AP_2} - c_{AP_1})$,

where $N$ is the number of requests an attacker can send in the time the defender requires to process a request using his main resource. The requirement $c_{AP_1} < 1/N$ states that $P_1$ is so simple that the attacker can solve it in a time less than the time he needs to produce a request.

Step 2. Choose $\gamma$ in such a way that the equilibrium strategy profile $\gamma (P_1; CA) \oplus (1 - \gamma)(P_1; RA)$ is fair. This necessitates $\gamma \le 1/(N c_m)$. It is also assumed that $N(c_{PP} + c_{VP}) \le 1$.

Step 3. Create a random number generator that produces a random variable $x$ with $\Pr(x = 0) = \gamma$ and $\Pr(x = 1) = 1 - \gamma$.
Since a legitimate user always solves the puzzles correctly, his action in Phase 1 may be considered a deviation. To avoid this, one can amend PDM2 to ignore a single deviation in a time period of length $\theta T c_m$ and, collectively, $1/c_m$ deviations in a time period of length $T$, where $T$ is the time the defender requires to process a request using his main resource. The parameter $\theta$ is adjusted so that the defense mechanism remains in Phase 1 with a great probability when there is no attack. For example, it can easily be shown that if $c_m = 0.2$ and $\theta = 0.5$, and the number of requests produced by legitimate users in a time period of length $T$ has a Poisson distribution with parameter 2.5, the defense mechanism remains in Phase 1 with a probability greater than 0.95 when there is no attack. Such an amendment can be implemented by the Check Deviation blocks
Fig. 6. PDM2: the defense mechanism against flooding attacks derived from the closed-loop solution concept of discounted infinitely repeated games. (a) The defense mechanism, where $t$ is the number of punishment periods obtained from Theorem 2, $n$ is the number of remaining punishment periods, and $x$ is the value of the random variable produced by the defender's random number generator and sent to the requester, i.e., the value of the public randomizing device. In addition, $t^*$ is the maximum time the defender waits for the attacker's response. (b) The structure of Check Deviation A in (a), where $CA_j$ and $WA_j$ stand for a correct answer and a wrong answer to the puzzle numbered $j$. A correct answer here is the one prescribed by the equilibrium strategy. Check Deviation B has the same structure as in (b), but without the loop of Check Deviation A.
in Fig. 6. As stated in PDM1, in a time of length T, the
defender should be able to process double the number of
legitimate users requests at that period using his main
resource. Hence, this amendment does not menace the
fairness of PDM2.
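The 0.95 figure in the example above can be checked directly. A minimal computation, assuming the amended mechanism tolerates $1/c_n = 5$ deviations per window of length $\beta T$:

from math import exp, factorial

def poisson_cdf(k, lam):
    # Pr(X <= k) for X ~ Poisson(lam)
    return sum(exp(-lam) * lam**n / factorial(n) for n in range(k + 1))

# c_n = 0.2 allows 1/c_n = 5 tolerated deviations per window, and the
# legitimate load in such a window is Poisson with parameter 2.5.
print(poisson_cdf(5, 2.5))   # ~0.958 > 0.95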
5.3 Considerations for Distributed Attacks
PDM1 treats a distributed attack as a single-source attack,
where the attackers are modeled as a single attacker with the
capabilities of the corresponding attack coalition. The same
approach can be adopted for closed-loop solutions, but some
further issues should be considered there. In a distributed
attack, the requests come from different machines, and it is no longer reasonable to assume that the defender receives only a small number of requests before receiving the correct or random answer to an issued puzzle. Indeed, a large number of requests are produced by the attack coalition, whereas only a small proportion of them comes from a single machine. Therefore, in
the time a machine is involved in computing the answer, the
defender may receive a large number of requests from the
other machines in the coalition.
Imitating PDM2, a possible solution to this problem may be to postpone the transition from the normal to the punishment phase for a time during which the defender can certainly discern the decisions made by the attacker. This time is called the reprieve period. In a distributed attack, the defender may receive a large number of requests during the reprieve period. Thus, if he uses simple puzzles in this period, the attacker solves them and performs an intense attack. To avoid this, the normal phase strategy profile $\gamma(P_1; CA) \oplus (1-\gamma)(P_1; RA)$ can be replaced by $\theta(P_2; RA) \oplus (1-\theta)\left[\gamma(P_1; CA) \oplus (1-\gamma)(P_1; RA)\right]$, in which some difficult puzzles are used in the normal phase. If the defender should wait for $m$ requests before discerning a possible deviation, i.e., playing his part in the above strategy profile for $m$ periods, the fairness condition implies that $1 - \theta \le 1/(m c_n)$, or $\theta \ge 1 - 1/(m c_n)$. Clearly, $\theta = 0$ if $m \le 1/c_n$. Note that the length of the reprieve period $m$ is obtained from an increasing function $F(n)$, where $n$ is the size of the attack coalition. Therefore, the following defense mechanism (PDM3) is proposed against distributed attacks, in which it is assumed that the defender should wait for a duration consisting of at most $m$ requests before discerning a possible deviation.
PDM3 (known coalition size). Upon receiving a request in Phase 1, the defender runs a random number generator producing the random variable $x$ with $\Pr(x=0) = (1-\gamma)(1-\theta)$, $\Pr(x=1) = \gamma(1-\theta)$, and $\Pr(x=2) = \theta$. Then, he produces the puzzle according to the value of $x$ and sends the puzzle and the value of $x$ to the requester (the value of $x$ is considered as the output of the public randomizing device). As in PDM2, the defender considers the action taken by the requester as a deviation if he receives no response in the maximum waiting time calculated on the basis of the coalition size, or if he receives a response that is not in conformity with the equilibrium prescription. If $k \ge 1$ deviations are discerned when the defense mechanism is in Phase 1, it goes to the punishment phase with a length of $k$ times the length identified in Theorem 2. If it is in the punishment phase and a deviation of Phase 1 is discerned, a full-length Phase 2 is added to the current punishment phase. When the defense mechanism finishes the punishment phase, it returns to Phase 1, and then, it is repeated as above.
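The three-valued public randomizing device of PDM3 is easy to realize. The sketch below is an illustration with made-up $\gamma$ and $\theta$, mapping $x$ to the puzzle and the prescribed answer:

import random

def draw_public_value(gamma, theta):
    # Sample x with Pr(x=2) = theta, Pr(x=1) = gamma*(1-theta),
    # and Pr(x=0) = (1-gamma)*(1-theta).
    u = random.random()
    if u < theta:
        return 2
    return 1 if u < theta + gamma * (1 - theta) else 0

# x = 2 -> issue P2 (prescribed answer RA); x = 1 -> P1 with CA;
# x = 0 -> P1 with RA.
for _ in range(3):
    x = draw_public_value(gamma=0.1, theta=0.6)
    puzzle = "P2" if x == 2 else "P1"
    prescribed = "CA" if x == 1 else "RA"
    print(x, puzzle, prescribed)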
The final remark is about estimating the size of the attack coalition. As seen, a difficult puzzle is designed in such a way that $c_{sP_2} = c^*_{sP_2}/n \ge c_n$, where $c^*_{sP_2}$ is the cost of solving the difficult puzzle on a single machine, and $n$ is the number of machines in the attack coalition. Therefore, if one assumes a fixed coalition size, say, the largest possible one, he may unnecessarily choose a very difficult puzzle for the punishment phase that imposes a low quality of service on legitimate users. Hence, some procedure should be adopted to estimate the size of the attack coalition. More precisely, in this case, the game would be of incomplete information, i.e., a Bayesian game, in which a player does not completely know his opponent's payoff function, here, the value of $c_{sP_2} = c^*_{sP_2}/n$. In a repeated Bayesian game, a player gradually learns his opponent's payoff function through examining the actions taken by his opponent at previous periods. Although there are some complicated models of infinitely repeated games that identify the equilibrium strategies in the case of incomplete information, e.g., [34], [35], and [36], the following approach is adopted here.
The defender has an estimation of the minimum number of requests $\hat{N}_{min}$ that a single machine can send in the time $T$ the defender requires to process a request using his main resource. Then, he makes an estimation of the coalition size $\hat{n}$ as less than or equal to the number of requests received in a time interval of length $T/\hat{N}_{min}$. Note that the difficulty level of the puzzle used in the punishment phase is obtained from $c^*_{sP_2} \ge \hat{n} c_n$, and so, if the defender overestimates the size of the attack coalition, he uses a puzzle more difficult than the one required for the actual size. Thus, the defense mechanism acts safely for smaller coalitions. Furthermore, the parameter $m$ is calculated on the basis of $\hat{n}$, and thus, $\theta$ is chosen in such a way that the existence of the reprieve period does not lead to the exhaustion of the defender's resources.
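As a sketch of this estimation (the timestamps, $T$, and $\hat{N}_{min}$ below are invented for illustration), $\hat{n}$ is simply a request count over a short window:

def estimate_coalition_size(request_times, T, N_min_hat):
    # Count the requests observed in a window of length T / N_min_hat;
    # each machine is assumed to contribute at least one request there.
    window = T / N_min_hat
    start = request_times[0]
    return sum(1 for t in request_times if t < start + window)

# Example: T = 1.0 and each machine sends at least 5 requests per T.
times = [0.0, 0.02, 0.05, 0.11, 0.15, 0.19, 0.31, 0.52]
print(estimate_coalition_size(times, 1.0, 5))   # 6 requests fall in [0, 0.2)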
The above argument is not valid if the attacker does not apply his maximum power throughout the attack. For example, the attack coalition may send a small number of requests to deceive the defender in his estimation of the coalition size and then benefit from solving the ineffective difficult puzzles designed, on the basis of the wrong estimate, for the punishment phase. To resolve this, the following approach is adopted, in which, through a fair learning process, the defender obtains an effective estimation of the coalition size that leads to an equilibrium in the infinitely repeated game of the client-puzzle approach.
Assume the actual coalition size is $n^*$, and the defender presumes $q+1$ different coalition sizes $\hat{n}_0 < \hat{n}_1 < \cdots < \hat{n}_q = \hat{n}_F$ in his learning process, where $\hat{n}_0 = 1$ and $\hat{n}_F$ is the size of the largest possible attack coalition (the defense mechanism is effective when the size of the attack coalition is less than $\hat{n}_F$).
From the above arguments, a defense mechanism is immediately found if the defender knows the size of the attack coalition: he finds $c_{sP_1}$, $c_{sP_2}$, the length $t$ of Phase 2 in Theorem 2, $m$, $\theta$, and $\gamma$, and then follows PDM3 to attain an equilibrium payoff vector $(v_1, v_2)$. If the defender does not know the coalition size, he runs PDM4.
PDM4 (unknown coalition size)
Step 1. Put $i = 0$, $j = 0$, $\tilde{t} = 0$, and $\hat{n} = \hat{n}_0$. Set the elements in $S = (c_{sP_1}, c_{sP_2}, t, m, \theta, \gamma, n_p)$ for PDM3 on the basis of the coalition size $\hat{n}$, where $n_p$ is the number of remaining punishment periods and is set to 0 in this step.
Step 2. Run PDM3 according to $S$. If a bad event $e \in \{e_1, e_2\}$ occurs, save $S$, and go to Step 3. The event $e_1$ occurs when the number of received requests shows a coalition size larger than $\hat{n}$.
The event $e_2$ occurs when a deviation from the action profile prescribed for the punishment phase is discerned. Note that $e_1$ can occur in both the normal and the punishment phases, while $e_2$ can occur only in the punishment phase. If the defense mechanism remains in this step for a long time, say, $T' = lT$, and the number of requests received during this period is less than $l/c_n$, go to Step 1. This resumes the protocol with simple puzzles when the attack terminates. (Note that the defense mechanism usually employs nonces to guarantee the freshness of received messages. Therefore, if the attacker saves his solutions to the puzzles and sends them after a long time, the defender discards them.)
Step 3. If $e = e_1$, find the smallest value $1 \le i \le q$ for which the new estimate of the coalition size $\tilde{n}$ satisfies $\tilde{n} \le \hat{n}_i$ ($\tilde{n}$ is the number of requests received in the time interval of length $T/\hat{N}_{min}$), and set $j = i$. Otherwise, set $j = j + 1$. Then, put $\hat{n} = \hat{n}_j$ and $n_p = n_p + \tilde{t}$, and obtain the new $\tilde{t}$ using (15) with the belief that the actual coalition size is $\hat{n}$. Adjust $S$ on the basis of $\hat{n}$ and $\tilde{t}$, and go to Step 2. The adjustment of $S$ is straightforward, except for the length of the remaining punishment periods, which is done as follows: The new estimate $\hat{n}$ reveals a new length of punishment periods in Theorem 2, say, $t'$. The number of remaining punishment periods is then readjusted as $n'_p = (t'/t)\, n_p + \tilde{t}$.
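The escalation in Step 3 can be pictured with the compressed sketch below; the presumed sizes, the Theorem 2 length rule t_of, and the value coming from (15) are all placeholders of the illustration:

def pdm4_escalate(j, n_hat_sizes, n_tilde, n_p, t, t_tilde_new, t_of):
    # One pass of Step 3 (toy form). n_tilde is the new size estimate on
    # event e1, or None on event e2; t_of maps a presumed size to the
    # Phase 2 length of Theorem 2.
    if n_tilde is not None:      # e1: jump to the smallest adequate size
        j = next(i for i in range(1, len(n_hat_sizes))
                 if n_tilde <= n_hat_sizes[i])
    else:                        # e2: move one presumed size up
        j += 1
    n_hat = n_hat_sizes[j]
    t_new = t_of(n_hat)
    # Rescale the remaining punishment and add the extra periods from (15).
    n_p = (t_new / t) * n_p + t_tilde_new
    return j, n_hat, n_p, t_new

# Example: presumed sizes 1 < 10 < 100 < 1000 and a linear length rule.
print(pdm4_escalate(0, [1, 10, 100, 1000], 42, n_p=3, t=4,
                    t_tilde_new=2, t_of=lambda n: 4 * n))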
PDM4 starts with the normal phase when the defender believes that the coalition size is $\hat{n}_0$, and he checks this belief continually. If the belief is correct, the defense mechanism goes on as it was initiated, i.e., with the parameters adjusted for the coalition size $\hat{n}_0$. Otherwise, it goes to the punishment phase with the belief that the coalition size is $\hat{n}_j$, and the parameters are readjusted for this new coalition size. Moreover, a number of extra punishment periods $\tilde{t}$ is considered to remove the benefit obtained by the attacker during the learning process. This procedure continues until arriving at a firm decision about the coalition size, where the parameters of the defense mechanism are certainly known.
The learning process lasts for $F(\hat{n}_{i_0}) + F(\hat{n}_{i_1}) + \cdots + F(\hat{n}_{i_z})$ periods, where $i_0, i_1, \ldots,$ and $i_z$ are the values the variable $j$ takes before determining the actual coalition size $n^*$ (evidently, $i_0 = 0$). In other words, $\hat{n}_{i_z}$ is the last estimation for which a bad event has occurred. The attacker's benefit from performing the attack with less than his maximum power, i.e., his profit during the learning process, is calculated as follows: The attacker's payoff when he randomly answers the puzzle of the punishment phase designed on the basis of the actual coalition size $n^*$ is $c_{PP} + c_{VP}$ (as stated earlier, the cost to the defender in producing or verifying a puzzle is almost independent of the puzzle's difficulty level, and so, we have used $c_{PP}$ and $c_{VP}$ instead of $c_{PP_i}$ and $c_{VP_i}$). Therefore, the attacker's maximum benefit during the learning process is

$$\sum_{k=0}^{z} F(\hat{n}_{i_k})\left(v_2^k - (c_{PP} + c_{VP})\right),$$

where $v_2^k$ is the attacker's payoff in solving the difficult puzzle designed on the basis of the coalition size $\hat{n}_{i_k}$. Thus, for a discount factor near unity, the number $\tilde{t}$ of extra punishment periods with the actual difficult puzzle $P_2^*$ that causes the attacker to comply with the equilibrium prescription is obtained from

$$\sum_{k=0}^{z} F(\hat{n}_{i_k})\left(v_2^k - (c_{PP} + c_{VP})\right) \le \tilde{t}\left(v_2 - (c_{PP} + c_{VP})\right),$$

or

$$\tilde{t} \ge \frac{\sum_{k=0}^{z} F(\hat{n}_{i_k})\left(v_2^k - (c_{PP} + c_{VP})\right)}{v_2 - (c_{PP} + c_{VP})}, \qquad (14)$$
where $v_2$ is the attacker's payoff in playing the strategy profile $\theta(P_2; RA) \oplus (1-\theta)\left[\gamma(P_1; CA) \oplus (1-\gamma)(P_1; RA)\right]$ with the parameters obtained from the actual coalition size. It is evident that $v_2 - (c_{PP} + c_{VP}) = \gamma(1-\theta)\left(c_n - c^*_{sP_1}\right)$, where $c^*_{sP_1}$ is the cost of solving the simple puzzle for the actual coalition size. Therefore, (14) is reduced to

$$\tilde{t} \ge \frac{\sum_{k=0}^{z} F(\hat{n}_{i_k})\left(v_2^k - (c_{PP} + c_{VP})\right)}{\gamma(1-\theta)\left(c_n - c^*_{sP_1}\right)}. \qquad (15)$$
By such extra punishment periods, the attacker gains nothing from performing the attack with less than his maximum power. Note that a fair learning process should satisfy

$$\sum_{k=0}^{z} F(\hat{n}_{i_k}) \le \frac{1}{c_n}.$$

This implies that the number of presumed coalition sizes $q+1$ should decrease with $c_n$. Fig. 7 shows PDM4.
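To see how (15) plays out numerically, the toy computation below uses invented reprieve lengths and payoffs; real values would come from the deployed puzzles and Theorem 2. The reprieve lengths are chosen so that their sum respects the fairness bound $1/c_n = 5$:

import math

F = [1, 2, 2]                 # F(n_hat) for the estimates tried; sum <= 1/c_n
v2_k = [0.9, 0.7, 0.5]        # attacker payoff per period at each estimate
c_pp_plus_c_vp = 0.05         # c_PP + c_VP
gamma, theta = 0.1, 0.6
c_n, c_sp1_star = 0.2, 0.05

benefit = sum(f * (v - c_pp_plus_c_vp) for f, v in zip(F, v2_k))
per_period_loss = gamma * (1 - theta) * (c_n - c_sp1_star)
t_tilde = math.ceil(benefit / per_period_loss)   # smallest integer meeting (15)
print(t_tilde)   # 509 extra punishment periods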
6 DISCUSSION
This section discusses some aspects of the puzzle-based
defense mechanisms proposed in this paper and outlines
future research in the game-theoretic study of the client-
puzzle approach. It also compares these mechanisms with
some of the earlier puzzle-based defenses against flooding
attacks.
Foremost, the network model of Section 2 is abstract enough to be employed in designing puzzle-based mechanisms usable in different contexts and applications.
tions. For example, its abstract metric spaces and reference
distances enable a mechanism to deploy different types of
puzzles. If a puzzle imposes a number of computational
steps, the resource engaged by the attacker is the CPU time,
and the corresponding metric reflects the number of CPU
clocks the attacker spends on solving the puzzle. If a puzzle
imposes memory accesses, the metric identifies the amount of memory used. It is worth noting that a designer should specify the resources, their metric spaces, their corresponding reference distances, and the capabilities of a prospective attacker according to the application he intends to protect against flooding attacks. This implies that the practical implementation of a puzzle-based mechanism may vary from one application to another.
There are some weaknesses in the earlier puzzle-based
mechanisms that are resolved in the current paper. In the
challenge-response approach [27], upon receiving a request,
the defender is required to respond with a puzzle of the
current highest level of difficulty. The defender allocates
resources only if he receives the correct solution from the
requester. By adapting the puzzle difficulty levels propor-
tional to the current load, the defender can force clients to
solve puzzles of varying difficulty levels. As stated earlier,
such an approach does not account for the quality of service of the legitimate users. Furthermore, when the defender's current load is low, he may produce simple puzzles for a large number of incoming requests sent by the attack coalition, since the requests themselves do not change his load. The attacker can then solve these ineffective puzzles to deplete the defender's resources. This is not the case for the
mechanisms proposed in this paper (see Section 5.3). A
conservative defender, who uses the challenge-response
mechanism, may resolve this by issuing the puzzles more
difficult than the ones indicated by his current load. Nevertheless, the lack of a suitable procedure for this issue may result in unnecessarily difficult puzzles and, consequently, a low quality of service for the legitimate users.
In the puzzle auctions protocol [28], given a request, the
defender, according to his current load, can either accept
the request and continue with the protocol or send a
rejection to the requester. The latter may then increase the
difficulty level of the puzzle and send the request back to
the defender. Upon the receipt of a rejection, the requester
solves the puzzle with double the difficulty level of the last
rejected one (he computes one further zero bit using hash
operations). In this way, a legitimate user may encounter
repeated rejections until reaching an acceptable difficulty
level. Therefore, the approach should be supplemented by a
suitable mechanism to estimate the appropriate difficulty
level. Furthermore, in the puzzle auctions protocol, the
requester is required to choose a maximum number of hash
operations he is willing to perform for solving a puzzle. If
this number is not accurately chosen, the requester may not
attain the service when there is a flooding attack.
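For concreteness, the escalation described above can be pictured with a hash-reversal puzzle of the kind common in this literature. The sketch below only illustrates the "one further zero bit" idea; it is not the exact protocol of [28]:

import hashlib
from itertools import count

def solve(challenge: bytes, level: int) -> int:
    # Find a nonce whose hash with the challenge has `level` leading zero bits.
    for nonce in count():
        digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
        if int.from_bytes(digest, "big") >> (256 - level) == 0:
            return nonce

def verify(challenge: bytes, level: int, nonce: int) -> bool:
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - level) == 0

# Each rejection raises the level by one bit, doubling the expected work.
nonce = solve(b"server-challenge", 12)
print(verify(b"server-challenge", 12, nonce))   # True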
The mechanisms proposed in the current paper are based on a number of assumptions. For example, it is assumed that $N(c_{PP} + c_{VP}) \ll 1$. In other words, the defender should be able to produce the puzzles and verify the solutions in an efficient way. According to the fairness property stated in Definition 5, this is a necessary condition for a defense mechanism to be effective. A similar assumption has also been made in the earlier puzzle-based defenses.
Another assumption is that the defender is at least
capable of sending reply messages to the origins of
incoming requests. All the earlier puzzle-based defenses
are also based on such an assumption [22], [28]. This
seemingly restricts the applicability of the proposed
mechanisms in the case of bandwidth exhaustion attacks
in which the attacker sends a huge number of service
requests to deplete the victim's bandwidth. However, it can
be envisioned that by coordinating multiple routers
installed with the defense mechanisms proposed in this
paper, one can restrain the attack flows before they
converge to the victim. Such an approach has been suggested in [30]. Nevertheless, the game-theoretic approach employed in the current paper is not sufficient for handling such a case. For example, providing incentives for the routers to cooperate in the defense is an important issue deserving further research. More specifically, it can be studied through cooperative game theory.
Another assumption made in this paper is the complete
rationality of the players. Evidently, the defense strategies
proposed in this paper may not be optimal if the attacker
has a bounded level of rationality. In other words, the
defender can gain payoffs better than the ones attainable by
the mechanisms of this paper when his opponent is not
completely rational. Again, game theory has specific
solutions for such circumstances as well.
7 CONCLUSION
This paper utilizes game theory to propose a number of
puzzle-based defenses against flooding attacks. It is shown
that the interactions between an attacker who launches a
flooding attack and a defender who counters the attack
using a puzzle-based defense can be modeled as an
infinitely repeated game with discounted payoffs. Then, the solution concepts of this type of game are deployed to find the solutions, i.e., the best strategy a rational defender can
adopt in the face of a rational attacker. In this way, the
optimal puzzle-based defense strategies are developed.
More specifically, four defense mechanisms are pro-
posed. PDM1 is derived from the open-loop solution
concept in which the defender chooses his actions regard-
less of what happened in the game history. This mechanism
is applicable in defeating the single-source and distributed
attacks, but it cannot support the higher payoffs that are feasible in the game. PDM2 resolves this by using the
closed-loop solution concepts, but it can only defeat a
single-source attack. PDM3 extends PDM2 and deals with distributed attacks. This defense is based on the assumption that the defender knows the size of the attack coalition. Finally, PDM4, the ultimate defense mechanism, handles the case in which the size of the attack coalition is unknown.

Fig. 7. PDM4: The defense mechanism against distributed flooding attacks where the size of the attack coalition is unknown. For a given coalition size, one function returns the elements of $S = (c_{sP_1}, c_{sP_2}, t, m, \theta, \gamma, n_p)$ except for $n_p$; the function $F$ determines the length of the reprieve period.
The mechanisms proposed in this paper can also be
integrated with reactive defenses to achieve synergetic
effects. A complete flooding attack solution is likely to
require some kind of defense during the attack traffic
identification. The mechanisms of this paper can provide
such defenses. On the other hand, the estimations made by
a reactive mechanism can be used in tuning the mechan-
isms proposed in this paper.
APPENDIX A
PROOF OF THEOREM 1
A mixed strategy for the defender is of the form $\sigma_1 = \gamma_1 P_1 \oplus \gamma_2 P_2 \oplus \cdots \oplus \gamma_n P_n$, where $\gamma_1 + \gamma_2 + \cdots + \gamma_n = 1$. The attacker's best response to $\sigma_1$ is a pure strategy $b$ with $b(i) = CA$ for $i \in P_L = \{j \mid j \le k,\ \gamma_j \ne 0\}$ and $b(i) = RA$ for $i \in P_H = \{j \mid j \ge k+1,\ \gamma_j \ne 0\}$. Therefore, any probability distribution over these pure best responses makes an attacker's mixed best response to $\sigma_1$, denoted by $\sigma_2$. In such a case, the profile $(\sigma_1; \sigma_2)$ is a Nash equilibrium if, for any $1 \le i, j \le n$ with $\gamma_i, \gamma_j \ne 0$, $u_1(P_i; \sigma_2) = u_1(P_j; \sigma_2)$. Since

$$u_1(P_i; \sigma_2) = -c_n - c_{PP_i} - c_{VP_i} - \lambda c_{sP_i} + c_{sP_i}$$

for $i \in P_L$ and

$$u_1(P_i; \sigma_2) = -c_{PP_i} - c_{VP_i} - \lambda c_{sP_i}$$

for $i \in P_H$, this condition necessitates $\lambda = 1$ if there are two or more distinct indices in $P_L$. Similarly, it necessitates $\lambda = 0$ if there are two or more distinct indices in $P_H$. Thus, the only possible cases are 1) $|P_L| = |P_H| = 1$, 2) $|P_L| = 1$ and $|P_H| = 0$, and 3) $|P_L| = 0$ and $|P_H| = 1$. In Case 1, $\sigma_1 = \gamma P_i \oplus (1-\gamma) P_j$, where $i \in P_L$ and $j \in P_H$. We prove that $i = k$ and $j = k+1$. If $(\gamma P_i \oplus (1-\gamma) P_j; \sigma_2)$ is a Nash equilibrium, $P_i$ and $P_j$ are the only defender's best responses to $\sigma_2$. Thus, $u_1(P_i; \sigma_2) = u_1(P_j; \sigma_2) \ge u_1(P_l; \sigma_2)$ for any $l \ne i, j$. For $i \ne k$, we have $u_1(P_i; \sigma_2) \ge u_1(P_k; \sigma_2)$, or

$$-c_n - c_{PP_i} - c_{VP_i} - \lambda c_{sP_i} + c_{sP_i} \ge \eta\left(-c_{PP_k} - \lambda c_{sP_k}\right) + \eta'\left(-c_{PP_k} - c_{VP_k} - \lambda c_{sP_k}\right) + \eta''\left(-c_n - c_{PP_k} - c_{VP_k} - \lambda c_{sP_k} + c_{sP_k}\right),$$

where $\eta + \eta' + \eta'' = 1$. This is reduced to

$$\eta' c_{VP_k} + \eta''\left(c_n + c_{VP_k} - c_{sP_k}\right) \ge c_n + c_{VP_i} + \lambda c_{sP_i} - c_{sP_i} - \lambda c_{sP_k}.$$

As $\eta' c_{VP_k} + \eta''\left(c_n + c_{VP_k} - c_{sP_k}\right) < c_n + c_{VP_i} + \lambda c_{sP_i} - c_{sP_i} - \lambda c_{sP_k}$ for any $\eta \in [0, 1]$, the inequality cannot be satisfied. This is a contradiction, and hence, $i = k$. By a similar argument and under the hypothesis $c_{sP_{k+1}} > c_n + c_{VP}$, it is proved that $j = k+1$. Furthermore, $u_1(P_i; \sigma_2) = u_1(P_j; \sigma_2)$ yields $\lambda = (c_n - c_{sP_k})/(c_{sP_{k+1}} - c_{sP_k})$. This completes the proof in Case 1, which leads to the profiles of type 1 in Theorem 1. The same arguments culminate in the profiles of type 2 for Case 2, and the ones of type 3 for Case 3. $\square$
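Under the reconstructed notation, the Case 1 indifference value of $\lambda$ is easy to sanity-check numerically; the cost values below are invented for illustration and satisfy $c_{sP_1} < c_n < c_{sP_2}$:

# Defender is indifferent between P_1 (answered correctly) and P_2
# (answered randomly) exactly when lambda equals the ratio below.
c_n, c_sP1, c_sP2 = 0.2, 0.05, 0.5
lam = (c_n - c_sP1) / (c_sP2 - c_sP1)
u1_P1 = -c_n - lam * c_sP1 + c_sP1   # c_PP, c_VP omitted (common terms)
u1_P2 = -lam * c_sP2
print(lam, round(u1_P1, 9) == round(u1_P2, 9))   # 0.333..., True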
APPENDIX B
PROOF OF THEOREM 2
For an arbitrary defender's strategy $\gamma P_1 \oplus (1-\gamma) P_2$, the attacker's maximum payoff is $\gamma\left(c_{PP} + c_{VP} + c_n - c_{sP_1}\right) + (1-\gamma)\left(c_{PP} + c_{VP}\right)$, which is minimized at $\gamma = 0$. Therefore, the attacker's minmax payoff is $c_{PP} + c_{VP}$. On the other hand, the general form of an attacker's strategy is

$$\sigma_2 = \eta_1(QT, QT) \oplus \eta_2(QT, RA) \oplus \cdots \oplus \eta_9(CA, CA).$$

Thus, $u_1(P_1; \sigma_2)$ and $u_1(P_2; \sigma_2)$ take their minimums, $m_1$ and $m_2$, at $\sigma_2 = \eta_7(CA, QT) \oplus \eta_8(CA, RA) \oplus \eta_9(CA, CA)$ and $\sigma_2 = \eta_2(QT, RA) \oplus \eta_5(RA, RA) \oplus \eta_8(CA, RA)$, respectively. The only strategy minimizing both $u_1(P_1; \sigma_2)$ and $u_1(P_2; \sigma_2)$ is $\sigma_2 = (CA, RA)$. Therefore, any $\sigma_2 \ne (CA, RA)$ that does not minimize $u_1(P_1; \sigma_2)$ or $u_1(P_2; \sigma_2)$ cannot participate in minmaxing. If $\lambda < (c_n - c_{sP_1})/(c_{sP_2} - c_{sP_1})$, then $m_2 > m_1$. Hence, since $u_1(P_2; \sigma_2) \ge m_2$ for any $\sigma_2$, we have $\max\{u_1(P_1; \sigma_2), u_1(P_2; \sigma_2)\} \ge m_2$, with equality at some of the points for which $\sigma_2 = \eta_2(QT, RA) \oplus \eta_5(RA, RA) \oplus \eta_8(CA, RA)$. As $\max\{u_1(P_1; (CA, RA)), u_1(P_2; (CA, RA))\} = m_2$, the minmax profile is $(P_2; (CA, RA))$. $\square$
ACKNOWLEDGMENTS
The author would like to thank the anonymous reviewers
for their helpful comments.
REFERENCES
[1] D. Moore, C. Shannon, D.J. Brown, G.M. Voelker, and S. Savage,
Inferring Internet Denial-of-Service Activity, ACM Trans.
Computer Systems, vol. 24, no. 2, pp. 115-139, May 2006.
[2] A. Hussain, J. Heidemann, and C. Papadopoulos, A Frame-
work for Classifying Denial of Service Attacks, Proc. ACM
SIGCOMM 03, pp. 99-110, 2003.
[3] A.R. Sharafat and M.S. Fallah, A Framework for the Analysis
of Denial of Service Attacks, The Computer J., vol. 47, no. 2,
pp. 179-192, Mar. 2004.
[4] C.L. Schuba, I.V. Krsul, M.G. Kuhn, E.H. Spafford, A. Sundaram,
and D. Zamboni, Analysis of a Denial of Service Attack on TCP,
Proc. 18th IEEE Symp. Security and Privacy, pp. 208-223, 1997.
[5] Smurf IP Denial-of-Service Attacks. CERT Coordination Center,
Carnegie Mellon Univ., 1998.
[6] Denial-of-Service Tools. CERT Coordination Center, Carnegie
Mellon Univ., 1999.
[7] Denial-of-Service Attack via Ping. CERT Coordination Center,
Carnegie Mellon Univ., 1996.
[8] IP Denial-of-Service Attacks. CERT Coordination Center, Carnegie
Mellon Univ., 1997.
[9] J. Mirkovic and P. Reiher, A Taxonomy of DDoS Attacks and
DDoS Defense Mechanisms, ACM SIGCOMM Computer Commu-
nication Rev., vol. 34, no. 2, pp. 39-53, Apr. 2004.
[10] J. Ioannidis and S. Bellovin, Implementing Pushback: Router-
Based Defense Against DDoS Attacks, Proc. Network and
Distributed System Security Symp. (NDSS 02), pp. 6-8, 2002.
[11] S. Savage, D. Wetherall, A. Karlin, and T. Anderson, Practical
Network Support for IP Traceback, Proc. ACM SIGCOMM 00,
pp. 295-306, 2000.
[12] D. Song and A. Perrig, Advanced and Authenticated Marking
Schemes for IP Traceback, Proc. IEEE INFOCOM 01, pp. 878-886,
2001.
[13] A. Yaar, A. Perrig, and D. Song, PI: A Path Identification
Mechanism to Defend Against DDoS Attacks, Proc. IEEE Symp.
Security and Privacy, pp. 93-109, 2003.
[14] D. Dean, M. Franklin, and A. Stubblefield, An Algebraic
Approach to IP Traceback, ACM. Trans. Information and System
Security, vol. 5, no. 2, pp. 119-137, May 2002.
[15] A.C. Snoeren, C. Partridge, L.A. Sanchez, C.E. Jones,
F. Tchakountio, S.T. Kent, and T. Strayer, Hash-Based IP Trace-
back, Proc. ACM SIGCOMM 01, pp. 3-14, 2001.
[16] A. Yaar, D. Song, and A. Perrig, SIFF: A Stateless Internet Flow
Filter to Mitigate DDoS Flooding Attacks, Proc. IEEE Symp.
Security and Privacy, pp. 130-146, 2004.
[17] J. Mirkovic and P. Reiher, D-WARD: A Source-End Defense
Against Flooding Denial-of-Service Attacks, IEEE Trans. Depend-
able and Secure Computing, vol. 2, no. 3, pp. 216-232, July/Sept. 2005.
[18] P. Ferguson and D. Senie, Network Ingress Filtering: Defeating
Denial of Service Attacks which Employ IP Source Address Spoofing,
RFC 2267, Jan. 1998.
[19] C. Meadows, A Cost-Based Framework for Analysis of
Denial of Service in Networks, J. Computer Security, vol. 9,
nos. 1-2, pp. 143-164, Jan. 2001.
[20] J. Leiwo and Y. Zheng, A Method to Implement a Denial of
Service Protection Base, Proc. Australian Conf. Information Security
and Privacy, pp. 90-101, 1997.
[21] J. Leiwo, P. Nikander, and T. Aura, Towards Network Denial of
Service Resistant Protocols, Proc. 15th Intl Information Security
Conf., pp. 301-310, 2000.
[22] A. Jules and J. Brainard, Client-Puzzles: A Cryptographic
Defense Against Connection Depletion, Proc. Network and
Distributed System Security Symp. (NDSS 99), pp. 151-165, 1999.
[23] D. Dean and A. Stubblefield, Using Client Puzzles to Protect
TLS, Proc. 10th Ann. Usenix Security Symp., pp. 1-8, 2001.
[24] W. Feng, The Case for TCP/IP Puzzles, Proc. ACM SIGCOMM
Workshop Future Directions in Network Architecture, pp. 322-327,
2003.
[25] T. Aura, P. Nikander, and J. Leiwo, DoS-Resistant Authenti-
cation with Client Puzzles, Proc. Eighth Security Protocols
Workshop, pp. 170-178, 2000.
[26] B. Waters, A. Jules, J. Halderman, and E. Felten, New Client
Puzzle Outsourcing Techniques for DoS Resistance, Proc. ACM
Conf. Computer and Comm. Security, pp. 246-256, 2004.
[27] W. Feng, E. Kaiser, W. Feng, and A. Luu, The Design and
Implementation of Network Puzzles, Proc. 24th Ann. Joint Conf.
IEEE Computer and Comm. Societies, pp. 2372-2382, 2005.
[28] X. Wang and M. Reiter, Defending Against Denial-of-Service
Attacks with Puzzle Auctions, Proc. IEEE Security and Privacy,
pp. 78-92, 2003.
[29] B. Bencsath, I. Vajda, and L. Buttyan, A Game Based Analysis of
the Client Puzzle Approach to Defend Against DoS Attacks, Proc.
11th Intl Conf. Software, Telecomm., and Computer Networks, pp. 763-
767, 2003.
[30] A. Mahimkar and V. Shmatikov, Game-Based Analysis of Denial-
of-Service Prevention Protocols, Proc. 18th Computer Security
Foundations Workshop, pp. 287-301, 2005.
[31] H. Gintis, Game Theory Evolving: A Problem-Centered Introduction to
Modeling and Strategic Behavior. Princeton Univ. Press, pp. 129-130,
2000.
[32] D. Fudenberg and E. Maskin, The Folk Theorem for Repeated
Games with Discounting and Incomplete Information, Econo-
metrica, vol. 54, no. 3, pp. 533-554, May 1986.
[33] M. Abadi, M. Burrows, M. Manasse, and T. Wobber, Moderately
Hard, Memory-Bound Functions, Proc. Network and Distributed
System Security Symp. (NDSS 03), pp. 25-39, 2003.
[34] F. Forges, Note on Nash Equilibria in Infinitely Repeated Games
with Incomplete Information, Intl J. Game Theory, vol. 13, no. 3,
pp. 179-187, Sept. 1984.
[35] S. Hart, Nonzerosum Two-Person Repeated Games with
Incomplete Information, Math. of Operations Research, vol. 10,
no. 1, pp. 117-153, Feb. 1985.
[36] F. Forges, Repeated Games of Incomplete Information: Non-Zero
Sum, Handbook of Game Theory, R. Aumann and S. Hart, eds., vol. 1,
pp. 155-177, Elsevier Science, 1992.
Mehran S. Fallah received the PhD degree
from Tarbiat Modares University in 2002. He is
currently an assistant professor in the Depart-
ment of Computer Engineering, Amirkabir
University of Technology (Tehran Polytechnic),
Iran. His research interests include information
security, system modeling, and formal specifi-
cation and verification of computing systems.