(IJCSIS) International Journal of Computer Science and Information Security, Vol. 11, No. 6, June 2013
relation to competitive multi-armed bandit problem. The multi-armed bandit problem is well understood for a single CR that wishes to opportunistically exploit the availability of resources in the spectrum. For multiple users, however, the outcomes are not clear. This difficulty is due in particular to the interactive behavior of the underlying decision-making processes in a dynamic environment. The authors proposed a Bayesian approach to identify a tradeoff metric between exploring the availability of other free channels/time slots and exploiting the opportunities identified. In [10], game-theoretic learning and pricing were proposed. In the above references, stochastic formulations of the medium access problem are examined. These formulations often lead to intractable configurations. The authors of [19] proposed a new two-step game in which sensing and opportunistic access are jointly considered. A full characterization of the Nash equilibria and an analysis of the optimal pricing policy, from the network owner's view, for both the centralized and the decentralized setting, are also provided. Next, a combined learning algorithm is proposed that is fully distributed and allows the cognitive users to learn their optimal payoffs and their optimal strategies in both symmetric and asymmetric cases.
B. Contribution
In this paper, we propose to associate game theory with learning strategies in cognitive medium access, in order to find the equilibrium sensing time. To the best of our knowledge, this is the first paper devoted to analyzing distributed sensing. Meanwhile, the related literature usually considers the sensing time from an optimization perspective [18] [13] [5]. In contrast to the classical literature on medium access games, which does not focus on the random nature of cognitive radios, we propose a fully distributed strategic learning scheme to learn the equilibrium payoff and the associated equilibrium strategies. Moreover, we provide several results that help to understand the possible relationship between sensing time and transmit probability. Next, we analyze the impact of the starting point and the speed of learning on convergence to the Nash equilibrium. Finally, we compare the proposed solution with a centralized one in terms of sensing time and throughput.
C. Organization of the paper
This paper is organized as follows. In section II, the system model, the main notations and spectrum sensing preliminaries are presented. In section III, we describe the utility function of the game and the equilibrium analysis. In section IV, we propose a distributed learning algorithm, and in section V, a comparison between the proposed solution and a centralized one. Performance evaluation and results analysis are provided in section V.
II. SYSTEM MODEL AND MAIN NOTATIONS
A. System model
We consider a secondary network that coexists with a primary network in which each PU is licensed to transmit whenever he/she wishes, most of the time, except when the channel is occupied by another PU. The duration of a primary frame is denoted T. We consider N SUs trying to access the spectrum of the PU. Throughout this work, the following considerations are taken into account:
• Energy-based spectrum sensing: the primary network activity is determined by measuring the strength of the signal traveling over the channel. If the received signal power exceeds some given threshold, the channel is declared busy; otherwise it is declared idle.
• Imperfect sensing, in the sense that SUs may declare a channel busy while it is idle (false alarm).
• Random access for data transmission of CRs. Here, we consider that SUs follow a slotted Aloha-like protocol to transmit data.
During the primary user's activity, each SU i receives some given signal. SU i samples the received signal at sampling frequency $f_s$; without loss of generality, we assume that all SUs use the same sampling frequency. The discrete received signal at SU i can be represented as:
$$Y_i(t) = \begin{cases} h_i \cdot S(t) + n(t), & \text{Hypothesis } H_1 \text{ (Busy)} \\ n(t), & \text{Hypothesis } H_0 \text{ (Idle)} \end{cases} \qquad (1)$$
where $h_i$ is the channel gain experienced by SU i and $n(t)$ is a circularly symmetric complex Gaussian noise with mean 0 and variance $E[|n(t)|^2] = \sigma$. The channel state is considered as the binary hypothesis test between $H_0$ and $H_1$.
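As an illustration of the signal model in (1), the following minimal Python sketch draws received samples under either hypothesis. It is only a sketch under stated assumptions: the function names, the unit default channel gain, and the constant-modulus, random-phase primary signal are hypothetical choices made for the example and are not specified by the model above.

```python
import math
import random


def complex_gaussian(variance, rng):
    """Draw one circularly symmetric complex Gaussian sample with E[|n|^2] = variance."""
    # The variance is split evenly between the real and imaginary parts.
    std = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, std), rng.gauss(0.0, std))


def received_samples(num_samples, noise_var, busy, h=1.0, signal_power=1.0, rng=None):
    """Generate samples following (1): noise only (H0) or h * S(t) + noise (H1)."""
    rng = rng or random.Random()
    samples = []
    for _ in range(num_samples):
        y = complex_gaussian(noise_var, rng)
        if busy:
            # Assumption: the PU signal has constant modulus and a uniform random phase.
            phase = rng.uniform(0.0, 2.0 * math.pi)
            s = math.sqrt(signal_power) * complex(math.cos(phase), math.sin(phase))
            y += h * s
        samples.append(y)
    return samples
```

Under these assumptions the mean sample power is close to the noise variance when the channel is idle, and close to the noise variance plus |h|^2 times the signal power when it is busy, which is the gap that an energy detector exploits.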
B. Energy based spectrum sensing
Spectrum sensing is often considered as a detection problem. Many techniques have been developed to detect the holes in the spectrum band. Focusing on each narrow band, existing spectrum sensing techniques are broadly categorized into energy detection [6] and feature detection [7]. Although this is not a restriction of our work, we will use energy detection throughout the paper.
Let $\tau_i$ be the sensing time and $N$ the number of considered samples. Thus, we have $N = [\tau_i f_s]$. It follows that the average energy detected by SU i is:
$$T_i(Y) = \frac{1}{N} \sum_{t=1}^{N} |Y_i(t)|^2 \qquad (2)$$
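The test statistic in (2) and the threshold decision described in the system model can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and reading the bracket in the sample count as a floor is an assumption, since the text does not say which rounding is meant.

```python
import math


def num_samples(sensing_time, sampling_freq):
    """Number of samples collected during the sensing time.

    Assumption: the bracket in N = [tau * f_s] is read as a floor.
    """
    return int(math.floor(sensing_time * sampling_freq))


def average_energy(samples):
    """Average energy of the received samples, as in (2)."""
    return sum(abs(y) ** 2 for y in samples) / len(samples)


def channel_is_busy(samples, threshold):
    """Declare the channel busy when the average energy exceeds the threshold."""
    return average_energy(samples) > threshold
```

For example, `average_energy([1 + 0j, 2j, 3 + 4j])` averages the squared magnitudes 1, 4 and 25, and `channel_is_busy` then compares that average with the chosen threshold.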
C. Imperfect sensing
We consider throughout this work a scenario where the spectrum sensor has imperfect detection performance. In other words, each SU i has a false alarm probability $P_{fi}$, i.e., the probability that the channel is sensed to be busy while it is actually idle. Let $\epsilon$ denote the threshold which specifies the collision tolerance bound of PUs. Then:
$$P_{fi}(\epsilon, \tau_i) = \Pr(T_i(Y) > \epsilon \mid H_0) \qquad (3)$$
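To give a feel for how the false alarm probability in (3) depends on the number of samples, the sketch below estimates it in two ways. The Monte Carlo routine follows (3) directly for noise-only samples. The closed form is the usual central-limit (Gaussian) approximation for an energy detector with circularly symmetric complex Gaussian noise; it is not stated in the text above and is included only as an assumption for comparison. All function names are hypothetical, and `noise_var` plays the role of the noise variance $\sigma$.

```python
import math
import random


def q_function(x):
    """Tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def false_alarm_clt(threshold, num_samples, noise_var):
    """Gaussian approximation of Pr(T > threshold | H0) (assumed, not from the text)."""
    return q_function((threshold / noise_var - 1.0) * math.sqrt(num_samples))


def false_alarm_monte_carlo(threshold, num_samples, noise_var, trials=2000, seed=0):
    """Estimate Pr(T > threshold | H0) by simulating noise-only sensing windows."""
    rng = random.Random(seed)
    std = math.sqrt(noise_var / 2.0)  # per real/imaginary component
    alarms = 0
    for _ in range(trials):
        # |n(t)|^2 is the sum of the squared real and imaginary parts.
        energy = sum(
            rng.gauss(0.0, std) ** 2 + rng.gauss(0.0, std) ** 2
            for _ in range(num_samples)
        ) / num_samples
        if energy > threshold:
            alarms += 1
    return alarms / trials
```

With a threshold above the noise variance, both estimates decrease as the number of samples grows, so a longer sensing time lowers the false alarm probability at the cost of leaving less of the frame for transmission.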
http://sites.google.com/site/ijcsis/ ISSN 1947-5500