The El Farol Bar Problem is a classic computational economics problem in which agents attempt
to attend a weekly event at a bar only if it is not too crowded. Each agent has access to multiple
competing strategies that may be used to predict whether attendance will be above the tolerance
threshold. Many different variations of the El Farol Bar Problem have been published, with a
particularly broad variation in the choice of meta-strategy, the algorithm by which each agent selects
the best-performing strategy. This paper discusses the varying mechanisms of self-organization
within the system when different meta-strategies are used, including the existence and location of
fixed points in the limit of an infinite number of agents. Informed by these mechanisms, we present
an evolutionary paradigm that reduces the likelihood of degenerate cases where attendance is always
too high or too low, while reducing the amplitude of fluctuations in attendance. This evolutionary
paradigm sheds light on the critical role of heterogeneity of strategies selected in the emergence of
a stable steady state in the El Farol Bar Problem.
arXiv:2306.07885v1 [nlin.AO] 13 Jun 2023
FIG. 1: Attendance Nt (solid blue curve) versus t for the binary decision meta-strategy, with A = 100, and various
values of T (orange dashed line), m, and s.
for larger values of m and s, attendance does fluctuate around the threshold, but fluctuations may be very large.

Fig. 2 shows broader trends in the relationship between m, s, and the size of fluctuations. The color of each cell shows the standard deviation of attendance at steady state for the given values of m and s. Fixed points (i.e., $N_t$ remains constant for t greater than some $t_0$) are observed for some very small values of m and s, visible as red cells in the lower left corner, while neighboring regimes have much larger fluctuations. Large fluctuations are also an undesirable outcome, since low attendance means resources (enjoyable bar seats) go to waste while high attendance means a large number of unsatisfied patrons. As m and s grow, the size of fluctuations appears to decrease, as illustrated in Fig. 1(f). This pattern is strikingly different from what is observed in some other variations of the problem, in which volatility increases as s grows [7, 15].

First, we address the situation observed in Fig. 1(a), i.e., we determine for which parameters attendance can settle at a fixed point $N_t = N^*$. We start by assuming that there is a fixed point $N^* \geq T$. The case where $N^* < T$ is treated in Appendix A.

For a given strategy a, if $N_{t-j} = N^*$ for all $j \in \{1, 2, \dots, m\}$, the prediction for each time point in the memory window is simply

\[
\hat{N}_{i,t} = N^* \sum_{j=1}^{m} a_{ij}. \tag{4}
\]

According to the binary decision meta-strategy, if there is any strategy available to the agent such that $\hat{N}_{i,t} \geq T$, it will be selected. So the agent will attend the bar only if all strategies have $\hat{N}_{i,t} < T$. Thus the probability $P_{go}$ that an agent attends is

\[
\begin{aligned}
P_{go} &= P\left(\hat{N}_{i,t} < T \;\; \forall\, i \in \{1, 2, \dots, s\}\right) \\
&= P\left(\hat{N}_{1,t} < T\right)^{s} \\
&= P\left(\sum_{j=1}^{m} a_{1j} < \frac{T}{N^*}\right)^{s}.
\end{aligned}
\tag{5}
\]
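The probability in Eq. (5) can be estimated by direct sampling, which gives a quick sanity check on any closed form derived from it. The following is a minimal sketch, assuming the strategy weights are drawn uniformly from [−1, 1] as stated elsewhere in the paper; the function name `p_go_binary`, the trial count, and the seed are illustrative choices and not part of the original model.

```python
import random


def p_go_binary(m, s, ratio, trials=100_000, seed=0):
    """Monte Carlo estimate of Eq. (5): P(sum_j a_1j < T/N*)^s.

    m      -- memory window (number of weights per strategy)
    s      -- number of strategies per agent
    ratio  -- T / N*, the threshold divided by the assumed fixed point
    Weights are sampled uniformly from [-1, 1] (assumption taken from the text).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        weight_sum = sum(rng.uniform(-1.0, 1.0) for _ in range(m))
        if weight_sum < ratio:
            hits += 1
    p_single = hits / trials  # probability for a single strategy
    return p_single ** s      # all s strategies must predict below T
```

For example, `p_go_binary(2, 2, 1.0)` estimates the attendance probability when the fixed point sits exactly at the threshold with m = 2 and s = 2.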
Since each weight is uniformly distributed on [−1, 1], the sum $\sum_{j} a_{1j}$ can be written as $2X - m$, where $X$ is the sum of m independent Uniform(0, 1) variables and therefore follows the Irwin-Hall distribution. Hence

\[
\begin{aligned}
P_{go} &= P\left(2X - m < \frac{T}{N^*}\right)^{s} \\
&= P\left(X < \frac{T}{2N^*} + \frac{m}{2}\right)^{s} \\
&= \left[\frac{1}{m!} \sum_{k=0}^{\left\lfloor \frac{T}{2N^*} + \frac{m}{2} \right\rfloor} (-1)^k \binom{m}{k} \left(\frac{T}{2N^*} + \frac{m}{2} - k\right)^{m}\right]^{s}.
\end{aligned}
\]

Setting $P_{go} = N^*/A$ as discussed above, we obtain

\[
\frac{N^*}{A} = \left[\frac{1}{m!} \sum_{k=0}^{\left\lfloor \frac{T}{2N^*} + \frac{m}{2} \right\rfloor} (-1)^k \binom{m}{k} \left(\frac{T}{2N^*} + \frac{m}{2} - k\right)^{m}\right]^{s}, \qquad N^* \geq T. \tag{6}
\]

FIG. 3: Solutions to Eq. (6) (blue circles) as a function of m for s = 2 (above) and as a function of s for m = 2 (below) with A = 1,000 and T = 600. Error bars show the locations of fixed points in simulation when a fixed point is theoretically predicted (e.g. predicted fixed point is above T). When fixed points are not predicted, error bars show the range of observed values in simulation.

In Fig. 3, the blue circles show the solutions to Eq. (6) as m and s increase. The dashed line shows the threshold T. For sufficiently low m and s values, solutions to Eq. (6) are above T, a necessary assumption to justify Eq. (5). Error bars on the plots in Fig. 3 show the minimum and maximum values over the last 50 time steps over 5 simulations of duration 100. In the case of fixed points very close to T, such as m = 5, s = 2, some trials failed to converge and were discarded. The error bar represents the spread of 5 trials for which attendance did converge above T. As s or m grow sufficiently, the solution drops below T, and we no longer observe fixed points in simulation. For these values, the error bars show the full spread of the observed oscillations over the last 50 trials.

The analysis for $N^* < T$ is similar (see Appendix A), and it predicts there are no fixed points below the threshold, matching what we observe in simulation.

Our analysis of the binary decision meta-strategy shows that, unless m and s are extremely large, attendance can either converge to a fixed point away from the threshold or undergo strong oscillations about it. As far as we know, fixed points for the binary decision meta-strategy have not been studied previously. For large m and s, our numerical results suggest that attendance does indeed converge to the threshold T. Overall, our results for the binary decision meta-strategy highlight the important effects of the number of strategies s and the memory window m on the collective dynamics of the agents.

FIG. 4: Attendance Nt (solid blue curve) versus t for the error minimization meta-strategy, with A = 100, and various values of T (orange dashed line), m, and s.

B. Error Minimization Meta-Strategy

Now we study the error minimization meta-strategy. In Fig. 4 we show attendance $N_t$ versus time t for different values of m, s, and T. As in the binary decision meta-strategy, there are some cases where attendance settles at an approximately constant value. When it does so, fluctuations are, in general, smaller than in the binary decision case. On the other hand, when the number of strategies s sufficiently exceeds the memory window m, as it does in Fig. 4(f), attendance can have very large oscillations, in some cases alternating between 0 and A. This is illustrated in Fig. 5, which shows the standard deviation of steady state attendance as a function of m and s. Volatility appears to increase as s grows relative to a fixed m, consistent with patterns observed by Johnson et al. and Collins [7, 15]. In the remainder of this section, we investigate the following two questions: (i) why does attendance sometimes settle around values other than the threshold T, and (ii) why does the availability of more strategies for each agent (larger s) lead to larger oscillations in attendance?

To address question (i), we solve for fixed points in weekly attendance by assuming a fixed point $N^*$ exists and determining the probability $P_{go}$ that each agent would attend the bar. As we did with the binary decision meta-strategy, we consider the limit $A \to \infty$ and identify $P_{go}$ with $N^*/A$ to derive an implicit equation for $N^*$. Supposing attendance every week is $N^*$, that is $N_t = N^*$ for all t, the cost function for each week simplifies to

\[
C_t(a) = m (N^*)^2 \left(\sum_{j=1}^{m} a_{i,j} - 1\right)^{2}. \tag{7}
\]

Hence, the optimal strategy is simply the one that minimizes the quadratic form $\left(\sum_{j=1}^{m} a_{i,j} - 1\right)^2$. In the following, we denote by $a^*$ the strategy weights selected by an arbitrary agent among their s available choices, and by $f^*$ their probability density function. In Appendix B, we show that this probability density is
\[
f^*(a^*) = \frac{s}{2^m} \left(1 - \int_{E(a^*)} \frac{1}{2^m}\, da\right)^{s-1}, \tag{8}
\]

where $E(a^*)$ is the region of strategy space that would have a lower cost than $a^*$, i.e.,

\[
E(a^*) = \{a : C(a) < C(a^*)\}. \tag{9}
\]

In order to simplify the calculations while still gaining insight into the dynamics of the system, we study the case where m = 2 in detail. In this case, the PDF of the weights, given by Eq. (8), evaluates to

\[
f^*(a^*) = \frac{s}{4} \left(1 - \frac{1}{4}\, g(a^*)\right)^{s-1}, \tag{10}
\]

where

\[
g(a^*) =
\begin{cases}
2\,|a_1^* + a_2^* - 1| & \text{if } |a_1^* + a_2^* - 1| < 1, \\[1ex]
\frac{1}{2}\left[6\,|a_1^* + a_2^* - 1| - (a_1^* + a_2^* - 1)^2 - 1\right] & \text{otherwise.}
\end{cases}
\tag{11}
\]

At the fixed point, an agent will attend the bar if their chosen strategy, $a^*$, satisfies

\[
N^* \sum_{j=1}^{m} a_j^* < T. \tag{12}
\]

Hence, we compute $P_{go}$ by integrating Eq. (10) over the region satisfying Eq. (12). With s = 2 and m = 2, as assumed in Eq. (10), this evaluates to

\[
P_{go} =
\begin{cases}
\frac{1}{24}\left[6 + 12\,\frac{T}{N^*} + 3\left(\frac{T}{N^*}\right)^{2} - 2\left(\frac{T}{N^*}\right)^{3}\right] & 0 < \frac{T}{N^*} \leq 1, \\[1ex]
\frac{1}{24}\left[-4 + 36\,\frac{T}{N^*} - 15\left(\frac{T}{N^*}\right)^{2} + 2\left(\frac{T}{N^*}\right)^{3}\right] & \text{otherwise.}
\end{cases}
\tag{13}
\]

We are now prepared to construct an implicit equation for $N^*$ by setting $P_{go} = N^*/A$:

\[
\frac{N^*}{A} =
\begin{cases}
\frac{1}{24}\left[6 + 12\,\frac{T}{N^*} + 3\left(\frac{T}{N^*}\right)^{2} - 2\left(\frac{T}{N^*}\right)^{3}\right] & 0 < \frac{T}{N^*} \leq 1, \\[1ex]
\frac{1}{24}\left[-4 + 36\,\frac{T}{N^*} - 15\left(\frac{T}{N^*}\right)^{2} + 2\left(\frac{T}{N^*}\right)^{3}\right] & \text{otherwise.}
\end{cases}
\tag{14}
\]

Eq. (14) is an implicit equation for the fixed point $N^*$ as a function of the threshold T, for m = 2, s = 2. While the calculations become more cumbersome as s is increased, we have obtained analogous expressions for m = 2 and s = 3, 4, 5 (see Appendix C). Plotting the fixed point $N^*$ as a function of T in Fig. 6, we note that the fixed point can be either above or below the threshold, depending on other parameters. Therefore, for this particular meta-strategy, attendance does not self-organize around the threshold, as is often claimed for El Farol-type problems. In Refs. [7, 8] it was observed numerically that in certain versions of El Farol Bar Problem the mean attendance can differ from the attendance threshold. Even though our implementation of the El Farol Bar Problem is different, our model and results provide an example where this issue can be explored theoretically.

FIG. 5: Standard deviation of attendance at steady state under the minimize squared errors meta-strategy. All simulations have N = 10,000, T = 6,000.

While we obtain implicit expressions for fixed points, $N^*$, our analysis does not give any information on their
stability. In some cases, instead of the fixed points predicted from our analysis, we observe large fluctuations in attendance, and we hypothesize that this is because the fixed points are unstable. This is illustrated in Fig. 7, which shows our theoretical predictions for the fixed points (blue circles), the threshold T (dashed line), and the range over which attendance varies over the last 50 time steps of 5 different simulations of length 100 (error bars). When this range is small, suggesting a fixed point, the numerical values of $N_t$ agree well with our theoretical result. For larger values of s, the fixed point appears to lose stability and large fluctuations are observed.

FIG. 7: Solutions to Eq. (14) (blue circles) as a function of s for m = 2, A = 1,000, and T = 300 (above) and T = 800 (below). Error bars show the observed range of values Nt in simulation over time steps t = 50 through t = 100 in five independent simulations.

We now address question (ii), namely, why do large oscillations occur as s becomes large. Recall that the probability density for strategies selected by an agent is given by Eq. (10). We observe that the area of $E(a^*)$, and thus the value of $\int_{E(a^*)} \frac{1}{2^m}\, da$, will equal 0 only when the strategy $a^*$ minimizes the cost function. This minimizer is unique since the cost function is quadratic. For all other choices of $a^*$, this integral will be strictly between 0 and 1. Hence, as s approaches ∞, $f^*$ approaches the Dirac delta function centered at the optimal strategy. Thus, we expect that as s approaches infinity all agents will make the same choice and every iteration will have either all agents or no agents attending. Interestingly, improved individual predictive accuracy leads to decreased global utility.

FIG. 8: The effect of evolution in different parameter ranges. Evolution begins at t = 300. All simulations were run with λ = 0.2, ϵ = 0.7, A = 1,000.

IV. EVOLUTION

Using the binary decision meta-strategy, attendance fluctuates around the threshold in all but a few anomalous regimes with very few strategies, but the size of fluctuations remains large unless both m and s are extremely large. Meanwhile, the error minimization meta-strategy
FIG. 9: Fraction of time above T at steady state with the error minimization meta-strategy, with T = 6,000, A = 10,000, ϵ = 0.5, λ = 0.2, without (top) and with (bottom) evolution.
FIG. 10: Average distance from the threshold at steady state with the binary decision meta-strategy, without (top) and with (bottom) evolution.
nize.
Instead, we propose a different approach to evolution
in the agent-based system; rather than refining individual
strategies, agents may re-draw their own m and s values
if they are not getting sufficient utility in the present
regime. Every time they change m and s, the weights
for each strategy are drawn once again from a uniform
distribution in [−1, 1]. The strategies then remain fixed
until the next time the agent redraws m and s.
The agent gives each strategy an initial trial period
of length 2m, whatever their current m value is. After
that, on each step, the agent computes their performance
over a window of length m. If their fraction of correct
choices is less than ϵ, the agent redraws m and s with
probability λ. In the simulations that follow, we give
each agent the same starting m0 and s0 values, and allow
the agents to draw uniformly at random from the range
s ∈ [s0 /2, 3s0 /2] and m ∈ [m0 /2, 3m0 /2]. Fig. 8 shows
the effect of evolution in different parameter regimes, for
λ = 0.2, ϵ = 0.7 (as we will discuss below, choices of λ
and ϵ do not significantly affect the results). In all panels,
the simulation is run without evolution for the first 300
steps. Then evolution is turned on at t = 300 (vertical
dashed line). The solid blue line shows the attendance,
Nt , and the horizontal dashed line shows the threshold,
T.
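The redraw rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name `EvolvingAgent`, the method names, the integer rounding of the ranges [m0/2, 3m0/2] and [s0/2, 3s0/2], and the lower bound of 1 are all hypothetical. The caller is assumed to supply, at each step, a boolean saying whether the agent's choice was correct; how that is computed depends on the meta-strategy in use.

```python
import random
from dataclasses import dataclass, field


@dataclass
class EvolvingAgent:
    m0: int                      # starting memory window
    s0: int                      # starting number of strategies
    rng: random.Random
    m: int = 0
    s: int = 0
    strategies: list = field(default_factory=list)
    age: int = 0                 # steps since the last redraw
    outcomes: list = field(default_factory=list)

    def __post_init__(self):
        self.redraw(self.m0, self.s0)

    def redraw(self, m, s):
        """Reset m and s and draw fresh weights uniformly from [-1, 1]."""
        self.m, self.s = m, s
        self.strategies = [
            [self.rng.uniform(-1.0, 1.0) for _ in range(m)] for _ in range(s)
        ]
        self.age = 0
        self.outcomes = []

    def _draw(self, base):
        # Assumption: integer draw from [base/2, 3*base/2], never below 1.
        return self.rng.randint(max(1, base // 2), max(1, (3 * base) // 2))

    def record(self, correct, eps, lam):
        """Log one outcome; return True if the agent redrew m and s."""
        self.age += 1
        self.outcomes.append(bool(correct))
        if self.age <= 2 * self.m:           # initial trial period of length 2m
            return False
        window = self.outcomes[-self.m:]     # performance window of length m
        if sum(window) / len(window) < eps and self.rng.random() < lam:
            self.redraw(self._draw(self.m0), self._draw(self.s0))
            return True
        return False
```

In a full simulation, `record` would be called once per agent per time step after attendance is known, with `eps` and `lam` playing the roles of ϵ and λ.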
Evolution appears to both reduce the amplitude of
large fluctuations and force attendance to cross the
threshold when it otherwise would not. If attendance
were to remain above T for a sufficiently long period
of time relative to the agents’ m values, more than T
agents would be dissatisfied with their current regime,
and thus they would be likely to redraw their parameters
until the behavior of the system switches. Likewise, if
attendance remains below T for sufficiently long, then
A − T agents would be dissatisfied at each time step,
thus redrawing their parameters. Evolution also appears
to limit the size of fluctuations. In a scenario with large
fluctuations, many agents will be consistently dissatisfied
with their outcomes, as they will be attending in over-
crowded weeks and absent in weeks with capacity. Thus,
scenarios with high fluctuations will be unstable.
Fig. 9 shows the fraction of time attendance is above T at steady state with the error minimization meta-strategy for a range of parameters without (above) and with (below) evolution. When an evolutionary paradigm is used, a larger region of strategy space has attendance above T close to half the time. Even in regions where attendance rarely crosses T, the fraction never reaches 0, while without evolution there are large regions of strategy space for which attendance never rises above the threshold. With the binary decision meta-strategy, the fraction of time above the threshold remains close to 1/2, with or without evolution (not shown).

Figs. 10 and 11 show the effect of evolution on the average distance from the threshold, T, at steady state, calculated as the time average of $|N_t - T|$. Evolution brings attendance closer to T, particularly in regions with larger fluctuations. Note that while these plots look similar to Figs. 2 and 5, distance from threshold is a different metric than standard deviation of attendance.

FIG. 11: Average distance from the threshold at steady state with the error minimization meta-strategy, without (top) and with (bottom) evolution.

Our evolutionary model does introduce two new parameters, λ and ϵ, but neither appears to have a dramatic effect on the behavior of the system. Fig. 12 summarizes the deviation from the threshold at different values of ϵ for a range of m and s values. The effects are not dramatic for ϵ ≥ 0.2. The effect of λ is even less pronounced for λ > 0, but including this parameter rather than setting it to 1 allows for a smoother beginning to the simulation, as all agents do not simultaneously change parameters.

FIG. 12: Mean, minimum, and maximum distance from threshold, T, at steady state for different values of ϵ. Five trials each were run on m and s values of 5, 10, 20, and 50.

V. DISCUSSION

One of the most intriguing features that we observe in both non-evolving regimes of the El Farol Bar Problem is that systems in which individual agents' models appear to make better predictions based on past history end up having worse utilization of the resource in question. We have seen mechanistically how increasing the number of strategies available to an agent, given a fixed memory length, drives volatility.

Appendix A

This analysis follows similar logic to the case where the fixed point is above T, discussed in Section III A. If attendance is below the threshold, an agent will attend if they have at least one strategy that predicts $\hat{N}_{i,t} < T$. So in this case, we have

\[
\begin{aligned}
P_{go} &= P\left(\exists\, i \in \{1, 2, \dots, s\},\; \hat{N}_{i,t} < T\right) \\
&= 1 - P\left(\hat{N}_{i,t} \geq T \;\; \forall\, i \in \{1, 2, \dots, s\}\right) \\
&= 1 - P\left(\sum_{j=1}^{m} a_{1j} \geq \frac{T}{N^*}\right)^{s},
\end{aligned}
\tag{A1}
\]

which, after evaluating the probability and using the self-consistent condition $P_{go} = N^*/A$, yields the implicit equation

\[
\frac{N^*}{A} = 1 - \left[1 - \frac{1}{m!} \sum_{k=0}^{\left\lfloor \frac{T}{2N^*} + \frac{m}{2} \right\rfloor} (-1)^k \binom{m}{k} \left(\frac{T}{2N^*} + \frac{m}{2} - k\right)^{m}\right]^{s}, \qquad N^* < T. \tag{A2}
\]
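Both Eq. (6) and Eq. (A2) are built from the Irwin-Hall cumulative distribution function, so they can be solved numerically with a few helper functions. The sketch below is illustrative only: the function names are hypothetical, bisection is a convenient root finder chosen here rather than the method used in the paper, and the parameter values A = 1,000, T = 600, m = 2, s = 2 in the usage note are taken from the Fig. 3 caption.

```python
from math import comb, factorial, floor


def irwin_hall_cdf(x, m):
    """CDF of the sum of m independent Uniform(0, 1) variables."""
    if x <= 0:
        return 0.0
    if x >= m:
        return 1.0
    total = sum(
        (-1) ** k * comb(m, k) * (x - k) ** m for k in range(floor(x) + 1)
    )
    return total / factorial(m)


def p_below(ratio, m):
    """P(sum of m Uniform(-1, 1) weights < ratio), where ratio = T / N*."""
    return irwin_hall_cdf(ratio / 2 + m / 2, m)


def residual_eq6(n, A, T, m, s):
    """Right-hand side minus left-hand side of Eq. (6); meaningful for n >= T."""
    return p_below(T / n, m) ** s - n / A


def residual_eqA2(n, A, T, m, s):
    """Right-hand side minus left-hand side of Eq. (A2); meaningful for n < T."""
    return 1 - (1 - p_below(T / n, m)) ** s - n / A


def bisect_root(f, lo, hi, tol=1e-9):
    """Plain bisection; requires f(lo) and f(hi) to have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)
```

Calling `bisect_root(lambda n: residual_eq6(n, 1000, 600, 2, 2), 600, 1000)` searches for a fixed point of Eq. (6) above the threshold. Evaluating `residual_eqA2` on values of n below T shows whether Eq. (A2) admits a root consistent with its own assumption N* < T.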
Fig. 13 shows solutions to Eq. (A2) as m and s increase. Solutions to Eq. (A2) were above T with all parameters tested, and hence inconsistent with the setup. No fixed points were observed below T in simulation either. While it is difficult to interpret this necessity from Eq. (A2), we observe that with the uniform [−1, 1] distribution of weights, strategies are biased toward underestimating attendance. In order to have a fixed point below T, at least $\frac{N - T}{N}$ agents would need to overestimate attendance.

Appendix B: Distribution of Strategies Chosen in Error Minimization Meta-Strategy

We consider the distribution of strategies selected by each agent under the minimize squared prediction error meta-strategy. The cost function for each strategy is given by Eq. (3). The cost function is quadratic and will always have a unique minimizer unless all N's are equal to 0. Depending on the attendance history, this minimum could be inside or outside of the base of support of the a values.
However, there must also be a unique minimizer of
Eq. (3) within the base of support of the a's. If the
optimal strategy is within the range of possible a values,
this will be the unique minimizer. We now consider the
case where the minimizer of Eq. (3) is outside of the base
of support. Taking $c_{min}$ to be the lowest cost observed
within the base of support of the a's, consider $\ell_{min}$, the
level curve of the cost function at $c_{min}$. Level curves of
Eq. (3) are all elliptical, while the base of support
of the a values is rectangular. If $\ell_{min}$ intersected the base
of support of the a's at multiple points, there would be
some region between $\ell_{min}$ and the true minimizer of the
cost function within the base of support. Since the cost
surface is convex, this region would have cost below $c_{min}$,
contradicting our construction of $c_{min}$.
Consider the cumulative distribution of strategies selected. We will use $F(a)$ to denote the cumulative distribution of strategies randomly assigned to an agent, $F^*(a)$ to denote the cumulative distribution function of the strategy selected, and $a_k^*$ to denote the kth weight of the selected strategy. Since the strategies are i.i.d.,

\[
\begin{aligned}
F^*(a) &= s \int_{-\infty}^{a_1} \cdots \int_{-\infty}^{a_m} P\left(C(a_1) < \min\{C(a_j) \mid j \in 2, \cdots, s\} \;\middle|\; a = b\right) f(b)\, db \\
&= s \int_{-\infty}^{a_1} \cdots \int_{-\infty}^{a_m} \left(1 - \int_{E(b)} f(c)\, dc\right)^{s-1} f(b)\, db.
\end{aligned}
\tag{B2}
\]
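The selection rule behind Eq. (B2), in which the chosen strategy is the lowest-cost draw among s i.i.d. candidates, can be simulated directly and compared with the closed form for $P_{go}$ in Eq. (13). The sketch below assumes the fixed-point cost of Eq. (7), so that the selected strategy minimizes $(\sum_j a_j - 1)^2$, and weights uniform on [−1, 1]. The function names, trial count, and seed are illustrative assumptions, and the closed form is only evaluated for $0 < T/N^* \leq 2$.

```python
import random


def p_go_error_min(s, ratio, m=2, trials=100_000, seed=0):
    """Monte Carlo estimate of P_go at a fixed point.

    Each trial draws s strategies with m weights uniform on [-1, 1], keeps the
    one whose weight sum is closest to 1 (minimum cost at a fixed point), and
    counts the agent as attending if that sum is below ratio = T / N*.
    """
    rng = random.Random(seed)
    attending = 0
    for _ in range(trials):
        sums = [
            sum(rng.uniform(-1.0, 1.0) for _ in range(m)) for _ in range(s)
        ]
        best = min(sums, key=lambda u: (u - 1.0) ** 2)
        if best < ratio:
            attending += 1
    return attending / trials


def p_go_eq13(x):
    """Closed form of Eq. (13) for s = 2, m = 2, with x = T / N* in (0, 2]."""
    if 0 < x <= 1:
        return (6 + 12 * x + 3 * x ** 2 - 2 * x ** 3) / 24
    return (-4 + 36 * x - 15 * x ** 2 + 2 * x ** 3) / 24
```

Comparing `p_go_error_min(2, x)` with `p_go_eq13(x)` at a few values of x between 0 and 2 gives a quick consistency check between the sampling picture and the integrated density.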
Thus, when strategies are sampled from the Uniform(−1, 1) distribution, the PDF of the chosen strategies, $(a_1^*, a_2^*, \cdots, a_m^*)$, is

\[
f^*(a) = \frac{s}{2^m} \left(1 - \int_{E(a)} \frac{1}{2^m}\, da\right)^{s-1}. \tag{B3}
\]

Appendix C: Implicit Equations for Fixed Points with the Error Minimization Meta-Strategy

Here, we show the implicit equations for additional fixed points shown in Fig. 6. For each choice of s, we compute $P_{go}$ by integrating Eq. (10) over the region satisfying Eq. (12), and derive the implicit equation by setting $P_{go} = N^*/A$. With s = 3 and m = 2, we have
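\[
\frac{N^*}{A} =
\begin{cases}
\frac{1}{64}\left[-3\left(\frac{T}{N^*}\right)^{4} + 18\left(\frac{T}{N^*}\right)^{2} + 24\,\frac{T}{N^*} + 8\right] & 0 < \frac{T}{N^*} \leq 1, \\[1ex]
\frac{1}{64}\left[-3\left(\frac{T}{N^*}\right)^{4} + 32\left(\frac{T}{N^*}\right)^{3} - 126\left(\frac{T}{N^*}\right)^{2} + 216\,\frac{T}{N^*} - 72\right] & \text{otherwise.}
\end{cases}
\tag{C1}
\]

With s = 5, m = 2, we have

\[
\frac{N^*}{A} =
\begin{cases}
\begin{aligned}
\tfrac{1}{384}\Big[ & -5\left(\tfrac{T}{N^*}\right)^{6} - 12\left(\tfrac{T}{N^*}\right)^{5} + 15\left(\tfrac{T}{N^*}\right)^{4} + 80\left(\tfrac{T}{N^*}\right)^{3} \\
& + 105\left(\tfrac{T}{N^*}\right)^{2} + 60\,\tfrac{T}{N^*} + 12 \Big]
\end{aligned}
& 0 < \frac{T}{N^*} \leq 1, \\[2ex]
\begin{aligned}
\tfrac{1}{384}\Big[ & -5\left(\tfrac{T}{N^*}\right)^{6} + 84\left(\tfrac{T}{N^*}\right)^{5} - 585\left(\tfrac{T}{N^*}\right)^{4} + 2160\left(\tfrac{T}{N^*}\right)^{3} \\
& - 4455\left(\tfrac{T}{N^*}\right)^{2} + 4860\,\tfrac{T}{N^*} - 1804 \Big]
\end{aligned}
& \text{otherwise.}
\end{cases}
\tag{C2}
\]

Finally, with s = 10, m = 2, we have

\[
\frac{N^*}{A} =
\begin{cases}
\begin{aligned}
\tfrac{1}{22528}\Big[ & -10\left(\tfrac{T}{N^*}\right)^{11} - 77\left(\tfrac{T}{N^*}\right)^{10} - 220\left(\tfrac{T}{N^*}\right)^{9} - 165\left(\tfrac{T}{N^*}\right)^{8} \\
& + 660\left(\tfrac{T}{N^*}\right)^{7} + 2310\left(\tfrac{T}{N^*}\right)^{6} + 3696\left(\tfrac{T}{N^*}\right)^{5} + 3630\left(\tfrac{T}{N^*}\right)^{4} \\
& + 2310\left(\tfrac{T}{N^*}\right)^{3} + 935\left(\tfrac{T}{N^*}\right)^{2} + 220\,\tfrac{T}{N^*} + 22 \Big]
\end{aligned}
& 0 < \frac{T}{N^*} \leq 1, \\[3ex]
\begin{aligned}
\tfrac{1}{22528}\Big[ & 10\left(\tfrac{T}{N^*}\right)^{11} - 319\left(\tfrac{T}{N^*}\right)^{10} + 4620\left(\tfrac{T}{N^*}\right)^{9} - 40095\left(\tfrac{T}{N^*}\right)^{8} \\
& + 231660\left(\tfrac{T}{N^*}\right)^{7} - 935550\left(\tfrac{T}{N^*}\right)^{6} + 2694384\left(\tfrac{T}{N^*}\right)^{5} - 5533110\left(\tfrac{T}{N^*}\right)^{4} \\
& + 7938810\left(\tfrac{T}{N^*}\right)^{3} - 7577955\left(\tfrac{T}{N^*}\right)^{2} + 4330260\,\tfrac{T}{N^*} - 1099404 \Big]
\end{aligned}
& \text{otherwise.}
\end{cases}
\tag{C3}
\]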
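The implicit equations for m = 2 are straightforward to solve numerically because the right-hand side decreases with $N^*$ while the left-hand side increases. The sketch below encodes the piecewise polynomials for s = 2 (Eq. (14)) and s = 3 (Eq. (C1)) and finds $N^*$ by bisection. The names `PIECEWISE`, `p_go`, and `fixed_point` are hypothetical, bisection is simply a convenient choice, and the value 1 returned for $T/N^* \geq 2$ is an added convention based on the fact that two weights in [−1, 1] cannot sum to more than 2.

```python
# Ascending-power coefficients in x = T / N*, valid for m = 2.
# Each entry: (denominator, coefficients for 0 < x <= 1, coefficients for 1 < x <= 2).
PIECEWISE = {
    2: (24, [6, 12, 3, -2], [-4, 36, -15, 2]),              # Eq. (14)
    3: (64, [8, 24, 18, 0, -3], [-72, 216, -126, 32, -3]),  # Eq. (C1)
}


def p_go(x, s):
    """Attendance probability as a function of x = T / N* for m = 2."""
    if x >= 2:
        return 1.0  # added convention: every strategy predicts below T
    denom, low, high = PIECEWISE[s]
    coeffs = low if x <= 1 else high
    return sum(c * x ** k for k, c in enumerate(coeffs)) / denom


def fixed_point(A, T, s, tol=1e-9):
    """Solve p_go(T / n, s) = n / A for n in (0, A] by bisection."""
    def residual(n):
        return p_go(T / n, s) - n / A

    lo, hi = 1e-6, float(A)  # residual(lo) > 0 and residual(hi) <= 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With A = 1,000 as in Fig. 7, calling `fixed_point(1000, 300, 2)` and `fixed_point(1000, 800, 2)` gives one solution above its threshold and one below, which illustrates the point made in the main text that the fixed point need not coincide with T.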
[1] W. Brian Arthur. Inductive reasoning and bounded rationality. The American Economic Review, 84(2):406–411, 1994.
[2] Eduardo Zambrano. The interplay between analytics and computation in the study of congestion externalities: The case of the El Farol problem. Journal of Public Economic Theory, 6(2):375–395, 2004.
[3] Miklos Szilagyi. The El Farol bar problem as an iterated n-person game. Complex Systems, 21, 06 2012.
[4] Shu-Heng Chen and Umberto Gostoli. On the complexity of the El Farol bar game: A sensitivity analysis. In Proceedings of the Agent-Directed Simulation Symposium, ADS '16, San Diego, CA, USA, 2016. Society for Computer Simulation International.
[5] Shu-Heng Chen and Umberto Gostoli. Coordination in the El Farol bar problem: The role of social preferences and social networks. Journal of Economic Interaction and Coordination, 12:59–93, 03 2017.
[6] Damien Challet, M Marsili, and Gabriele Ottino. Shedding light on El Farol. Physica A: Statistical Mechanics and its Applications, 332:469–482, 2004.
[7] N.F. Johnson, S. Jarvis, R. Jonson, P. Cheung, Y.R. Kwong, and P.M. Hui. Volatility and agent adaptability in a self-organizing market. Physica A: Statistical Mechanics and its Applications, 258(1):230–236, 1998.
[8] Hilmi Luş, Cevat Onur Aydın, Sinan Keten, Hakan İsmail Ünsal, and Ali Rana Atılgan. El Farol revisited. Physica A: Statistical Mechanics and its Applications, 346(3):651–656, 2005.
[9] M.A.R. Cara, O. Pla, and Francisco Guinea. Competition, efficiency and collective behavior in the "El Farol" bar model. Physics of Condensed Matter, 10:187–191, 07 1999.
[10] Neil F Johnson, Michael Hart, and P.M Hui. Crowd effects and volatility in markets with competing agents. Physica A: Statistical Mechanics and its Applications, 269(1):1–8, 1999.
[11] Phase Spaces of the Strategy Evolution in the El Farol Bar Problem, volume ALIFE 2020: The 2020 Conference on Artificial Life of ALIFE 2021: The 2021 Conference on Artificial Life, 07 2020.
[12] D. Challet and Y.-C. Zhang. Emergence of cooperation and organization in an evolutionary game. Physica A: Statistical Mechanics and its Applications, 246(3):407–418, 1997.
[13] D.B. Fogel, K. Chellapilla, and P.J. Angeline. Inductive reasoning and bounded rationality reconsidered. IEEE Transactions on Evolutionary Computation, 3(2):142–146, 1999.
[14] Duncan Whitehead. The El Farol Bar Problem Revisited: Reinforcement Learning in a Potential Game. Edinburgh School of Economics Discussion Paper Series 186, Edinburgh School of Economics, University of Edinburgh, September 2008.
[15] Andrew J. Collins. Strategically forming groups in the El Farol bar problem. In Proceedings of the 2017 International Conference of The Computational Social Science Society of the Americas, CSS 2017, New York, NY, USA, 2017. Association for Computing Machinery.