Benoit Jottreau*
* Department of Applied Mathematics, Universite Paris-Est,
Cite Descartes, 5 Bd Diderot, Champs sur Marne, 77454 Marne La Vallee, France;
E-mail: benoit jottreau@yahoo.fr
Abstract. Online soccer betting has become a complex continuous-time market. Hence, there is a clear need for sophisticated continuous-time models, as the original simple Poisson model of Maher cannot reproduce the prices and dynamics observed in the market. In order to price bets more accurately, Dixon and Robinson proposed modulated Poisson processes for goals. Nevertheless, as few closed-form prices are available for the most liquid bets, they chose to calibrate their model on historical data. We generalize their models and describe a procedure to calibrate the model implicitly from a set of basic bet prices. Our main result is the expression of the prices of correct score bets in this model.
1. Introduction
1.1 Introduction
Back in the 20th century, betting on soccer was limited to bets on the match outcome (win/draw/lose) and on the correct score. In the 1990s, the betting market began to grow rapidly in volume and in diversity of bets. We can count more than one hundred different bets that can be taken on a single soccer match: from the shirt number of the first goalscorer to the number of substitutions made by each team, by way of the time of the n-th goal in the match. Moreover, some of these bets can be taken while the match is being played. Hence, modelling the final score is not enough to price such exotic bets, whose payoff is determined by the whole path of the match process.
The first attempts to price bets modelled the outcome as a discrete variable taking values in the set (win, draw, lose). In these link-models, each team is given a rating, so that the strength differential in the match is modelled as the difference of ratings, and the discrete law is constructed from this rating differential. Maher (1982) tried to model directly the final numbers of goals scored by each team. In his model, the final score follows the law of a pair of independent Poisson variables whose parameters depend on the attack and defence ratings of each team. More generally, we can define two independent Poisson processes N1 and N2 for the score of each team.
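As an illustration of the independent Poisson model described above, the following minimal sketch computes win/draw/lose probabilities from two expected goal counts. The function names and the numerical means (1.5 and 1.1) are hypothetical and chosen only for the example; they are not estimates from the paper.

```python
import math

def poisson_pmf(k, mean):
    """Probability that a Poisson variable with the given mean equals k."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def outcome_probabilities(mean1, mean2, max_goals=25):
    """Win/draw/lose probabilities for Team 1 under two independent Poisson scores.

    The double sum is truncated at max_goals per team, which is harmless
    for realistic soccer means (assumption of this sketch).
    """
    win = draw = lose = 0.0
    for g1 in range(max_goals + 1):
        for g2 in range(max_goals + 1):
            p = poisson_pmf(g1, mean1) * poisson_pmf(g2, mean2)
            if g1 > g2:
                win += p
            elif g1 == g2:
                draw += p
            else:
                lose += p
    return win, draw, lose

# Hypothetical expected goals for Team 1 and Team 2.
win, draw, lose = outcome_probabilities(1.5, 1.1)
print(f"win={win:.4f} draw={draw:.4f} lose={lose:.4f}")
```

In this constant-intensity setting the three probabilities depend only on the two means, which is why the model cannot reflect any reaction of the teams to the current score.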
To account for team strategies as a specific match evolves, Lee (1997) and later Dixon and Robinson (1998) introduced parameters that vary during the match depending on the current score. For example, a team leading 1-0 may choose to be more defensive. Conversely, a losing team may adopt a more attacking strategy. They then estimate the different abilities for each type of score. This approach requires data on goal times, as the estimation is done by maximizing the likelihood of the goal-time laws.
Empirical work by Vecer et al. (2006) and game-theoretical work by Palomino et al. (2000) have shown with a simple Poisson model that teams play with more attacking strength when they are losing and, in a less statistically significant way, play more defensively when they are winning. Indeed, we know that the constant Poisson model underestimates the draw probability and does not reflect the time variation of the parameters, which is sometimes important. It seems that market intensities are in most cases draw-reverting, in the sense that they evolve so as to bring the score closer to a draw. Dixon incorporated these features, as well as time-dependency of the intensities, into the Maher model. We introduce here a general setup for soccer goal processes that is reminiscent of the models used for credit event pricing.
Our aim being implicit calibration, we need a model that allows us to find closed-form formulas for the final score probabilities. From the prices of basic bets, we can then infer the values of the parameters for this specific match. Then, by closed form or by Monte Carlo simulation, we are able to give the prices of more sophisticated bets such as goal times or win/draw/lose (w/d/l). Ironically, in these models the bets on the outcome w/d/l are highly exotic and no closed-form formula is available.
where we suppose the state vector $(t, N^1, N^2, \lambda_1, \lambda_2)$ to be Markovian. This means that in each time interval $[t, t + dt]$, conditionally on $\mathcal{G}_t$, the process $N^i$ has a jump of size 1 with probability $\lambda_i(t)\,dt$, no jump occurs with probability $1 - (\lambda_1(t) + \lambda_2(t))\,dt$, and the two processes never jump simultaneously.
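The jump description above can be simulated directly. The sketch below assumes, for simplicity, that the intensities depend on the current score only (not on time and without a stochastic component), so that the waiting time to the next goal is exponential with rate $\lambda_1 + \lambda_2$ and the scorer is Team 1 with probability $\lambda_1 / (\lambda_1 + \lambda_2)$. This is equivalent to the small-interval description in the limit $dt \to 0$. The function names and all intensity values are hypothetical.

```python
import math
import random

def three_level_intensity(n1, n2):
    """Hypothetical score-dependent intensities (goals per match) for Team 1 and Team 2."""
    if n1 == n2:
        return 1.5, 1.1   # score is level
    if n1 > n2:
        return 1.2, 1.4   # Team 1 leads
    return 1.8, 0.9       # Team 1 is behind

def simulate_match(intensity, T=1.0, rng=None):
    """Simulate one match on [0, T]; returns the final score and the list of (time, team)."""
    rng = rng or random.Random()
    t, score, goals = 0.0, [0, 0], []
    while True:
        lam1, lam2 = intensity(score[0], score[1])
        total = lam1 + lam2
        if total <= 0.0:
            break
        # Waiting time to the next goal of either team.
        t += rng.expovariate(total)
        if t >= T:
            break
        # The goal is attributed to Team 1 with probability lam1 / total.
        team = 0 if rng.random() < lam1 / total else 1
        score[team] += 1
        goals.append((t, team + 1))
    return (score[0], score[1]), goals

# Monte Carlo estimate of the probability of a goalless match,
# compared with exp(-(1.5 + 1.1)) from the level-score intensities.
rng = random.Random(2024)
n_paths = 20000
no_goal = sum(simulate_match(three_level_intensity, 1.0, rng)[0] == (0, 0)
              for _ in range(n_paths))
print(no_goal / n_paths, math.exp(-2.6))
```

Because the whole path of goal times is returned, the same routine can be reused to estimate prices of path-dependent bets such as the time of the first goal.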
The SDE for the $\lambda_i$'s clearly implies that intensities depend on time and score as in the Lee-Dixon-Robinson models; in addition, we add stochasticity to the inter-goal intensity processes through the Brownian motion $W$. We can imagine another generalization where, instead of modelling only the scores, we model all events which are of interest for bet pricing, e.g. the times of cards, corners or shots. Though this full model would require more counting processes and more detailed data, and would become highly intractable, it deserves further investigation. For example, Fitt et al. (2006) used simple Poisson processes for corners and goals and then performed an implicit calibration to the spread bet market.
2. Pricing and implicit calibration in a deterministic model
2.1 The deterministic time-proportional model
We will use the following assumptions (Hypothesis A):
The Markovian state vector is now $(t, N^1_t, N^2_t)$. Let us denote by $M^1_t$ and $M^2_t$ the martingales associated with the Poisson processes under the market probability, which are defined by:
$$dM^i_t = dN^i_t - \lambda_i(t, N^1_t, N^2_t)\,dt, \qquad M^i_0 = 0$$
Let $P_t^{T,n_1,n_2}(A)$ denote the price at time $t$ of some bet $A$ if the score is $(n_1, n_2)$ at time $t$. By Itô's lemma, we get for the price dynamics:
$$dP(t, N^1_t, N^2_t) = \left[\frac{\partial P}{\partial t} + \Delta_1 P\,\lambda_1 + \Delta_2 P\,\lambda_2\right] dt + \Delta_1 P\,dM^1_t + \Delta_2 P\,dM^2_t$$
where $\Delta_1 P(t, n_1, n_2) = P(t, n_1 + 1, n_2) - P(t, n_1, n_2)$ and $\Delta_2 P(t, n_1, n_2) = P(t, n_1, n_2 + 1) - P(t, n_1, n_2)$.
Hence, under the risk-neutral probability, prices are martingales and the price dynamics between two goals are given by the differential equation:
$$\frac{\partial P}{\partial t} + \Delta_1 P\,\lambda_1(t, N^1_t, N^2_t) + \Delta_2 P\,\lambda_2(t, N^1_t, N^2_t) = 0$$
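Solving this equation gives the recursive (backward) pricing equation:
$$P_t^{T,j,k} = P_T^{T,j,k}\, e^{-\int_t^T \lambda_s^{j,k} ds} + \int_t^T \left[\lambda_1(u,j,k)\,P_1(u) + \lambda_2(u,j,k)\,P_2(u)\right] e^{-\int_t^u \lambda_s^{j,k} ds}\, du$$
where $\lambda_s^{j,k} = \lambda_1(s, j, k) + \lambda_2(s, j, k)$ is the total intensity and $P_i(u)$ is the value at time $u$ of the same bet just after a goal of Team $i$.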
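The backward equation can also be integrated numerically, which gives a useful check on closed-form prices. The sketch below prices a correct score bet by stepping the differential equation backward from the terminal payoff with an explicit Euler scheme. It assumes intensities that depend on the score only; the function name, the step count and the intensity values are hypothetical.

```python
import math

def correct_score_price(target, intensity, T=1.0, steps=2000):
    """Price at time 0 and score (0, 0) of the bet paying 1 if the final score is `target`.

    `intensity(j, k)` returns (lambda_1, lambda_2) at score (j, k).
    Scores beyond the target can never come back to it, so their price is 0.
    """
    a, b = target
    # Terminal payoff on the grid of scores 0..a+1 by 0..b+1.
    price = [[1.0 if (j, k) == (a, b) else 0.0 for k in range(b + 2)]
             for j in range(a + 2)]
    dt = T / steps
    for _ in range(steps):
        new = [row[:] for row in price]
        for j in range(a + 1):
            for k in range(b + 1):
                lam1, lam2 = intensity(j, k)
                # dP/dt = -(lam1 * Delta_1 P + lam2 * Delta_2 P), stepped backward in time.
                drift = (lam1 * (price[j + 1][k] - price[j][k])
                         + lam2 * (price[j][k + 1] - price[j][k]))
                new[j][k] = price[j][k] + drift * dt
        price = new
    return price[0][0]

# With constant intensities the result can be compared with the Poisson value.
flat = lambda j, k: (1.5, 1.1)
print(correct_score_price((0, 0), flat), math.exp(-2.6))
print(correct_score_price((1, 1), flat), 1.5 * 1.1 * math.exp(-2.6))
```

The Euler scheme is only first-order accurate in the time step; it is used here because it mirrors the jump probabilities $\lambda_i\,dt$ of the model most directly.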
Theorem 1. Under hypothesis A, the price of any bet depending only on the final scores has the following form:
$$P_t^{T,n_1,n_2} = \sum_{j \ge n_1;\, k \ge n_2} c(j,k,n_1,n_2)\, e^{-\int_t^T \lambda_s^{j,k} ds}$$
Sketch of the proof: we reduce the problem to correct score pricing and using the recursive formula, we
proceed by induction on goals needed to settle the bet.
For example, the probability of the score ending in (0 : 0) is:
$$P_t^{T,0,0}(0:0) = e^{-\int_t^T \lambda_s^{0,0} ds} = e^{-\int_t^T \left[\lambda_1(s,0,0) + \lambda_2(s,0,0)\right] ds}$$
and the one for the score ending in (1 : 0) if the score is still (0 : 0):
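$$P_t^{T,0,0}(1:0) = \int_t^T \lambda_1(u,0,0)\, e^{-\int_t^u \lambda_s^{0,0} ds}\, e^{-\int_u^T \lambda_s^{1,0} ds}\, du$$
With three constant intensity levels indexed by d, w and l (Team 1 drawing, winning or losing), Team $i$ has intensity $\lambda_{(i,d)}$, $\lambda_{(i,w)}$ or $\lambda_{(i,l)}$, and the total intensities are $\lambda_d = \lambda_{(1,d)} + \lambda_{(2,d)}$, $\lambda_w = \lambda_{(1,w)} + \lambda_{(2,w)}$ and $\lambda_l = \lambda_{(1,l)} + \lambda_{(2,l)}$. We then get, for instance:
$$P_t^{T,0,0}(0:2) = \lambda_{(2,d)}\,\lambda_{(2,l)}\,(T-t)^2\,\varphi_2\big((\lambda_l - \lambda_d)(T-t)\big)\, e^{-\lambda_l (T-t)}$$
$$P_t^{T,0,0}(1:1) = \Big[\lambda_{(1,d)}\lambda_{(2,w)}\,\varphi_2\big((\lambda_d - \lambda_w)(T-t)\big) + \lambda_{(2,d)}\lambda_{(1,l)}\,\varphi_2\big((\lambda_d - \lambda_l)(T-t)\big)\Big]\,(T-t)^2\, e^{-\lambda_d (T-t)}$$
where
$$\varphi_n(x) = \frac{1}{x^n}\left(e^x - 1 - x - \frac{x^2}{2!} - \dots - \frac{x^{n-1}}{(n-1)!}\right), \qquad \varphi_n(0) = \frac{1}{n!}$$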
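The closed forms for a three-level intensity surface can be evaluated as follows. This is a sketch under stated assumptions: the levels are labelled d, w, l from the point of view of Team 1, intensities are constant within a level, and $\varphi_n$ is computed from its power series $\sum_{k \ge 0} x^k / (n+k)!$ to avoid the division by $x^n$ near zero. The dictionary layout and function names are hypothetical.

```python
import math

def phi(n, x, terms=40):
    """phi_n(x) = (e^x - sum_{k<n} x^k / k!) / x^n, via its power series.

    The truncated series is accurate for moderate |x| (assumption of this sketch).
    """
    return sum(x**k / math.factorial(n + k) for k in range(terms))

def prob_one_one(levels, tau):
    """Probability of a final score 1:1 starting from 0:0 with time tau to play.

    `levels` maps 'd', 'w', 'l' to (lambda_team1, lambda_team2).
    """
    (l1d, l2d), (l1w, l2w), (l1l, l2l) = levels['d'], levels['w'], levels['l']
    lam_d, lam_w, lam_l = l1d + l2d, l1w + l2w, l1l + l2l
    bracket = (l1d * l2w * phi(2, (lam_d - lam_w) * tau)
               + l2d * l1l * phi(2, (lam_d - lam_l) * tau))
    return bracket * tau**2 * math.exp(-lam_d * tau)

def prob_zero_two(levels, tau):
    """Probability of a final score 0:2 starting from 0:0 (level l applies at 0:1 and 0:2)."""
    (l1d, l2d), (l1l, l2l) = levels['d'], levels['l']
    lam_d, lam_l = l1d + l2d, l1l + l2l
    return l2d * l2l * tau**2 * phi(2, (lam_l - lam_d) * tau) * math.exp(-lam_l * tau)

# Hypothetical intensities: the leading team defends, the trailing team attacks.
levels = {'d': (1.5, 1.1), 'w': (1.2, 1.4), 'l': (1.8, 0.9)}
print(prob_one_one(levels, 1.0), prob_zero_two(levels, 1.0))
```

When the three levels coincide, both functions reduce to the product of Poisson probabilities, which provides a simple consistency check.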
Nevertheless, these formulas cannot be inverted in closed form, so we use least-squares fitting, with more prices than parameters in order to get a better fit of the model.
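A least-squares calibration of this kind can be sketched as follows. To keep the example short, it fits the two means of a constant Poisson model (a stand-in for the score-dependent intensities) to a set of correct score prices, and it uses a basic derivative-free compass search rather than any particular optimizer. The "market" prices are synthetic, generated from hypothetical means 1.5 and 1.1, and all names are illustrative.

```python
import math

SCORES = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (0, 2), (2, 1), (1, 2)]

def poisson_score_price(j, k, m1, m2):
    """Correct score probability under two independent Poisson scores."""
    p1 = math.exp(-m1) * m1**j / math.factorial(j)
    p2 = math.exp(-m2) * m2**k / math.factorial(k)
    return p1 * p2

def squared_error(params, market):
    """Sum of squared differences between model and market prices."""
    m1, m2 = params
    return sum((poisson_score_price(j, k, m1, m2) - market[(j, k)])**2
               for (j, k) in SCORES)

def compass_search(objective, start, step=0.5, tol=1e-7, max_iter=10000):
    """Minimize `objective` by trying +/- step on each coordinate, halving step on failure."""
    x, best = list(start), objective(start)
    iterations = 0
    while step > tol and iterations < max_iter:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = x[:]
                trial[i] += delta
                if trial[i] <= 0.0:
                    continue  # intensities must stay positive
                value = objective(trial)
                if value < best:
                    x, best, improved = trial, value, True
        if not improved:
            step /= 2.0
        iterations += 1
    return x, best

# Synthetic market prices (no bid-ask spread, no noise).
market = {s: poisson_score_price(s[0], s[1], 1.5, 1.1) for s in SCORES}
fitted, error = compass_search(lambda p: squared_error(p, market), [1.0, 1.0])
print(fitted, error)
```

With real quotes the residual would not vanish, and one may prefer to weight each squared difference by the bid-ask spread of the corresponding bet.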
2.3 Example of static implicit calibration and pricing on a specific match
We perform the implicit calibration on a set of prices from a particular match. These prices were taken before the match. In table 1, we report the fitted intensities and observe the general pattern of intensities, i.e. the team that leads generally plays more defensively and the team that is behind plays more offensively. This pattern confirms the results found by Palomino et al. (2000) in their analysis based on game theory and optimal strategy for soccer teams.
In tables 2 and 3, we present the results of the pricing by simulation/closed forms.
Error: 0.85, 1.07, 1.1, 0.54, 2.41, 2.65, 1.57, 3.3, 2.07, 0.91, 1.76, 1.09, 1.41, 1.37, 1.53, 0.99, 4.58, 2.85, 5.82, 4.42, 4.19, 1.99, 1.13
The -error and %-error we quoted in the results are the difference between model price and mid-quote
price measured in half bid-ask spread and percentage of mid-price respectively.
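The two error measures can be computed as in the short sketch below, using the bid, ask and model price of a spread bet. The function name is hypothetical; the numbers are those of the first match goal bet in table 3.

```python
def spread_errors(bid, ask, model):
    """Return (error in half bid-ask spreads, error in percent of the mid price)."""
    mid = (bid + ask) / 2.0
    half_spread = (ask - bid) / 2.0
    diff = model - mid
    return diff / half_spread, 100.0 * diff / mid

# First match goal: bid 40, ask 43, model price 37.11.
in_half_spreads, in_percent = spread_errors(40, 43, 37.11)
print(round(in_half_spreads, 2), round(in_percent, 2))
```

An absolute error below 1 in the first measure means that the model price lies inside the bid-ask interval.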
Table 3. Result of pricing spread bets in the basic model.
Betname            Bid    Ask    Model    Confidence interval   -error   %-error   Error
Winningmargin      0      0.2    0.05     0.04:0.06             -0.5     -50       5
Totalgoals         2.2    2.4    2.31     2.28:2.33             0.08     0.35      0.17
Firstmatchgoal     40     43     37.11    36.96:37.26           -2.93    -10.58    5.56
FirstTeam1goal     54     57     53.86    53.69:54.03           -1.09    -2.95     1.8
FirstTeam2goal     56     59     55.61    55.44:55.77           -1.26    -3.29     2.04
2ndMatchgoal       64     67     63.35    63.21:63.49           -1.43    -3.28     2.17
2ndTeam1goal       78     81     79.35    79.24:79.45           -0.1     -0.19     0.14
2ndTeam2goal       80     82     79.98    79.88:80.09           -1.02    -1.26     1.13
3rdMatchgoal       78     81     78.6     78.5:78.7             -0.6     -1.13     0.82
3rdTeam1goal       87     89     87.62    87.57:87.67           -0.38    -0.43     0.41
3rdTeam2goal       87     89     87.68    87.63:87.73           -0.32    -0.36     0.34
Lastmatchgoal      58     61     58.31    58.16:58.47           -0.79    -2        1.26
Winninggoal        33     36     34       33.8:34.1             -0.33    -1.45     0.7
Totalgoalminutes   110    120    111.4    110.9:111.8           -0.72    -3.13     1.5
TGM supremacy      -2     12     2.06     1.66:2.46             -0.42    -58.8     4.97
TGM Team1          58     63     56.7     56.4:57               -1.52    -6.28     3.09
TGM Team2          53     58     54.6     54.3:54.9             -0.36    -1.62     0.76
We remark that the fit is good for almost all bets except for high scores. Note, however, that we truncated the intensity surface to one goal difference and thus used only three levels. From Dixon and Robinson (1998), we know that a better but still parsimonious fit can be achieved by taking five levels for the intensity surface. This can be compared to volatility surfaces in traditional finance and option pricing, as in the Black-Scholes model, where deep out-of-the-money or in-the-money options generally have a different implied volatility than at-the-money options.
For the spread bets, the fit is good as well, even if some prices are underestimated by the model. The larger spread in this market results in a lower error because of the measurement unit. Nevertheless, the fit is much better than the one a simple Poisson model would produce. The worst fit is for the time of the first goal. This can be investigated further. One reason could be that intensities are not only score-dependent but also time-dependent. We know from empirical work that the intensity generally increases slightly during each half. This clearly implies that the time of the first goal would be greater than the one predicted by time-independent intensities with the same mean over the match. Indeed, the findings of Dixon and Robinson (1998) show that intensities are better fitted with a time trend. This implies that, even if starting prices are well fitted, the time trend is necessary to get prices during the match close to market prices.
2.4 Extension to affine inter-goals intensities
From Theorem 1, we know that the prices of classical bets depend only on the integral of the intensities, so we can easily insert a time trend into the model as long as we keep the same average intensity.
As goal times are undervalued, we add an increasing affine component by taking the time factor $a t + (1 - aT/2)$, whose mean over $[0, T]$ equals 1, so that intensities have the same mean over the match.
In this affine case, we have a closed-form formula for the First Goal Time price. Fitting the value of $a$ to the First Goal Time price, we found $aT \approx 0.7344$. Hence, relative to their mean, intensities grow from $(1 - aT/2)$ to $(1 + aT/2)$, i.e. from 0.63 when the match starts to 1.37 at the end of the match.
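The effect of the affine trend can be checked numerically. The sketch below assumes that the trend multiplies a constant intensity and that time is measured in units of the match duration ($T = 1$); both are assumptions of the example, as are the function names and the total intensity of 2.6.

```python
import math

def time_factor(t, a, T):
    """Affine factor a*t + (1 - a*T/2), with mean 1 over [0, T]."""
    return a * t + (1.0 - a * T / 2.0)

def integrated_factor(t, a, T):
    """Integral of the affine factor over [0, t]."""
    return a * t**2 / 2.0 + (1.0 - a * T / 2.0) * t

def no_goal_probability(t, total_intensity, a, T):
    """Probability of no goal before t when the total intensity is total_intensity * factor."""
    return math.exp(-total_intensity * integrated_factor(t, a, T))

T = 1.0
a = 0.7344 / T   # fitted value aT = 0.7344 reported in the text
print(round(time_factor(0.0, a, T), 2), round(time_factor(T, a, T), 2))

# Same probability of a goalless match with or without trend,
# but a later first goal with the trend (higher survival at half time).
print(no_goal_probability(T, 2.6, a, T), no_goal_probability(T, 2.6, 0.0, T))
print(no_goal_probability(T / 2, 2.6, a, T), no_goal_probability(T / 2, 2.6, 0.0, T))
```

This illustrates why final-score prices are unchanged by the trend while goal-time prices move upward.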
Within this affine model, the spread bets calibration is much better as we can see in table 4.
Table 4. Result of pricing spread bets in the affine model.
Betname            Bid    Ask    Model with trend   -Error   %-Error
Firstmatchgoal     40     43     41.5               0        0
FirstTeam1goal     54     57     57.11              1.07     2.9
FirstTeam2goal     56     59     58.66              0.77     2.02
2ndMatchgoal       64     67     66.64              0.76     1.74
2ndTeam1goal       78     81     80.80              0.87     1.64
2ndTeam2goal       80     82     81.29              0.29     0.36
3rdMatchgoal       78     81     80.20              0.47     0.88
3rdTeam1goal       87     89     88.00              0        0
3rdTeam2goal       87     89     88.06              0.06     0.07
Lastmatchgoal      58     61     61.33              1.22     3.08
Winninggoal        33     36     37.02              1.68     7.3
Totalgoalminutes   110    120    121.53             1.31     5.68
TGMsupremacy       -2     12     2.59               -0.34    -48.2
TGM Team1          58     63     62.06              0.62     2.58
TGM Team2          53     58     59.47              1.59     7.15
where (t) 0 is a deterministic function and xt is a CIR process with x0 > 0 and dynamics given by:
+ + ( )e
+ + ( )e
and = 2 2 + 2 .
Hence, we recover closed-forms for the prices of classical bets as they are linear combinations of Laplace
transforms of the integral of x.
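For reference, the Laplace transform of the integral of a CIR process can be evaluated as in the sketch below. It uses the textbook parameterisation $dx_t = \kappa(\theta - x_t)\,dt + \sigma\sqrt{x_t}\,dW_t$ and the standard closed form $E[e^{-\mu \int_0^\tau x_s ds}] = A(\tau)\,e^{-B(\tau)\,x_0}$; the parameter names $\kappa$, $\theta$, $\sigma$ and the split into $A$ and $B$ are assumptions of this example and may differ from the notation and parameterisation used in the paper.

```python
import math

def cir_integral_laplace(mu, tau, x0, kappa, theta, sigma):
    """E[exp(-mu * integral_0^tau x_s ds) | x_0 = x0] for a CIR process.

    Assumed dynamics: dx = kappa * (theta - x) dt + sigma * sqrt(x) dW,
    with mu >= 0, kappa > 0, theta > 0 and sigma > 0.
    """
    if mu == 0.0:
        return 1.0
    gamma = math.sqrt(kappa**2 + 2.0 * sigma**2 * mu)
    growth = math.exp(gamma * tau)
    denom = (gamma + kappa) * (growth - 1.0) + 2.0 * gamma
    b = 2.0 * mu * (growth - 1.0) / denom
    base = 2.0 * gamma * math.exp((kappa + gamma) * tau / 2.0) / denom
    a = base ** (2.0 * kappa * theta / sigma**2)
    return a * math.exp(-b * x0)

def deterministic_limit(mu, tau, x0, kappa, theta):
    """Same quantity when sigma = 0, i.e. x_s = theta + (x0 - theta) * exp(-kappa * s)."""
    integral = theta * tau + (x0 - theta) * (1.0 - math.exp(-kappa * tau)) / kappa
    return math.exp(-mu * integral)

# Hypothetical parameters; mu plays the role of a total goal intensity level.
print(cir_integral_laplace(1.5, 1.0, 0.5, 2.0, 1.0, 0.5))
print(deterministic_limit(1.5, 1.0, 0.5, 2.0, 1.0))
```

For a small volatility the closed form approaches the deterministic value, and by Jensen's inequality it is never below it, which gives two simple sanity checks.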
For example, the probability that no further goal is scored is:
(T,n ,n ,x)
Pt 1 2 (n1 : n2 ) = em( ,T t)n( ,T t).x
(n ,n )
RT
t (s)ds
(n ,n )
with = 1 1 2 + 2 1 2 .
Finally, we can perform an implicit calibration to recover the parameters of the model for any match, but we need more closed-form prices because of the additional parameters. We remark that a process other than CIR may be used if the Laplace transform of its integral is known. In future work, we will present the results obtained in this stochastic intensity model.
4. Conclusion
We extended the model of Dixon and Robinson and found closed-form formulas for soccer betting products which fit the price structure better than the Poisson model with constant intensities. Implicit calibration gives surprisingly good results given the simplicity of the model. Nevertheless, with a more detailed price structure this basic model might show its limits, so we introduced a stochastic inter-goal intensity model for which a calibration procedure is described as well.
References
Cox J.C., Ingersoll J.E. and Ross S.A. (1985) A theory of the term structure of interest rates. Econometrica 53, 385–408.
Dixon M.J. and Robinson M.E. (1998) A birth process model for association football matches. The Statistician 47, 523–538.
Fitt A., Howls C. and Kabelka M. (2006) The valuation of soccer spread bets. J. Oper. Res. Soc. 57, 975–985.
Lamberton D. and Lapeyre B. (1995) An introduction to stochastic calculus applied to finance, Chapman and Hall.
Lee A. (1997) Modeling scores in the Premier League: Is Manchester United really the best? Chance, 15–19.
Maher M. (1982) Modelling association football scores. Statistica Neerlandica 36, 109–118.
Palomino F., Rigotti L. and Rustichini A. (2000) Skill, strategy, and passion: an empirical analysis of soccer, Econometric Society World Congress 2000 Contributed Papers 1822, Econometric Society.
Vecer J., Ichiba T. and Laudanovic M. (2006) Parallels between betting contracts and credit derivatives: Lessons learned from FIFA World Cup 2006 betting markets, Technical report, Department of Statistics, Columbia University.