How do you forecast an election?

Nassim Nicholas Taleb

DRAFT - CANNOT BE CITED YET. I need to make the notations uniform across the two parts.

A Dynamic View of Forecasting
[Figure 1 plot: twelve panels of forecast paths, each over periods 20 to 100 on the horizontal axis with the forecast value on the vertical axis.]

◼ Figure 1: A collection of forecasters for the same variable {0,1} over 100 periods. The blue forecaster has little
uncertainty in his forecast. The most efficient forecaster is halfway, closer to the green line.

Note the following:
◼ The higher the uncertainty, that is, the more ignorance of the probability, the closer the probability in a two-outcome contest needs to be to 0.5.
◼ The higher the uncertainty in the system, the more slowly the forecast needs to update until the final result.

Yes, we have known for more than 200 years, since Laplace's argument, that uncertainty and ignorance make the odds remain close to 1/2.

This note is organized as follows. I discuss the option approach, then show how it corresponds to de Finetti's approach of minimizing the Brier score as a "proper" score.

[Figure 2 plot: forecast value (vertical axis, 0.5 to 1.0) over periods 20 to 100 up to ELECTION DAY; curves labeled "538" and "Rigorous updating".]
◼ Figure 2: The defect of 538. They responded to this criticism with …

Some mathematical derivations

Let us start the model from the very basics. Very, very basics of stochastic calculus.

We have the election estimate F, a function of a state variable W, a Wiener process WLOG. Assume W is a continuous state variable determining the final result. W can be an estimate, or some other variable. The estimation error can be integrated into the variance of W. If W is a "poll", we can transform it with $\phi^{-1} : (-\infty, \infty) \to [0, 1]$ to get it to translate.

W has simple dynamics (arithmetic Brownian motion; we can transform later):

$$ dW = \mu\, dt + \sigma\, dZ \tag{1} $$

By Ito's Lemma:

$$ dF = \frac{\partial F}{\partial t}\, dt + \frac{\partial F}{\partial W}\, dW + \frac{1}{2} \frac{\partial^2 F}{\partial W^2}\, dW^2 \tag{2} $$

Ito's calculus allows $dt^2$ and $dt\, dW$ to vanish.

Apply the Black-Scholes (or a standard no-arbitrage) argument. The idea of no arbitrage is that a continuously made forecast must itself be a martingale of sorts. Assume μ = 0 to simplify, WLOG:

$$ dF - \frac{\partial F}{\partial W}\, dW = 0 \tag{3} $$

Replacing with (2), and using $dW^2 = \sigma^2\, dt$, we end up with the partial differential equation:

$$ \frac{\partial F}{\partial t} = -\frac{1}{2} \sigma^2 \frac{\partial^2 F}{\partial W^2} \tag{4} $$

which is, basically, the heat equation.

We have for terminal conditions $F_{t \to 0} = \theta[W]$, where θ is the Heaviside Theta function.

We can try to solve on Mathematica (by fudging: inverting the backward-forward equation, so that t is the time remaining to the election):

Eq = D[F[W, t], t] ⩵ 1/2 σ^2 D[F[W, t], {W, 2}];
tc = F[W, 0] == HeavisideTheta[W];
sol = DSolve[{Eq, tc}, F[W, t], {W, t}]

F[W, t] → 1/2 (1 + Erf[W/(Sqrt[2] Sqrt[t] Abs[σ])])

which is the CDF of the Normal distribution with standard deviation σ√t evaluated at W, that is, the probability that W ends above 0. E finito!

Connection to De Finetti's Approach

What makes a good forecaster? As traders we know that the final outcome is just a piece of the pie. You need to consider the steps in the process, and tell when you can pronounce forecaster A better than forecaster B, for no matter the final outcome A will dominate B. In fact, at some point, you can tell a bad forecaster before the end event. In the real world, a forecaster who is also a market maker can go bankrupt before the final outcome. Every day's P/L matters.

The idea of a "proper score" is as follows: a method that penalizes you if your distribution of outcomes diverges from the "real" probability distribution.

The math is as follows. Let $b_{t_0}$ be your "price" ∈ [0,1] at time $t_0$, your "probability", and $b_{t_0+\Delta t}$ your price at time $t_0 + \Delta t$. Assume elections happen at time τ, $b_\tau \in \{0,1\}$ being the final result. This relates to the Brier metric, which would be $\| b_{t_0} - b_\tau \|_2^2$. Since your forecast is left hanging, you are evaluated on how little opportunity there is to arbitrage you, that is, to buy from you at $b_{t_0}$ and sell at $b_{t_0+\Delta t}$. Hence your quality of forecasting is some norm $\| b_{t_0} - b_{t_0+\Delta t} \|_2^2$. This also shows how it is worse to produce no change in forecast than to keep changing, and how to calibrate changes to volatility. Note the Brier metric uses the L2 norm (squared deviations) while your P/L is in the L1 norm (absolute deviations); the former is preferable because it is a "proper" score.
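To make the "proper score" remark concrete, here is a short worked check. It is a standard textbook calculation rather than something taken from the note, and it assumes (my notation, not the author's) that the outcome X is Bernoulli with true probability p and that the forecaster quotes a fixed b ∈ [0,1].

$$
\begin{aligned}
\mathbb{E}\left[(b - X)^2\right] &= p\,(1-b)^2 + (1-p)\,b^2, &
\frac{d}{db}\,\mathbb{E}\left[(b - X)^2\right] &= 2\,(b - p), \\
\mathbb{E}\left[\,|b - X|\,\right] &= p\,(1-b) + (1-p)\,b = p + (1 - 2p)\,b .
\end{aligned}
$$

Under these assumptions the expected squared deviation is minimized at b = p, so quoting the true probability is optimal. The expected absolute deviation is linear in b, so it is minimized at b = 0 when p < 1/2 and at b = 1 when p > 1/2, which rewards overstating certainty.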

In discrete form, the L1 and L2 measures of your successive forecast changes are:

$$ \mu(t_0, \tau, \Delta t) := \frac{1}{n} \sum_{i=0}^{\frac{\tau - t_0}{\Delta t}} \left| b_{(i+1)\Delta t + t_0} - b_{i \Delta t + t_0} \right| \tag{5} $$

$$ \mu_2(t_0, \tau, \Delta t) := \frac{1}{n} \sum_{i=0}^{\frac{\tau - t_0}{\Delta t}} \left( b_{(i+1)\Delta t + t_0} - b_{i \Delta t + t_0} \right)^2 \tag{6} $$

Of course a series of Brier scores

$$ \mu_B(t_0, \tau, \Delta t) := \frac{1}{n} \sum_{i=0}^{n} \left( b_{(i+1)\Delta t + t_0} - b_{n+1} \right)^2, \qquad n = \frac{\tau - t_0}{\Delta t} \tag{7} $$

The probabilist can see that as Δt → 0 we have a nonanticipating Ito integral for the L2 norm.

Next let us see how a dynamic forecaster using Ito's lemma minimizes the Brier score.

In[39]:= DiffBrier[vec_] := 1/Length[vec] Sum[(vec[[i]] - vec[[Length[vec]]])^2, {i, 2, Length[vec]}]

Brier[m_, σ_] := Table[
   ta = Join[{0}, RandomVariate[NormalDistribution[0, σ], 100]] // Flatten // Accumulate;
   ta1 = Table[CDF[NormalDistribution[0, Max[.0001, m σ Sqrt[Length[ta] - i]]], ta[[i]]], {i, 1, Length[ta]}];
   DiffBrier[ta1], {2 × 10^5}] // Mean

In[21]:= Table[Brier[1, σ], {σ, …}]

We can see the Brier is flat in σ, as both scale equally: the path W and the remaining standard deviation are both proportional to σ.

In[41]:= tt1 = Table[{m, Brier[m, 1]}, {m, .1, 1, .1}]

In[29]:= tt2 = Table[{m, Brier[m, 1]}, {m, 1, 5, 1}]

The scores in the recorded outputs (Out[54], Out[43]) lie between 0.15599 and 0.215846.

In[57]:= ListPlot[Join[tt1, tt2], PlotStyle → Red, AxesLabel → {σ, Score}]

Out[57]= [Plot: Score (vertical axis, ticks 0.16 to 0.21) against σ (horizontal axis, ticks 1 to 5).]
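As a rough illustration of the averages in (5) to (7), the sketch below computes them on a single simulated forecast path. The helper names (meanAbsChange, meanSqChange, brierSeries, simulatePath) are hypothetical and not from the notebook. The path construction copies the one used in Brier with m = 1, and each average divides by the number of terms via Mean rather than by n as written in the formulas.

```mathematica
(* Hypothetical helpers for illustration; none of these names are from the notebook. *)

(* Mean absolute change between successive forecasts, in the spirit of (5). *)
meanAbsChange[b_List] := Mean[Abs[Differences[b]]]

(* Mean squared change between successive forecasts, in the spirit of (6). *)
meanSqChange[b_List] := Mean[Differences[b]^2]

(* Mean squared distance of every forecast before the last one from the
   final value, in the spirit of (7). *)
brierSeries[b_List] := Mean[(Most[b] - Last[b])^2]

(* One forecast path: w is a driftless random walk with step volatility σ,
   and the forecast at step i is the Normal CDF with standard deviation
   σ Sqrt[steps left], floored at 0.0001 as in the Brier definition. *)
simulatePath[σ_, steps_] := Module[{w},
  w = Accumulate[Join[{0}, RandomVariate[NormalDistribution[0, σ], steps]]];
  Table[
   CDF[NormalDistribution[0, Max[.0001, σ Sqrt[Length[w] - i]]], w[[i]]],
   {i, Length[w]}]]

SeedRandom[1];  (* arbitrary seed, only for reproducibility *)
b = simulatePath[1, 100];
{meanAbsChange[b], meanSqChange[b], brierSeries[b]}
```

A single path is noisy. Averaging brierSeries over many paths, as Brier does with DiffBrier, is what makes the comparison across values of m meaningful.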