
Simulating matches and tournaments
Lecture 6
Wisdom of Crowds
Event Class Prediction (outcome)
Tampa – Kansas (Superbowl) 0.38
Liverpool – Man City 0.48
Warriors – Nets 0.53
Serena wins Australian open (yes/no) 0.64-0.36
Course Outline
Lecture # Lecture Title
1 ✔️ Introduction to Sports Analytics
Module 1: Measuring Team Strength and Predicting Outcomes
2 Normal models of score differentials in the NFL (I)
3 Normal models of score differentials in the NFL (II)
4 Logistic models and the Elo system
5 Poisson models for low-scoring sports
6 Simulating matches and tournaments
Module 2: Situational analysis
7 Valuing states (I. Markov Chains)
8 Valuing states (II. Applications to baseball)
9 Working with PITCHf/x data: Clustering pitch types
10 When should you go for it on fourth down? (I)
11 When should you go for it on fourth down? (II)
12 Guest Lecture
13 xG: Measuring chance quality in soccer (I)
14 xG: Measuring chance quality in soccer (II)
Module 3: Player evaluation
15 The +/- score (I. Basic and adjusted +/-)
16 The +/- score (II. Regularized adjusted +/-)
17 Streaks, momentum and the hot hand (I)
18 Streaks, momentum and the hot hand (II)
Module 4: Tracking data
19 Introduction to working with tracking data
20 Guest Lecture: William Spearman (Liverpool FC)
Final Project
21-23 <PROJECT WORK IN CLASS>
24 Project presentations

Sections
Christy: Monday, 9am-10am & 9pm-10pm
Andrew: Tuesday & Thursday, 7pm-8pm

Office hours
Christy: Tuesdays, 10am to 12pm
Andrew: Tuesday & Thursday, 8pm to 9pm
Laurie: Thursdays, 1pm to 3pm

1st Assignment due TODAY
Simulating tournaments
• Who’s going to win:
  • The Super Bowl?
  • Stanley Cup?
  • Premier League?
• Who’s going to qualify for playoffs?
• Who’s going to get relegated?
Monte Carlo simulation
Evaluate complex distributions by random sampling.

Most simulations follow something like the following process:


1. Define range and pdf of input variables
2. Generate random samples of inputs by drawing from pdfs
3. Perform calculations on these variables
4. Aggregate results and explore distributions.
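
The four steps above can be sketched with a toy problem. This is a minimal illustration, not part of the lecture code: it estimates pi by sampling points in the unit square, and the function name `estimate_pi` and the fixed seed are assumptions made for the example.

```python
import math
import random


def estimate_pi(n_samples: int, seed: int = 0) -> float:
    """Estimate pi by sampling points uniformly in the unit square."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    inside = 0
    for _ in range(n_samples):
        # Steps 1-2: inputs x, y are uniform on [0, 1); draw one sample of each.
        x, y = rng.random(), rng.random()
        # Step 3: the calculation is "does the point fall inside the quarter circle?"
        if x * x + y * y <= 1.0:
            inside += 1
    # Step 4: aggregate. The hit fraction estimates pi / 4.
    return 4.0 * inside / n_samples


pi_hat = estimate_pi(100_000)
print(f"estimate = {pi_hat:.4f}, error = {abs(pi_hat - math.pi):.4f}")
```

The error shrinks roughly as 1 / sqrt(N), which is why tournament simulations are typically repeated thousands of times.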

Credit: nicoguaro (wikipedia)

Simulating Tournaments
Want to evaluate the probability of season outcomes
• Distribution of positions in a table
• Outcome of knock-out tournament or play-offs

Components of a simulation
1. Model for predicting match outcomes
   Home/away parameters & covariates, exogenous parameters, generating distribution
2. Fixture schedule
3. Simulator
   Draw outcomes for all matches in schedule
   Evaluate tournament outcome (table, winner, etc.)
   Repeat N times
4. Evaluate distributions of outcomes
Example: 2015/16 Premier League Season
Simulate 2015/16 Premier League season
• Probabilities of finishing: top, top 4, bottom 3

Components:
1. Team strength model (calibrate from 2014/15 season)
2. Fixture schedule (easy for league formats; a bit trickier for tournaments)
3. Simulation engine
4. Analysis of results
1. Team Strength Model
Poisson model

Home Goals ~ Poisson($\lambda_h$)    Away Goals ~ Poisson($\lambda_a$)

Scoring rates depend on the difference in team strengths:

Home team scoring rate: $\log \lambda_h = \theta_h - \theta_a + \beta + \alpha$
Away team scoring rate: $\log \lambda_a = \theta_a - \theta_h - \beta + \alpha$

• $\theta_h$, $\theta_a$: home & away team strength parameters
• $\beta$: home advantage parameter (+ve for home team, -ve for away)
• $\alpha$: intercept
To Code
-> EPL_team_strength.py
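
As a worked example of the two rate equations, the sketch below plugs in numbers quoted elsewhere in the lecture: Man City (theta = 0.44) at home to QPR (theta = -0.30), with intercept 0.19 and home advantage 0.15 taken from the parameter table. The function name `scoring_rates` is an assumption for illustration; this is not the contents of EPL_team_strength.py.

```python
import math

ALPHA = 0.19  # intercept, from the lecture's parameter table
BETA = 0.15   # home advantage, from the lecture's parameter table


def scoring_rates(theta_home, theta_away, alpha=ALPHA, beta=BETA):
    """Return (lambda_home, lambda_away) under the log-linear Poisson model."""
    lam_home = math.exp(theta_home - theta_away + beta + alpha)
    lam_away = math.exp(theta_away - theta_home - beta + alpha)
    return lam_home, lam_away


# Man City (theta = 0.44) at home to QPR (theta = -0.30)
lam_h, lam_a = scoring_rates(0.44, -0.30)
print(f"home rate = {lam_h:.2f}, away rate = {lam_a:.2f}")
```

Note that the strength difference and home advantage enter the two rates with opposite signs, so their product depends only on the intercept.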
1. Team Strength Model
Team Theta Lambda
Man City 0.44 1.87
Chelsea 0.40 1.80
Arsenal 0.34 1.70
Man United 0.25 1.55
Southampton 0.21 1.49
Tottenham 0.05 1.27
Liverpool 0.04 1.26
Stoke 0.03 1.24
Everton -0.02 1.18
Swansea -0.03 1.17
West Ham -0.03 1.17
Crystal Palace -0.04 1.16
Leicester -0.09 1.10
West Brom -0.13 1.06
Hull -0.18 1.01
Sunderland -0.22 0.97
Newcastle -0.23 0.96
Burnley -0.25 0.94
Aston Villa -0.26 0.93
QPR -0.30 0.89
Lambda: expected # goals scored against the average team at a neutral ground.
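
The Lambda column can be approximately reproduced from Theta. Against an average team (theta = 0) at a neutral ground (no home advantage term) the rate reduces to exp(theta + intercept). The sketch below assumes the intercept of 0.19 quoted in the lecture's parameter table; because the published values are rounded, agreement is only expected to within about 0.01.

```python
import math

ALPHA = 0.19  # intercept reported in the lecture's parameter table (rounded)

thetas = {"Man City": 0.44, "Chelsea": 0.40, "Everton": -0.02, "QPR": -0.30}
table_lambda = {"Man City": 1.87, "Chelsea": 1.80, "Everton": 1.18, "QPR": 0.89}

for team, theta in thetas.items():
    # Neutral ground, average opponent: log(lambda) = theta + alpha
    lam = math.exp(theta + ALPHA)
    print(f"{team:10s} theta={theta:+.2f}  lambda={lam:.2f}  (slide: {table_lambda[team]:.2f})")
```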
1. Team Strength Model

One issue is that three teams were relegated in 2014/15

And three teams were promoted from the league below to play in the EPL in 2015/16:
• Bournemouth
• Watford
• Norwich
Team strengths?
1. Team Strength Model
Make a simple assumption:
The average strength of the promoted teams is roughly equivalent to the average strength of the relegated teams.

Team Theta Lambda
Man City 0.44 1.87
Chelsea 0.40 1.80
Arsenal 0.34 1.70
Man United 0.25 1.55
Southampton 0.21 1.49
Tottenham 0.05 1.27
Liverpool 0.04 1.26
Stoke 0.03 1.24
Everton -0.02 1.18
Swansea -0.03 1.17
West Ham -0.03 1.17
Crystal Palace -0.04 1.16
Leicester -0.09 1.10
West Brom -0.13 1.06
Hull -0.18 1.01
Sunderland -0.22 0.97
Newcastle -0.23 0.96
Burnley -0.25 0.94
Aston Villa -0.26 0.93
QPR -0.30 0.89
Bournemouth -0.24 0.95
Watford -0.24 0.95
Norwich -0.24 0.95

Relegated teams, previous season: average strength = -0.24
Promoted sides: each assigned the average strength of the relegated teams
2. Fixture Schedule
For a league, just simulate all the scheduled fixtures.

For a tournament, need to consider the structure:

• Is the scheduling of play-offs / knock-out rounds pre-generated or randomly drawn?
• Seeding?
• Wildcards?

Code for simulating the FIFA world cup: https://github.com/eightyfivepoints/World-Cup-Simulations
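
For a league in which every team plays every other team home and away, the fixture list is just every ordered pair of distinct teams. A minimal sketch, using an illustrative four-team subset rather than the full league:

```python
from itertools import permutations

teams = ["Arsenal", "Chelsea", "Leicester", "Man City"]  # illustrative subset

# Every ordered pair (home, away) of distinct teams is one fixture.
fixtures = list(permutations(teams, 2))

print(len(fixtures))   # n * (n - 1) fixtures
print(fixtures[:3])
```

With 20 teams the same construction gives 20 × 19 = 380 matches. The order of matches only matters if the simulation updates strengths as the season progresses.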


3. Simulation Engine
Recap of components: 1. model for predicting match outcomes; 2. fixture schedule; 3. simulator (draw outcomes for all matches, evaluate tournament outcome, repeat N times); 4. evaluate distributions of outcomes.

Core of the simulator is to draw results for each match:

Home Goals ~ Poisson($\lambda_h \mid \theta_h, \theta_a$)    Away Goals ~ Poisson($\lambda_a \mid \theta_h, \theta_a$)

Some additional choices:

• Run “hot”: update the $\theta$'s as you simulate the season
• Draw the $\theta$'s from a distribution given their sampling error
• Incorporate other covariates and exogenous parameters (travel, recovery time, match importance, etc.)
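
A minimal sketch of the simulator loop, under stated assumptions: a four-team mini-league with strengths, intercept and home advantage copied from the lecture tables, a hand-written Poisson sampler (`poisson_draw`, Knuth's method) so that only the standard library is needed, and ties on points broken arbitrarily. It is not the contents of Sim_EPL_Season.py.

```python
import math
import random
from itertools import permutations

ALPHA, BETA = 0.19, 0.15  # intercept and home advantage from the lecture
THETA = {"Man City": 0.44, "Chelsea": 0.40, "Leicester": -0.09, "QPR": -0.30}


def poisson_draw(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_season(theta, rng):
    """Play every home/away fixture once and return points per team."""
    points = {team: 0 for team in theta}
    for home, away in permutations(theta, 2):
        lam_h = math.exp(theta[home] - theta[away] + BETA + ALPHA)
        lam_a = math.exp(theta[away] - theta[home] - BETA + ALPHA)
        goals_h, goals_a = poisson_draw(lam_h, rng), poisson_draw(lam_a, rng)
        if goals_h > goals_a:
            points[home] += 3
        elif goals_h < goals_a:
            points[away] += 3
        else:
            points[home] += 1
            points[away] += 1
    return points


rng = random.Random(42)
n_sims = 2000
titles = {team: 0 for team in THETA}
for _ in range(n_sims):
    pts = simulate_season(THETA, rng)
    # Ties on points are broken by dictionary order here, a simplification;
    # a real league table would use goal difference and goals scored.
    titles[max(pts, key=pts.get)] += 1

for team, wins in titles.items():
    print(f"{team:10s} wins the mini-league in {wins / n_sims:.1%} of sims")
```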
4. Simulation Analysis
Analyze and interpret the results:
• Expected performance of each team
• Who will make play-offs? win tournament?
• Strength of schedule (expected points vs an average schedule)

Find visually interesting ways to communicate results effectively


To Code
-> Sim_EPL_Season.py
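
Once each simulated season has been reduced to a finishing position per team, the headline probabilities are simple frequencies. The sketch below uses made-up finishing positions for one team; the function name `summarise` and the data are hypothetical.

```python
from statistics import mean


def summarise(positions, n_teams=20):
    """Summarise one team's simulated finishing positions (1 = champion)."""
    n = len(positions)
    return {
        "expected_position": mean(positions),
        "p_top": sum(p == 1 for p in positions) / n,
        "p_top4": sum(p <= 4 for p in positions) / n,
        "p_bottom3": sum(p > n_teams - 3 for p in positions) / n,
    }


# Hypothetical finishing positions for one team across ten simulated seasons.
sim_positions = [1, 3, 2, 5, 1, 4, 7, 2, 18, 3]
summary = summarise(sim_positions)
print(summary)
```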
Results
• 12 teams predicted within 2 places of final position
• Rank correlation between expected and final positions = 0.77 ± 0.22

But at least two major outliers:
• Leicester won league in 0 of 10000 sims
• Chelsea finished 10th or lower in only 3 of 10000

(Chart legend: Low-High = 95% confidence interval)
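
One way to compute a rank correlation between expected and final positions is Spearman's rho; whether the lecture used Spearman or another rank statistic is not stated, so treat this as an assumption. The rankings below are a hypothetical five-team example, not the EPL results.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two rankings without ties."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical five-team league: predicted order vs final order.
predicted = [1, 2, 3, 4, 5]
final = [2, 1, 3, 5, 4]
rho = spearman_rho(predicted, final)
print(rho)
```

A value of 1 means the predicted order matched exactly and -1 means it was exactly reversed.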


Uncertainty
Purpose of Monte Carlo simulations is to explore pdf of outcomes.

To properly explore the distribution of outcomes, we must try to correctly explore the
distributions of simulation variables.

In this case, team strengths are not known exactly: each has an associated standard error.
Uncertainty
Team Theta Std. error
Man City 0.44 0.10
Chelsea 0.40 0.10
Arsenal 0.34 0.10
Man United 0.25 0.10
Southampton 0.21 0.10
Tottenham 0.05 0.10
Liverpool 0.04 0.10
Stoke 0.03 0.10
Everton -0.02 0.10
Swansea -0.03 0.10
West Ham -0.03 0.10
Crystal Palace -0.04 0.10
Leicester -0.09 0.10
West Brom -0.13 0.10
Hull -0.18 0.10
Sunderland -0.22 0.10
Newcastle -0.23 0.10
Burnley -0.25 0.10
Aston Villa -0.26 0.10
QPR -0.30 0.10
Intercept 0.19 0.03
Beta 0.15 0.03

Standard error on parameters:
• Team strength: 0.1
• Intercept & beta: 0.03

Each new simulation iteration i, draw a new strength parameter for team t from a distribution centred on its estimate, e.g.
$\theta_t^{(i)} \sim N(\hat{\theta}_t, \sigma_{\theta}^2)$, with $\sigma_{\theta} = 0.1$

and new intercept and home advantage parameters, e.g.
$\alpha^{(i)} \sim N(\hat{\alpha}, \sigma_{\alpha}^2)$, $\beta^{(i)} \sim N(\hat{\beta}, \sigma_{\beta}^2)$, with $\sigma_{\alpha} = \sigma_{\beta} = 0.03$
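
A minimal sketch of the per-iteration parameter draw, using the estimates and standard errors from the table for three teams. It assumes independent normal draws, which is a common choice but not something the slide confirms; the function name `draw_parameters` is hypothetical.

```python
import random

THETA_HAT = {"Man City": 0.44, "Leicester": -0.09, "QPR": -0.30}
SE_THETA = 0.10          # standard error on team strengths
ALPHA_HAT, BETA_HAT = 0.19, 0.15
SE_ALPHA_BETA = 0.03     # standard error on intercept and beta


def draw_parameters(rng):
    """Draw one set of parameters for a single simulation iteration."""
    # Assumption: each parameter is drawn independently from a normal
    # distribution centred on its estimate with its standard error.
    theta = {t: rng.gauss(mu, SE_THETA) for t, mu in THETA_HAT.items()}
    alpha = rng.gauss(ALPHA_HAT, SE_ALPHA_BETA)
    beta = rng.gauss(BETA_HAT, SE_ALPHA_BETA)
    return theta, alpha, beta


rng = random.Random(0)
draws = [draw_parameters(rng) for _ in range(5000)]
leicester = [theta["Leicester"] for theta, _, _ in draws]
mean_l = sum(leicester) / len(leicester)
print(f"Leicester theta: mean of draws = {mean_l:.3f}")
```

Each simulated season then uses its own drawn parameters, so a team such as Leicester is occasionally simulated as much stronger than its point estimate, which widens the spread of outcomes.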
Results
With parameter uncertainty included:
• 12 teams predicted within 2 places of final position
• Rank correlation between expected and final positions = 0.78 ± 0.22 (previously 0.77 ± 0.22)

But still at least two major outliers:
• Leicester won league in 5 of 10000 sims (previously 0)
• Chelsea finished 10th or lower in only 63 of 10000 (previously 3)

(Chart legend: Low-High = 95% confidence interval)


Vs Betting Odds
Assignment: Short Project 1
You are a ‘data journalist’ working for FiveThirtyEight.

It is the beginning of the regular season.

Nate Silver asks you to make projections for the season ahead:
- The probability of teams {winning the league; qualifying for play-offs; winning ‘cup’}
- Range of expected outcomes for each team
- Are they likely to improve (or not) relative to the previous season?

He asks for a draft of your piece on his (virtual) desk within three weeks.
Assignment: Short Project 1
Your job is to:

- Pick a sport & a season (e.g. 2017-18 NFL)


- Estimate team strengths from previous season(s) using the methods we have covered
- Simulate the season that you have chosen (as if you were at the beginning of it).
- Analyze the results (creating at least one chart and/or table)
- Write up your report in the style of a data journalist anticipating the outcome of the
season ahead. [no more than 3 pages, 12pt font, including figures & tables]

Project report + code is due on 3/10, midnight EST.


Assignment: Short Project 1
Group working:

- You may work in groups of up to three people (although you do not have to).
- You must submit your own report (not a joint report) detailing your own simulations
(preferably of a different season).
- Indicate at the top of the code where each person contributed
Data
Here is a list of some useful resources:
• https://fbref.com/en/ (soccer)
• https://www.football-data.co.uk/data.php (soccer)
• https://www.baseball-reference.com (baseball)
• https://www.basketball-reference.com (basketball)
• https://www.hockey-reference.com (hockey)
• https://www.pro-football-reference.com (NFL)

And my own data repo here (please do not share with people not on the course).