
EPL 2019-20 Prediction

Group 8

Aiyush Bahl (E008)


Anish Dalmia (E018)
Parshotam Jagwani (E028)
Rohit Lala (E038)
Ayuesh Raj (E048)
Khush Singhvi (E058)
Background
• 20 teams in the EPL
• 38 matches have been played by all teams
• Data collected from www.github.com using API in the FootbalIR package for 2018-19 season

Objectives
• To evaluate, using analytics, the difficulty of a match
• To understand and quantify the importance of fans and assess the investments made towards this
• To draw relevant insights from the data set and determine the significant factors that influence a match

Problem Statement
How important is home advantage in football?
Data Description Factor Selection Preliminary Analysis Prediction Model Further Analysis

Using Tableau, each factor has been plotted against the total points scored by each team to date.
Correlation has been used to determine the strongest predictor of match results.
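The correlation step can be illustrated with a short sketch. It is only an assumption about how such a comparison might be done: the original analysis used Tableau, the helper name `pearson` is made up here, and the five sample rows are taken from the predictions table later in this deck rather than from the data the authors actually plotted.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Sample rows: (team, home goals, away goals, total points).
# Illustrative input only, copied from the predictions table in this deck.
rows = [
    ("Liverpool", 51, 31, 97),
    ("Man City", 58, 36, 92),
    ("Tottenham", 34, 25, 74),
    ("Arsenal", 42, 29, 72),
    ("Wolves", 28, 19, 57),
]

home = [r[1] for r in rows]
away = [r[2] for r in rows]
points = [r[3] for r in rows]

# Correlate each factor with total points and pick the strongest one.
correlations = {
    "Home Goals": pearson(home, points),
    "Away Goals": pearson(away, points),
}
strongest = max(correlations, key=lambda name: abs(correlations[name]))
print(correlations, strongest)
```

With only five rows the coefficients are not meaningful on their own; the point is the comparison of factors against the same target.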

Analysing the distribution followed by goals scored

The factors: Home Goals and Away Goals
All goals scored by the teams have been plotted, and the distribution follows a Poisson distribution.

Article studied: Gao, J. (2017). Predicting Premier League Final Points and rank using Linear Modeling Techniques.

Assumptions
1) Both teams have equal chances of scoring
2) Chances of scoring are equally distributed over the 90 minutes of play

[Figure: Distribution of Goals. Histogram of frequency; x-axis: Goals Scored (0, 2, 4, 6, More); y-axis: No. of Matches (0 to 400).]
Fixture Difficulty Ratings

The purpose of these ratings is to calculate the number of goals any given team will most likely score or concede against any given opponent.

Their algorithm rates the attack and defence of all the teams based on previous results and in-game scoring sequences.

Predicted scorelines are produced on the basis that goals roughly follow a Poisson distribution.

The Poisson distribution gives the probability of each possible scoreline.

Expected goals is one form of data analysis that soccer teams use. Because a shot is the defining action of a goal, shot data is key to any expected goals model.
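As a rough illustration of how a Poisson distribution yields scoreline probabilities, the sketch below multiplies two Poisson probabilities, one per team. The scoring rates 1.6 and 1.1 are hypothetical, the function names are made up for this example, and treating home and away goals as independent is an added assumption rather than something stated in the deck.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return lam ** k * exp(-lam) / factorial(k)

def scoreline_probs(home_rate, away_rate, max_goals=10):
    """Probability of every (home, away) scoreline up to max_goals each.

    Assumes home and away goals are independent (assumption for this sketch).
    """
    return {
        (h, a): poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    }

def outcome_probs(probs):
    """Collapse scoreline probabilities into (home win, draw, away win)."""
    home_win = sum(p for (h, a), p in probs.items() if h > a)
    draw = sum(p for (h, a), p in probs.items() if h == a)
    away_win = sum(p for (h, a), p in probs.items() if h < a)
    return home_win, draw, away_win

# Hypothetical expected goals for the home and away side.
probs = scoreline_probs(1.6, 1.1)
home_win, draw, away_win = outcome_probs(probs)
most_likely = max(probs, key=probs.get)
print(most_likely, round(home_win, 3), round(draw, 3), round(away_win, 3))
```

Scorelines above `max_goals` are ignored, so the probabilities sum to very slightly less than 1.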

A Poisson regression model (a generalized linear model) has been built to predict the results of the matches.

• Home and Away have different strengths
• The glm() function is used to estimate those two strengths for all teams
• If Everton’s Home parameter is lower than Liverpool’s Away parameter, then Liverpool has a better chance of winning the match at Everton’s home ground.
• A simulate function is used to simulate a match between two specified teams
• It predicts the goals to be scored by the away team and the home team, which in turn gives us the result of each match
• The simulation is run for all remaining matches and the results are added to the current league table
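The simulation step described above might look roughly like the following sketch. It is written in Python rather than R and does not fit the glm(); instead it takes each fixture's home and away scoring rates as given. All names (`poisson_sample`, `simulate_match`, `update_table`), the rates, and the starting points are hypothetical placeholders, not values from the authors' model.

```python
import random
from math import exp

def poisson_sample(lam, rng):
    """Draw one Poisson-distributed goal count (Knuth's multiplication method)."""
    threshold = exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k

def simulate_match(home_rate, away_rate, rng):
    """Return simulated (home goals, away goals) for one match."""
    return poisson_sample(home_rate, rng), poisson_sample(away_rate, rng)

def update_table(table, home, away, home_goals, away_goals):
    """Add league points for one result: 3 for a win, 1 each for a draw."""
    if home_goals > away_goals:
        table[home] += 3
    elif home_goals < away_goals:
        table[away] += 3
    else:
        table[home] += 1
        table[away] += 1

# Hypothetical remaining fixtures: (home, away) -> (home rate, away rate).
fixtures = {
    ("Everton", "Liverpool"): (1.0, 1.8),
    ("Liverpool", "Everton"): (2.2, 0.7),
}
# Hypothetical current league table (points so far).
table = {"Everton": 30, "Liverpool": 60}

rng = random.Random(42)  # fixed seed so the run is repeatable
for (home, away), (home_rate, away_rate) in fixtures.items():
    home_goals, away_goals = simulate_match(home_rate, away_rate, rng)
    update_table(table, home, away, home_goals, away_goals)

print(table)
```

A single run gives one possible final table; repeating the loop many times with different seeds and averaging the points would give a steadier estimate.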

Our Predictions for 2019-2020 Season

Team name          Home Goals   Away Goals   Total Points
Liverpool              51           31            97
Man City               58           36            92
Chelsea                34           34            78
Leicester              34           32            77
Tottenham              34           25            74
Arsenal                42           29            72
Wolves                 28           19            57
Man United             24           23            55
Everton                25           25            53
Bournemouth            29           19            49
Newcastle              24           11            46
West Ham               27           14            46
Crystal Palace         15           28            44
Aston Villa            23           23            43
Southampton            21           19            37
Burnley                19           17            35
Brighton               18           16            34
Sheffield United       19            9            30
Norwich City           19           11            20
Watford                 7            9            15

The likelihood of correctly estimating the final Premier League table is comparable to your odds of winning the lottery (1/45,000,000).
Home Advantage
• Increased overall performance levels of each team when they play at their home venue.
• Home advantage is calculated based on how many more goals each team scores or concedes during their home matches.

Impact
• Home advantage influences various factors such as fouls committed, chances of an injury, and decisions of the officials.
• For example, a team tends to commit more fouls when playing at home. This can be attributed to increased levels of motivation and aggression.

Scope
• Research should consider the effects of additional situational variables purported to influence the mental, physical, technical, and tactical components of football performance.
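One simple reading of the home advantage calculation is the difference between a team's home goals and away goals. The sketch below applies that reading to a few rows of the predictions table in this deck. It assumes the "Home Goals" and "Away Goals" columns are goals scored, it ignores goals conceded because the table does not list them, and the function name `home_advantage` is made up for this example.

```python
# (home goals, away goals) per team, copied from the predictions table.
predicted = {
    "Liverpool": (51, 31),
    "Man City": (58, 36),
    "Tottenham": (34, 25),
    "Arsenal": (42, 29),
    "Crystal Palace": (15, 28),
}

def home_advantage(home_goals, away_goals):
    """Extra goals scored at home compared with away (negative = better away)."""
    return home_goals - away_goals

advantage = {
    team: home_advantage(home, away) for team, (home, away) in predicted.items()
}

# List teams from largest to smallest home advantage.
for team, diff in sorted(advantage.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{team}: {diff:+d}")
```

A negative value, as for Crystal Palace here, means the team is predicted to score more away than at home.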

Uncertainty in Football
A supercomputer is much more accurate in prediction, as it can assimilate much more data and has greater computational capabilities.
Our model takes into account only two factors; in reality there might be 100 factors that can affect the match result.
Some of those factors are:
• Injuries
• Players’ motivation towards the end of the season (to avoid relegation, to win the league)
• Other competitions happening simultaneously (UCL, Europa League, international breaks, etc.)
• Bench depth
• Squad tiredness
• Manager changes
• New player additions
• Weather

A supercomputer can take into account most of these (if not all) and can give more accurate results.
RECOMMENDATION
• The home advantage in a soccer game is significant, and hence clubs should invest in maintaining the stadium and building an atmosphere that draws fans to the game to support their teams.
• Managers could identify matches where the chance of injury seems higher and take steps to protect their key players.
• Coaches can use this information to establish objectives for players and teams during practices and matches, and can be prepared for these different competitive scenarios.
Thank You
