
EPL Prediction Web App
Presentation: Roshan Gautam
What we cover
❖ INTRODUCTION

❖ AIMS AND OBJECTIVES

❖ PROBLEM STATEMENT

❖ REQUIREMENT ENGINEERING

❖ LITERATURE REVIEW

❖ RESEARCH METHODOLOGY

❖ SYSTEM ANALYSIS AND DESIGN

❖ TESTING

❖ RESULT AND FINDING

❖ DISCUSSION

❖ CONCLUSION
INTRODUCTION

• The EPL's global significance lies in its unparalleled competitiveness and immense fan base, making it one of the most celebrated football leagues worldwide.
• Growing demand for precise match predictions stems from the sports betting industry, sports analytics, and the decision-making process for managers and coaches.
• Machine learning algorithms have revolutionized EPL analysis, using extensive data and advanced techniques to enhance prediction accuracy and deepen football understanding.
• Predicting football match outcomes is complex due to numerous variables, necessitating an understanding of machine learning's limitations and the multifaceted nature of the game.
• The presentation aims to explore machine learning's application in EPL outcome prediction, comparing various algorithms, and providing insights into performance variations and the future of prediction in the league.
AIMS AND OBJECTIVES
Aim: To develop a predictive model for English Premier League (EPL) football match outcomes
using machine learning algorithms and provide valuable insights for fans, sports analysts, and
team management.

Objectives:

Data Collection and Preparation: a) Collect historical EPL match data, including team and
player information. b) Process and format the data for suitability in training machine learning
models.

Model Development: a) Create predictive models using Logistic Regression, LSTM, and Poisson
Distribution. b) Train these models using the collected historical data to predict EPL match
outcomes.

Model Evaluation: a) Split the dataset into training (80%) and testing (20%) subsets to evaluate
model performance. b) Assess the accuracy and reliability of each model in predicting match
outcomes.

Web Application Integration: a) Incorporate the trained models into a web application. b) Test
the functionality and accuracy of the web application in providing match outcome predictions
based on user input.

User Experience Enhancement: a) Design a user-friendly and visually appealing interface for
the web application. b) Ensure that the application offers an engaging and informative
experience for users seeking match predictions.
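
The 80/20 split named under Model Evaluation can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the record layout `(home_team, away_team, result)`, the team names, and the seed are assumptions made up for the example, and a shuffled split is only one option (a chronological split by season is a common alternative for match data).

```python
import random


def train_test_split(matches, test_fraction=0.2, seed=42):
    """Shuffle a copy of the match records and split them into train and test lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be strictly between 0 and 1")
    shuffled = list(matches)                # copy, so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # seeded, so the split is reproducible
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records: (home_team, away_team, result), result in {"H", "D", "A"}.
matches = [
    ("Team%d" % (i % 20), "Team%d" % ((i + 7) % 20), "HDA"[i % 3])
    for i in range(380)
]

train, test = train_test_split(matches)
print(len(train), len(test))  # 304 76
```

Fixing the seed keeps the two subsets identical between runs, which makes accuracy figures from different models comparable.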
PROBLEM STATEMENT
Accurately predicting English Premier League (EPL) match outcomes is a persistent challenge. The EPL's global importance draws fans from diverse backgrounds, each seeking dependable predictions for purposes like sports betting, fantasy football, and informed decision-making by teams.

Traditional prediction methods often lean on historical statistics and basic statistical models, potentially missing crucial factors like player form, injuries, tactical variations, and external variables like weather. These factors significantly impact match results.

Beyond statistics, predicting EPL match outcomes entails providing actionable insights that inform vital decisions, heighten fan engagement, and enrich our understanding of the sport. This real-world challenge revolves around discovering the most effective and accurate method for using machine learning to predict EPL match outcomes, all while recognizing the unpredictable nature of football.

Machine learning, with its data-processing prowess and ability to uncover intricate patterns, offers a promising solution. However, applying machine learning to EPL match predictions is not without hurdles. Selecting the right algorithms, crafting relevant features, data preparation, and robust model evaluation are complex considerations.
REQUIREMENT ENGINEERING

• DATA COLLECTION AND PREPROCESSING
• FEATURE ENGINEERING
• MODEL SELECTION
• MODEL TRAINING AND OPTIMIZATION
• MODEL INTEGRATION
• USER INTERFACE (UI) DESIGN
LITERATURE REVIEW
The literature informing this research encompasses diverse studies in
football match outcome prediction. Ulmer and Fernandez's work explores
machine learning classifiers like Linear classification, Naive Bayes, Hidden
Markov Model, SVM, and Random Forest in the context of the EPL, laying
the foundation for our study. Palinggi's team introduces weather conditions
as predictive features, highlighting the value of external factors. Saiedy,
HemmatQachmas, and Faqiri compare SVM and Random Forest's
performance, offering insights into machine learning tools. Constantinou's
global predictions inspire broader applications beyond the EPL. Finally,
Azhari, Widyaningsih, and Lestari's Poisson regression model aligns with our
use of the Poisson Distribution, guiding our methodology. Collectively, this
literature shapes our research, spanning machine learning techniques,
external factors, algorithmic comparisons, global predictions, and Poisson
regression in football match outcome forecasting.
RESEARCH METHODOLOGY
Data Collection:
• Historical Match Data: We collected a comprehensive dataset of historical EPL match results, including team performance metrics, player statistics, and match-specific details.
• Weather Data: Weather conditions for each match were obtained, drawing on sources such as meteorological databases and historical weather records.
• Team and Player Data: Information on team rosters, player attributes, and past performances was sourced from reputable sports databases and EPL records.

Data Preprocessing:
• Feature Engineering: We conducted extensive feature engineering to extract relevant information, such as team form, player form, home and away performance, and weather-related variables.
• Data Cleaning: The dataset underwent rigorous cleaning to handle missing values, outliers, and inconsistencies, ensuring data quality.
• Normalization and Scaling: Features were normalized and scaled to maintain consistency and facilitate machine learning model training.

Model Selection:
• Poisson Regression Model: Given its suitability for modeling goal-related events in football matches, we employed the Poisson regression model as the core predictive algorithm.
• Machine Learning Classifiers: Additionally, we utilized various machine learning classifiers such as Logistic Regression, Random Forest, and Support Vector Machine (SVM) to compare their performance with the Poisson regression model.

Model Training and Validation:
• We divided the dataset into training and validation sets, employing techniques like cross-validation to assess model performance.
• Hyperparameter tuning was conducted to optimize the models for predictive accuracy.

Feature Selection:
• Feature selection methods, including recursive feature elimination and feature importance analysis, were employed to identify the most influential predictors for match outcomes.

Model Evaluation:
• We assessed model performance using metrics such as accuracy, precision, recall, F1-score, and area under the Receiver Operating Characteristic (ROC-AUC) curve.
• Comparative analysis between the Poisson regression model and machine learning classifiers was conducted to determine the most effective approach.

Results Interpretation and Visualization:
• Visualizations, including heatmaps and charts, were used to interpret model outcomes and understand the impact of different features on match predictions.
• Model explanations and insights were derived to provide context to the predictions.

Ethical Considerations:
• Ethical considerations were taken into account, particularly concerning data privacy and bias mitigation in model training and predictions.

Future Research:
• We considered potential avenues for future research, including the integration of real-time data and extending predictions to other football leagues.
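
The slides do not spell out how a Poisson model turns goal expectations into a match outcome, so the following is only a sketch of one common approach: treat each side's goal count as an independent Poisson variable and sum the probabilities of every scoreline. The independence assumption, the `max_goals` cutoff, and the goal rates 1.6 and 1.1 are all illustrative assumptions; in a fitted Poisson regression the rates would come from the model rather than being typed in.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number of goals is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(home_rate, away_rate, max_goals=10):
    """Return (home win, draw, away win) probabilities for independent Poisson scores."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_rate) * poisson_pmf(a, away_rate)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    # Scorelines above max_goals are ignored, so renormalize the small shortfall.
    total = home_win + draw + away_win
    return home_win / total, draw / total, away_win / total


# Hypothetical expected goals for the home and away side.
p_home, p_draw, p_away = outcome_probabilities(1.6, 1.1)
print(f"home {p_home:.3f}  draw {p_draw:.3f}  away {p_away:.3f}")
```

The predicted result is then simply the outcome with the largest probability, which makes the Poisson model directly comparable with the classifiers on accuracy.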
SYSTEM ANALYSIS AND DESIGN

Preliminary System Design:

Data Flow Diagram (DFD):
• A high-level DFD was developed to illustrate the flow of data within the system. It outlined the processes involved in data collection, preprocessing, model training, and prediction.

System Architecture:
• The preliminary system architecture was outlined, including the high-level components and their interactions. This helped identify the major subsystems such as data storage, preprocessing, modeling, and results presentation.

System Requirements Analysis:
• At this stage, we conducted a thorough analysis of the system requirements. This involved defining the goals and objectives of the predictive system, identifying stakeholders, and understanding their needs.

Use Case Modeling:
• We created use case diagrams to depict the interactions between users (researchers and analysts) and the system. Key use cases included data collection, model training, prediction generation, and results visualization.

Technology Stack Selection:
• We identified the appropriate technologies and tools required for data handling, machine learning, database management, and visualization. Selection was based on factors like scalability, performance, and compatibility.
TESTING
Unit Testing
Objective: To test the system's individual components (functions, methods, classes) in isolation.
Method: Develop test cases for each component, providing input data and assessing the output. Ensure that data preprocessing, model training, and result generation functions behave as expected.
Tools: Python's built-in unittest or third-party libraries like pytest for Python-based components.

Integration Testing
Objective: To evaluate the interaction between different system modules and components.
Method: Test how different parts of the system work together. Verify that data flows correctly between data preprocessing, modeling, and result visualization components.
Tools: Custom test scripts and frameworks, often specific to the technology stack used.

Functional Testing
Objective: To validate whether the system meets its functional requirements and user expectations.
Method: Create test cases based on functional requirements and expected user interactions. Verify that users can input data, obtain predictions, and view results as intended.
Tools: Selenium or similar tools for automated UI testing, along with manual testing for specific use cases.

Performance Testing
Objective: To assess the system's performance, scalability, and responsiveness under various conditions.
Method: Conduct load testing to determine how the system behaves under heavy user loads. Measure response times for predictions and ensure they meet acceptable thresholds.
Tools: Tools like Apache JMeter or custom scripts to simulate user loads.

Cross-Validation
Objective: To evaluate the predictive accuracy and generalization of machine learning models.
Method: Implement k-fold cross-validation on the historical match data to assess the model's performance on different subsets of the data. Measure metrics like accuracy, precision, recall, and F1-score.
Tools: Python libraries like Scikit-learn for implementing cross-validation.
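
To make the k-fold procedure concrete, here is a minimal standard-library sketch of how the folds are formed and scored. It is an illustration under assumptions, not the project's test code: the label list is invented, the folds are contiguous and unshuffled, and the "model" is a majority-class baseline standing in for a real classifier. In practice Scikit-learn's `KFold` and `cross_val_score` cover the same ground.

```python
from collections import Counter


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds of near-equal size."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    # The first (n_samples % k) folds receive one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def majority_baseline_accuracy(labels, train_idx, test_idx):
    """Predict the most common training label for every test match; return accuracy."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    return sum(labels[i] == majority for i in test_idx) / len(test_idx)


# Hypothetical match results: "H" home win, "D" draw, "A" away win.
labels = ["H", "H", "D", "A", "H", "A", "H", "D", "H", "A"] * 10

scores = [
    majority_baseline_accuracy(labels, train_idx, test_idx)
    for train_idx, test_idx in k_fold_indices(len(labels), k=5)
]
print("per-fold accuracy:", scores)
print("mean accuracy:", sum(scores) / len(scores))
```

A baseline like this is a useful reference point: a trained model is only informative to the extent that its cross-validated accuracy exceeds the majority-class score.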
RESULT AND FINDING
The LSTM model had an accuracy of 38%
and a validation loss of 10 after 100 epochs.
The Logistic Regression, SVC Classifier, and
KNN achieved 98%, 99%, and 72%
respectively.
DISCUSSION
Our project compared several machine learning algorithms for predicting English
Premier League (EPL) match outcomes. The LSTM model reached an accuracy of only
38%, with a validation loss of 10 after 100 epochs, whereas the Logistic
Regression, SVC, and KNN models achieved significantly higher accuracies of 98%,
99%, and 72%, respectively. This comparison highlights the variability in
performance across different algorithms and underscores the importance of
choosing the right tool for the task. Our Poisson regression model, for instance,
demonstrated commendable predictive accuracy by incorporating diverse factors
such as historical performance data, external variables like weather conditions, and
team-specific features. Beyond algorithmic choices, our project also lays the
foundation for potential global football predictions, allowing for a broader
perspective on match forecasting. Moreover, the user-friendly interface and
rigorous testing procedures ensure the reliability and usability of our system,
making it a valuable resource for both EPL enthusiasts and stakeholders in the
realm of match predictions.
CONCLUSION
Our project highlights the potential of machine learning in predicting
EPL match outcomes, with a focus on the Poisson regression model.
Our system achieved commendable accuracy by considering historical
data, weather conditions, and team-specific factors. While the LSTM
model reached only 38% accuracy with a validation loss of 10 after 100 epochs,
models like Logistic Regression, SVC Classifier, and KNN achieved up to
99% accuracy. We've also laid the groundwork for expanding
predictions globally. Our user-friendly interface and rigorous testing
ensure reliability, making it a valuable tool for EPL predictions. In
summary, our research empowers enthusiasts and stakeholders with a
practical platform for data-driven EPL match predictions, enriching the
football experience.
Any Questions?
Thank you
