ML Project

The project aims to develop a machine learning model to predict the outcomes of matches in the 2025/26 UEFA Champions League using data such as club performance and metrics. It addresses gaps in existing models by incorporating dynamic form tracking and advanced metrics, utilizing an XGBoost classifier and Monte Carlo simulations for accurate predictions. The expected results include high accuracy in match predictions and final league rankings, with real-time updates as the tournament progresses.

Uploaded by

sumeet1065

Machine Learning

Project Report

Predicting the 2025/26 UEFA Champions League Table Using Machine Learning

Submitted to: Prof. Selva Kumar S.

By:
Prabhaat Autar Mam (23BCB7028)
Sumit Mahapatra (23BCB7038)
Sarthak Katiyar (23BCB7039)

Motivation
Football is the most widely followed sport globally, and the UEFA Champions League is
widely regarded as the highest level of club competition the sport has to offer. It
attracts hundreds of millions of viewers each season. Each edition of the tournament is
marked by a high degree of unpredictability, with outcomes often defying expectations, such
as lower-ranked teams emerging as champions or comparatively inexperienced clubs
eliminating established contenders. This inherent uncertainty makes forecasting both a
challenging and intellectually stimulating task. In parallel, the growing field of sports analytics
has demonstrated how data-driven approaches can yield valuable insights into performance,
strategy, and competitive outcomes. Machine learning, in particular, provides a powerful
framework for capturing complex patterns within football data and has been successfully
applied to a variety of predictive problems.

Objective
The objective of this project is to design and develop a machine-learning-based model
capable of predicting the outcomes of football matches played between any two teams
competing in the 2025/26 UEFA Champions League, using publicly available data such as club
coefficient, past performance, current form, and other related metrics. By simulating each
fixture multiple times, the model generates predictions that are then aggregated to
construct the final league phase table, ranking all the participating teams.
We aim for a dynamic model that can refine its predictions based on updated
match results as the tournament progresses, with the ultimate goal of forecasting the results
through to the final match to be held at Puskás Aréna in Budapest, Hungary. Additionally, the
project will estimate the probability of each team winning the tournament as a
complementary analysis.

Literature Survey

| Title | Model | Dataset | Performance Metric | Objective | Takeaway |
|---|---|---|---|---|---|
| Prediction of football match results with Machine Learning [1] | SVM, Logistic Regression, Neural Networks | General match data | Accuracy | Accuracy of nearly 70% | Established baseline, but lacked dynamic form tracking. |
| Predicting Football Match Outcomes with eXplainable Machine Learning and the Kelly Index [2] | Explainable ML Models | UK Football Leagues | F1 Score, Probability | Focus on betting strategy | Provides interpretability, but highly context-specific to UK leagues. |
| Prediction of English Premier League Soccer Matches [3] | Time Series Analysis | EPL historic data | R-Squared | Focus on long-term trends | Captures league stability but ignores cup competition dynamics. |
| Predicting Football Match Outcomes With Machine Learning Approaches [4] | Ensemble Methods (RF, Gradient Boosting) | Multiple leagues | Accuracy, ROC AUC | Highly accurate (approximately 77%) | Strong predictive power, but complex implementation for real time. |
| A predictive analytics framework for soccer match outcomes using machine learning models [5] | Regression Models | General football data | MAE, RMSE | Robust results across different leagues | Excellent for score prediction but less direct for discrete outcome forecasting (W/D/L). |
| EPL Points Prediction Using Machine Learning [6] | Feature Selection & ML | EPL data | Accuracy | Good long-term prediction | High dependence on specific league structure. |
| A Bayesian approach to predict performance in football: a case study [7] | Bayesian Modelling | Single team focus | Probability estimation | Provides uncertainty quantifications | Excellent for probability needs, but complex hierarchical setup for all teams. |
| From Players to Champions: A Generalizable Machine Learning Approach for Match Outcome Prediction with Insights from the FIFA World Cup [8] | Deep Learning/XGBoost | FIFA World Cup | Accuracy | Focus on international tournaments | Generalizable features but less data density than club-level football. |
| Calculating FC Barcelona's Probability of Winning the 2025 UEFA Champions League: A Holistic Statistical Approach [9] | Statistical Modeling | Single club data | Probability | High focus on one team | Lacks scalability for a full tournament simulation. |
| Predicting Qualification Thresholds in UEFA’s Incomplete Round-Robin Tournaments [10] | Optimization Techniques | UEFA tournament data | Error Rate | Group stage scenarios | Does not fully simulate knockout stage. |

Research Gap
While numerous ML applications exist for football prediction, the specific requirements of
forecasting the new-format UEFA Champions League (UCL) table expose several gaps:

1. Inter-League Variance: Most existing models focus on single domestic leagues (e.g.,
EPL), failing to account for the high variance, unique tactics, and heterogeneous
opponent strength present in inter-league competitions like the UCL.
2. Dynamic Form Tracking: Few models adequately and dynamically update a team's
strength and form between matchdays within the short, intense UCL schedule, which
is crucial for accurate knockout predictions.
3. Holistic Tournament Simulation: Existing methods often stop at match outcome
prediction. There is a lack of a cohesive, single system that integrates match
probabilities into a Monte Carlo simulation to provide the required final league table
ranking and overall tournament win probabilities.
4. Feature Richness: Simpler models often rely only on basic scorelines. A gap exists in
integrating advanced, modern features like Expected Goals (xG), Expected Points
(xPTS), and travel fatigue metrics to improve the predictive power for a premium
competition like the UCL.

Proposed Model

Dataset Details
| Category | Features Included | Source Type | Purpose |
|---|---|---|---|
| Team Strength (Static) | UEFA Club Coefficient, Market Value, Manager Experience | Historical Data | Baseline assessment for team quality |
| Historical Performance | Last 5 years UCL results, Head-to-head records | Public Match data by UEFA | Identify historical dominance/weakness against high-tier opposition |
| Current Form | Domestic league form (W/D/L, Goals for/against in last 5 matches), Injury Report | Current Data by Sofascore, Transfermarkt | Real-time assessment of team momentum and fitness |
| Advanced Metrics | Average xG, Average xGA, xPTS difference | Predicted Data by Sofascore, Transfermarkt | Quantify underlying performance beyond the scoreline |
| Match Context | Home/Away indicator, travel distance, match importance | Tournament data by UEFA | Capture contextual biases and fatigue |
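
A minimal sketch of how the "Current Form" features (W/D/L and goals for/against over the last 5 matches) might be derived, assuming results are available as simple goals-for/goals-against records. The `MatchResult` type, the `form_features` function, the output field names, and the sample results are illustrative assumptions, not part of the project's actual pipeline.

```python
from dataclasses import dataclass


@dataclass
class MatchResult:
    """One finished match from the team's point of view (hypothetical record type)."""
    goals_for: int
    goals_against: int


def form_features(results, window=5):
    """Summarise the last `window` matches; `results` is ordered oldest to newest."""
    recent = list(results)[-window:]
    n = len(recent)
    if n == 0:
        # No history yet: return neutral zeros rather than dividing by zero.
        return {"matches": 0, "points_per_match": 0.0,
                "goals_for_avg": 0.0, "goals_against_avg": 0.0}
    points = 0
    for match in recent:
        if match.goals_for > match.goals_against:
            points += 3  # win
        elif match.goals_for == match.goals_against:
            points += 1  # draw
    return {
        "matches": n,
        "points_per_match": points / n,
        "goals_for_avg": sum(m.goals_for for m in recent) / n,
        "goals_against_avg": sum(m.goals_against for m in recent) / n,
    }


# Made-up results for illustration only.
history = [MatchResult(2, 1), MatchResult(0, 0), MatchResult(1, 3),
           MatchResult(4, 0), MatchResult(1, 1), MatchResult(2, 0)]
features = form_features(history)
print(features)
```

Recomputing these values after every matchday is one simple way to keep the form inputs current as the tournament progresses.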

Model
Algorithm: XGBoost Classifier for Match Outcome Prediction integrated with Monte Carlo
Simulation for Tournament Progression.
● XGBoost: It is an efficient implementation of gradient-boosted decision trees. It
handles the non-linear relationships and feature interactions inherent in complex
sports data better than simpler linear models, and it outputs probability
estimates for the three outcomes (Home Win, Draw, Away Win).
● Monte Carlo Simulation: This technique is essential for transforming match
probabilities into tournament outcomes. By running thousands of full tournament
simulations, we can estimate a probability distribution for the final league
table ranking and the ultimate tournament winner, directly meeting the project
objective.
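
The step from match probabilities to table outcomes can be sketched as follows. This is a toy illustration under stated assumptions, not the project's implementation: the three teams, their fixtures, and the probabilities are made up, `simulate_league` is a hypothetical helper, and ties on points are broken at random instead of using the real UEFA tie-breakers.

```python
import random
from collections import defaultdict


def simulate_league(fixtures, probs, n_sims=10_000, seed=0):
    """Estimate each team's chance of finishing first and its average points.

    fixtures: list of (home, away) tuples.
    probs: dict mapping (home, away) -> (p_home_win, p_draw, p_away_win).
    """
    rng = random.Random(seed)
    teams = sorted({team for fixture in fixtures for team in fixture})
    first_place = defaultdict(int)
    total_points = defaultdict(int)
    for _ in range(n_sims):
        points = dict.fromkeys(teams, 0)
        for home, away in fixtures:
            # Draw one outcome according to the model's three-way probabilities.
            outcome = rng.choices(("H", "D", "A"), weights=probs[(home, away)])[0]
            if outcome == "H":
                points[home] += 3
            elif outcome == "A":
                points[away] += 3
            else:
                points[home] += 1
                points[away] += 1
        # Assumption: ties on points are broken at random (no goal difference etc.).
        best = max(points.values())
        leaders = [team for team in teams if points[team] == best]
        first_place[rng.choice(leaders)] += 1
        for team in teams:
            total_points[team] += points[team]
    return {
        team: {"p_first": first_place[team] / n_sims,
               "avg_points": total_points[team] / n_sims}
        for team in teams
    }


# Made-up fixtures and probabilities for illustration only.
fixtures = [("A", "B"), ("B", "C"), ("C", "A")]
probs = {
    ("A", "B"): (0.60, 0.25, 0.15),
    ("B", "C"): (0.40, 0.30, 0.30),
    ("C", "A"): (0.20, 0.25, 0.55),
}
table = simulate_league(fixtures, probs, n_sims=2000)
for team, row in table.items():
    print(team, row)
```

In a full system the `probs` dictionary would be filled from the classifier's predicted probabilities for each fixture, and the same loop would be extended to the knockout rounds.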

Workflow Diagram:

The proposed solution is a two-stage predictive engine:


1. Match Prediction Engine (XGBoost): Takes match features as input and outputs a set
of probabilities: P(Home Win), P(Draw), P(Away Win).
2. Tournament Simulation Engine (Monte Carlo): Uses the match probabilities to
simulate the entire 2025/26 UCL tournament schedule (League Phase, Knockout Phase
Play-offs, Round of 16, Quarter-Finals, Semi-Finals, and Final) 10,000 times.
The final output will be derived from the aggregate results of these simulations, presenting
the most likely final league table ranking and the probability of each club lifting the trophy.
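
As a rough guide to how precise 10,000 runs can be, the estimate for any event (for example, club i lifting the trophy) is the fraction of runs in which it occurs, and its sampling error follows the usual binomial formula. This assumes independent simulation runs and covers only simulation noise, not any error in the underlying match probabilities.

```latex
\hat{p}_i = \frac{1}{N}\sum_{s=1}^{N} \mathbf{1}\{\text{club } i \text{ wins in run } s\},
\qquad
\mathrm{SE}(\hat{p}_i) = \sqrt{\frac{\hat{p}_i\,(1-\hat{p}_i)}{N}} \;\le\; \frac{1}{2\sqrt{N}}

% For N = 10{,}000: SE <= 1 / (2 * 100) = 0.005,
% i.e. at most about half a percentage point per estimated probability.
```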

Reference used for this project


[Link]

Expected Results
| Metric | Target Value | Description |
|---|---|---|
| Match Prediction Accuracy | ~70-75% | Accuracy on the test set for the three-class outcome (W/D/L). |
| Probability Calibration | Low Brier Score | The model's predicted probabilities should be well-calibrated. |
| Final Table Ranking | ~80-85% confidence interval | The final ranking of the league phase is based on aggregated Monte Carlo results, providing a high-confidence interval for each team’s final position. |
| Tournament Win Probability | ΣP(team_win) = 1 | A complete list of all 36 teams with their estimated percentage chance of lifting the trophy. |
| Dynamic Insight | Real-time update | The system will be capable of re-running the simulation in real-time after each match to update all subsequent team probabilities and rankings. |
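
The probability calibration target can be checked with a multiclass Brier score. A minimal sketch, assuming predictions are given as (home win, draw, away win) triples and outcomes as the labels "H", "D", "A"; the function name, the label encoding, and the sample numbers are illustrative assumptions. In this formulation the score ranges from 0 (perfect) to 2, and a forecast of one third for every outcome scores about 0.667, which gives a natural baseline to beat.

```python
def brier_score(probabilities, outcomes, classes=("H", "D", "A")):
    """Mean over matches of the squared error summed over the three outcomes."""
    if len(probabilities) != len(outcomes):
        raise ValueError("probabilities and outcomes must have the same length")
    if not outcomes:
        raise ValueError("at least one match is required")
    total = 0.0
    for probs, actual in zip(probabilities, outcomes):
        for label, p in zip(classes, probs):
            target = 1.0 if label == actual else 0.0
            total += (p - target) ** 2
    return total / len(outcomes)


# Made-up predictions and results for illustration only.
predictions = [(0.60, 0.25, 0.15), (0.20, 0.30, 0.50)]
results = ["H", "D"]
score = brier_score(predictions, results)

# Reference point: an uninformative forecast of 1/3 for every outcome.
uniform = brier_score([(1 / 3, 1 / 3, 1 / 3)] * 2, results)
print(score, uniform)
```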

The project is expected to deliver a statistically robust and highly dynamic forecast, moving
beyond simple match betting to provide a comprehensive, data-driven ranking of the
2025/26 UEFA Champions League tournament.
