International Journal of Sports Physiology and Performance, (Ahead of Print)

https://doi.org/10.1123/ijspp.2023-0444
© 2024 Human Kinetics, Inc. ORIGINAL INVESTIGATION
First Published Online: Feb. 24, 2024

Predicting Soccer Players’ Fitness Status Through a Machine-Learning Approach
Mauro Mandorino,1,2 Jo Clubb,3 and Mathieu Lacome1,4
1Performance and Analytics Department, Parma Calcio 1913, Parma, Italy; 2Department of Movement, Human and Health Sciences, University of Rome “Foro Italico,” Rome, Italy; 3Global Performance Insights Ltd, London, United Kingdom; 4Laboratory of Sport, Expertise and Performance (EA 7370), French Institute of Sport (INSEP), Paris, France

Purpose: The study had 3 purposes: (1) to develop an index using machine-learning techniques to predict the fitness status of soccer
players, (2) to explore the index’s validity and its relationship with a submaximal run test (SMFT), and (3) to analyze the impact of
weekly training load on the index and SMFT outcomes. Methods: The study involved 50 players from an Italian professional soccer
club. External and internal loads were collected during training sessions. Various machine-learning algorithms were assessed for their
ability to predict heart-rate responses during the training drills based on external load data. The fitness index, calculated as the
difference between actual and predicted heart rates, was correlated with SMFT outcomes. Results: Random forest regression (mean
absolute error = 3.8 [0.05]) outperformed the other machine-learning algorithms (extreme gradient boosting and linear regression).
Average speed, minutes from the start of the training session, and the work:rest ratio were identified as the most important features. The
fitness index displayed a very large correlation (r = .70) with SMFT outcomes, with the highest result observed during possession
games and physical conditioning exercises. The study revealed that heart-rate responses from SMFT and the fitness index could
diverge throughout the season, suggesting different aspects of fitness. Conclusions: This study introduces an “invisible monitoring”
approach to assess soccer player fitness in the training environment. The developed fitness index, in conjunction with traditional fitness
tests, provides a comprehensive understanding of player readiness. This research paves the way for practical applications in soccer,
enabling personalized training adjustments and injury prevention.

Keywords: football, heart rate, training load, prediction, test

Clubb https://orcid.org/0000-0002-6509-7531
Lacome https://orcid.org/0000-0002-1082-0200
Mandorino (mmandorino@parmacalcio1923.com) is corresponding author, https://orcid.org/0000-0002-5858-2758

It is now commonplace to use technology in the soccer environment to gain insight into player training status. Tracking technologies such as global positioning systems (GPS) quantify the external load, which is the work completed by the athlete.1 Meanwhile, internal load, the physiological cost associated with such work,1 is often captured through heart-rate (HR) monitoring. The relationship between external and internal load has been widely used to evaluate the fitness status of the players.2–4 Most simply, ratios between internal and external load have been used as a representation of work efficiency.2 Linear models have been developed, but so far they have used either only one external load metric2,3 or multiple ones without accounting for nonlinear relationships,5 and could therefore represent simplistic approaches. Indeed, the ability of external load metrics to predict internal load is likely player-specific, and as such, models that integrate individual combinations of external load variables are thought to be superior.5

Another approach is to integrate a submaximal fitness test (SMFT) into the training plan, which enables a physiological state assessment via the internal load response to a fixed external load, that is, a standardized physical stimulus.6 SMFT is generally performed with a short duration (3–15 min) and variable protocol (eg, continuous or intermittent). A recent convergent validity meta-analysis of 20 studies indicated an inverse, large relationship between HR measured at the end of the SMFT and endurance test performance.7 Such an approach has become increasingly popular, as it negates the need for maximal fitness testing, which can be time-consuming, fatiguing, and therefore disruptive to the training needs of the players.8 Therefore, SMFT may enable player monitoring, thereby mitigating the physical and mental burdens associated with the classic longer maximal test.6 Conversely, SMFT has also been criticized for requiring additional training time and may be limited by coach willingness to include it,8 especially during congested periods.9

For this reason, noninvasive monitoring strategies that utilize data already collected within the training environment, without the need for any additional testing, are particularly appealing. This approach is widely referred to as “invisible monitoring.”10 One example was proposed by Lacome et al,5 in which the authors adopted a linear regression (LR) approach to predict HR response during small-sided games from external load data. The difference between the actual HR and predicted HR was used as a fitness index (FI). However, there were several limitations to this study; namely, the analysis was limited to 10 players, simple LR models were built for each player, thus preventing the generalizability of prediction, and the models were fit on the entire data set, thereby increasing the risk of overfitting.5

Therefore, the current study aims to improve the approach proposed in Lacome et al5 to further explore the feasibility of using machine learning (ML) techniques to assess the fitness status of soccer players within their training environment. To date, ML has been employed for numerous purposes in soccer such as injury prevention,11,12 performance analysis,13,14 or technical/tactical analysis.15 This widespread use is closely linked to its ability to model complex and nonlinear interactions inside high-dimensional data.11 Therefore, the study presents 4 objectives: (1) building an FI based on an ML approach, (2) analyzing the validity of the index,

(3) assessing the impact of weekly workload on the FI variation, and (4) comparing the FI to SMFT outcomes.

Methods

Subjects
The study was conducted during the 2022/2023 soccer season. Fifty players from a men’s first team (n: 22, age: 24.1 [4.7] y, body mass: 80.5 [6.1] kg, height: 185.1 [5.2] cm), U19 team (n: 18, age: 18.4 [0.6] y, body mass: 76 [8.5] kg, height: 181.4 [6.8] cm), and U18 team (n: 10, age: 17.7 [0.3] y, body mass: 69.1 [4.6] kg, height: 179.5 [8.9] cm) competing in an Italian professional soccer club were recruited for the current study. The data were obtained from the club as the players were routinely monitored throughout the season. Therefore, the usual procedure for ethics committee clearance was not required.16 However, all data were anonymized before the analysis to guarantee team and players’ confidentiality, and the study was conducted following the Declaration of Helsinki.

Design
Players’ external and internal load data were recorded during 261 training sessions (first team, n: 124, duration: 67.9 [11.2] min; U19, n: 77, 84.8 [17.1] min; U18, n: 60, 84.5 [13.5] min). An average of 213 (85) sessions per player was recorded. Players who took part in <60% of training sessions were excluded from the study to remove the subjects who had poor training continuity due to injuries or absence.17 External training load was collected using the WIMU Pro system (RealTrack Systems) whose validity and reliability have been previously tested.18–20 Twelve different parameters were considered (Table 1).

Internal load data (HR) was recorded with a sampling frequency of 4 Hz using a Garmin HR band (Garmin Ltd) and synchronized using a telemetry system (WIMU PRO, RealTrack Systems).21 The physiological intensity of the training sessions was expressed as the percentage of individual maximum HR (HRmax). Players’ HRmax was collected at the beginning of the season through an incremental treadmill protocol. The test started with an initial speed of 8 km per hour and increased by 2 km per hour every 2 minutes, until exhaustion.22 External and internal load data were collected during 3 different types of training drills, which were classified as:
• Game simulations (GS): games performed with 2 regular goals and goalkeepers.
• Possession games (PO): possession drills performed without the goals.
• Physical conditioning exercises (PH): exercises performed without the ball with the aim to improve fitness status of the players.
The different formats utilized for GS and PO are presented in Supplementary Table S1 (available online).

Methodology

Evaluation of Fitness Status Through Submaximal Run Test
HR responses during a submaximal run test (HRex) were used to evaluate the fitness status of the players. Previous studies identified the HRex as a good marker to monitor positive aerobic-oriented training adaptations.23,24 The test consisted of 2 bouts of 3-minute running at 10 km per hour and 12 km per hour interspersed with 1 minute of recovery during which the players totally stopped on the pitch. The last 30 seconds of each bout were considered, and the average was calculated and expressed as a percentage of HRmax (HRex). The submaximal run test was performed periodically

Table 1 Variables Inserted in the Machine-Learning Model as Predictors


Predictor | Metric | Unit
External load data | Work:rest ratio (ratio between work period [distance covered with a speed above 4 km/h] and rest period [distance covered at a speed 0–3.9 km/h]) | n/a
External load data | Drill duration | min
External load data | Total distance | m
External load data | Distance covered with a speed above 7.2 km/h | m
External load data | Distance covered with a speed above 14.4 km/h | m
External load data | Distance covered with a speed above 19.8 km/h | m
External load data | Distance covered with a speed above 25.2 km/h | m
External load data | Max speed | km/h
External load data | Average speed | km/h
External load data | Number of accelerations above 3 m/s2 | Count
External load data | Number of decelerations below −3 m/s2 | Count
External load data | PlayerLoad™13 | AU
Additional information | Player’s playing position: center back, fullback, midfielder, winger, forward | n/a
Additional information | Type of the drill during which heart rate was collected: game simulations, possession games, physical conditioning exercises | n/a
Additional information | Player’s team: First team, U19, U18 | n/a
Additional information | Ambient temperature | °C
Additional information | Ambient humidity | %
Additional information | Time since the start of the training session | min
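
To illustrate how the work:rest ratio defined in Table 1 might be derived from tracking data, the following Python sketch assumes a hypothetical list of speed samples (km/h) recorded at a constant sampling interval. The function name, the sampling interval, the use of 4.0 km/h as the single cut-off between rest and work, and the handling of a zero rest distance are all assumptions made for illustration; they are not taken from the authors' processing pipeline.

```python
def work_rest_ratio(speeds_kmh, dt_s=0.1, threshold_kmh=4.0):
    """Distance covered at or above the threshold divided by distance covered below it.

    speeds_kmh: speed samples in km/h (hypothetical input format).
    dt_s: time between consecutive samples in seconds (assumed constant).
    threshold_kmh: assumed cut-off separating rest from work.
    """
    work_m = 0.0
    rest_m = 0.0
    for v in speeds_kmh:
        step_m = (v / 3.6) * dt_s  # km/h -> m/s, then metres covered in one sample
        if v >= threshold_kmh:
            work_m += step_m
        else:
            rest_m += step_m
    if rest_m == 0.0:
        # Assumed convention: no rest distance gives an infinite ratio (or 0 if no work either).
        return float("inf") if work_m > 0.0 else 0.0
    return work_m / rest_m


# Ten seconds of walking at 2 km/h followed by ten seconds of running at 8 km/h.
ratio = work_rest_ratio([2.0] * 10 + [8.0] * 10, dt_s=1.0)
```

In this toy trace the running distance is four times the walking distance, so the ratio is close to 4.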


(every 1–4 wk) and always 3 days after the previous match. No warm-up was performed before the test.24

Evaluation of Fitness Status Through ML Models
Multiple ML models were built using a training data set consisting of a vector of features which described the players’ external load. Additional information (Table 1) that could improve ML models’ ability to predict HR responses was inserted in the vector:
• Player’s playing position.
• Type of the drill during which HR was collected.
• Player’s team.
• Ambient temperature.
• Ambient humidity.
• Minutes since the start of the training session.
Ambient temperature and humidity were collected at the beginning of each training session (heat index, Weather Tracker, Kestrel 146 4500 NV, Kestrel Weather Instrument). The average HR registered during the different training drills was expressed as a percentage of HRmax (HRdrill), and it was considered in the ML models as our target variable.

Algorithm Selection
Different ML algorithms were considered in the current study:
• Extreme gradient boosting (XGBoost)
• Random Forest Regression (RF)
• LR
XGBoost (boosting algorithm) and RF (bagging algorithm) were chosen for their efficiency in regression problems and ability to detect nonlinear relationships in high-dimensional data.11,13,25 In contrast, LR was selected for its simplicity in identifying the linear relationship between the independent and dependent variables.

Data Preprocessing
When a player registered an average HR for a drill outside 1.5 × the interquartile range, the respective observation was removed from the data set to exclude outliers determined by malfunction or incorrect use of the HR chest strap. In addition, data dropping was employed to handle missing data. Categorical predictors (player’s playing position, type of drill, and player’s team) were subjected to 1-hot encoding before being included in the ML models. One-hot encoding converts categorical variables into a form that can be used by ML algorithms to improve predictive ability. Moreover, before LR training, all features were normalized using Min Max Scaler. The features were scaled between 0 and 1. This technique is useful when the data have varying scales or outliers that could alter the predictive ability of the ML models. Instead, XGBoost and RF do not require normalization, being tree-based algorithms.26

Feature Elimination, Hyperparameters Tuning, and Cross-Validation
The Recursive Feature Elimination (RFE) algorithm was performed to remove features that could increase the risk of overfitting.12 After this initial step, a Randomized Search was implemented to tune hyperparameters in XGBoost and RF. Conversely, LR does not require this process. A 3-fold cross-validation was used to tune the hyperparameters, and the combination that returned the best performance across the folds (lowest mean absolute error [MAE]) was used to build the different ML models. Therefore, RFE and Randomized Search allowed us to select the most important features and the optimal hyperparameters to maximize the algorithm’s predictive ability.
RFE and Randomized Search were performed on 20% of the data set, while the remaining 80% was employed to test the predictive ability of the ML models adopting a 5-fold cross-validation. All the analyses were performed using Anaconda (version 3.9.12, Anaconda Inc) and Python libraries.

Evaluation of ML Models
The predictive ability of the models was evaluated adopting the MAE and root mean squared error (RMSE). The lower the MAE and RMSE values, the better the model. The performance of the models was compared with 2 baselines, that is, dummy models that make predictions based on simple rules: baseline B1 returned the most frequent value observed in the training set; baseline B2 generated predictions uniformly at random from the values observed in the training set. The model which exhibited the best predictive ability was subjected to feature importance analysis.

Calculation of the FI
The FI was calculated as the difference between the real HRdrill value and the HRdrill predicted by the ML model. We expected that if the real HRdrill was lower than the value predicted by the model (FI < 0), the players showed an ability to minimize the HR response compared with that expected based on the external load collected (good fitness status). In contrast, a positive FI was interpreted as a condition where the player exhibited a higher HR response than predicted (bad fitness status). The average of the different indices obtained during a week was calculated and used as the final FI. The FI was calculated: (1) using the data collected during all the different types of drills (FIGS-PO-PH), (2) considering the combination of 2 types of drills (FIGS-PO, FIGS-PH, and FIPO-PH), and (3) using 1 single drill in isolation (FIGS, FIPO, and FIPH). A summary of the steps performed to calculate the FI is presented in Figure 1.

Analysis of the Impact of Weekly Training Load on HRex and FI
One of the aims was to understand whether HRex, evaluated through SMFT, and FI, calculated through the ML approach, were sensitive to the variation of the weekly training load. For this purpose, the weekly training load was calculated as the rolling sum of the previous 7 days for the following external load metrics: total distance, distance >19.8 km per hour, distance >25.2 km per hour, and mechanical load (number of accelerations >3 m/s2 + number of decelerations < −3 m/s2).

Statistical Analysis
Considering that the training and weekly load management changed in relation to the team’s/coach’s philosophy, we used the entire data set to build the ML models, and then we limited the subsequent analyses to the first team only.

Validity of FI
To test the concurrent validity of the FI, the Pearson product–moment correlation coefficient (r) was calculated to establish the strength and direction of the relationship between the FI and HRex values. The following criteria were used to determine the magnitude of the

Figure 1 — Summary of the steps performed to calculate the fitness index. GS indicates game simulations; MAPE, mean absolute percentage error; PH,
physical conditioning exercises; PO, possession games; RMSE, root-mean-square error.
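
The last steps summarized in Figure 1 (subtracting the predicted from the actual drill HR, then averaging within each week) can be sketched as follows. This is a minimal illustration that assumes the model predictions are already available; the tuple-based input format, the function name, and the example values are hypothetical and are not the authors' code or data.

```python
from collections import defaultdict


def weekly_fitness_index(observations):
    """Average drill-level FI per week.

    observations: iterable of (week, actual_hr_pct, predicted_hr_pct) tuples,
    one per drill, with HR expressed as % of HRmax (hypothetical input format).
    Returns {week: mean FI}, where FI = actual - predicted for each drill.
    """
    per_week = defaultdict(list)
    for week, actual, predicted in observations:
        per_week[week].append(actual - predicted)
    return {week: sum(vals) / len(vals) for week, vals in sorted(per_week.items())}


# Made-up example values for one player.
drills = [
    (1, 82.0, 80.0),  # HR higher than predicted -> positive FI
    (1, 78.0, 79.0),
    (2, 75.0, 79.0),  # HR lower than predicted -> negative FI
    (2, 77.0, 78.0),
]
fi_by_week = weekly_fitness_index(drills)
```

Under the interpretation given in the text, a negative weekly value would point to a lower HR response than expected for the external load, and a positive value to a higher one.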

relationship: ≤.1 (trivial), .1 to .3 (small), .3 to .5 (moderate), .5 to .7 (large), .7 to .9 (very large), and ≥.9 (nearly perfect).27 To better understand which combination of drills strengthens the relationship between FI and HRex values, the Pearson correlation analysis was performed for all the indexes previously described.

Analysis of the Trend of FI and HRex Throughout the Season
A Dynamic Time Warping (DTW) algorithm was used to measure similarity and to calculate the distance between the 2 curves (FI and HRex) throughout the season. The DTW algorithm calculates the Euclidean distance between the points of the 2 series data, adds

up all the minimum distances that were stored, and the final value represents the similarity between the 2 curves.28 Lower values were interpreted as higher similarity in the trend of the 2 curves. The similarity of the 2 series data was evaluated in 3 different moments of the season: (1) preseason (week 1–week 7), (2) mid-season (week 8–week 27), and (3) end-season (week 28–week 46) to understand the behavior of the 2 curves over the course of the season. Repeated measures analysis of variance was used to analyze mean differences in the similarity in the 3 periods of the season. When significant differences were found, the least significant difference post hoc test was performed. The size of the differences was also calculated adopting partial eta squared (η2). η2 values of .01, .06, and .14 were interpreted as small, moderate, and large effect, respectively.29 The significance level was set to P ≤ .05.

Impact of Weekly Training Load on HRex and FI
The impact of weekly training load on HRex and FI was evaluated through stepwise multiple LR models. The training load data were used as independent variables, while HRex and FI were inserted in the model as dependent variables. Two different regression models were fit in the 3 different periods of the season. The F probability for variable entry was set at 0.05 and that for variable removal was set at 0.10. The significance level was set to P ≤ .05. All the analyses were performed using the Statistical Package for the Social Sciences (version 28.0).

Results
RF showed the best predictive ability (MAE = 3.8 [0.05]; RMSE = 5.11 [0.11]) compared with the other ML algorithms (Table 2), and it was selected to be used for further analyses. After the RFE process, 7 external load metrics were selected and subjected to feature importance analysis (Figure 2). Only the features presented in Figure 2 were inserted in the ML models. Investigating the validity of the FI, the index FIPO-PH exhibited the highest correlation (r = .70, P < .01) with HRex values (Figure 3). After DTW analysis, the trend of the 2 curves (FIPO-PH and HRex) appeared significantly different throughout the 3 different periods of the season (P < .01; η2 = .200). Particularly, the 2 curves showed a significantly higher similarity in the preseason compared with the mid-season and end-season (Figure 4). The results of the stepwise LR are presented in Table 3.

Table 2 Performance of the Machine-Learning Models
Machine-learning model | MAE | RMSE
Random forest regression | 3.80 (0.05) | 5.11 (0.11)
Extreme gradient boosting | 4.04 (0.02) | 5.38 (0.06)
Linear regression | 5.02 (0.09) | 7.03 (0.19)
B1 | 7.46 (0.06) | 9.93 (0.13)
B2 | 11.01 (0.08) | 11.02 (0.10)
Abbreviations: B1, first baseline model; B2, second baseline model; MAE, mean absolute error; RMSE, root-mean-squared error. Note: Bold values indicate the model that showed the best predictive ability.

Discussion
The aim of the present study was to develop and analyze an in situ FI adapted to soccer based on an ML approach. The key findings were as follows: (1) RF outperformed baseline and LR models; (2) average speed, minutes since start of the training session, and the work:rest ratio were the most important features to predict HR response; and (3) the FI was highly correlated with HRex from the SMFT, but this trend changed across the season.

Model Construction
External load data were used to predict HR responses during training drills. From the different models used, the RF technique showed the best predictive ability of the ML techniques (Table 2) and, therefore, was chosen for further analysis. Our findings are in line with recent work that used ML to develop a new locomotor

Figure 2 — Feature importance analysis.

Figure 3 — Pearson correlation between the FI and HRex values. FI indicates fitness index; GS, game simulations; HRex, heart-rate responses during a
submaximal run test; PH, physical conditioning exercises; PO, possession games.
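
The correlation analysis behind Figure 3 can be illustrated with a short Python sketch that computes the Pearson coefficient and labels its magnitude with the thresholds quoted in the Methods (trivial, small, moderate, large, very large, nearly perfect). The function names are hypothetical, and assigning exact boundary values to the higher category above .1 (so that r = .70 is labelled "very large", as in the abstract) is an assumption, because the source ranges share their end points.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


def correlation_magnitude(r):
    """Label |r| using the thresholds quoted in the text (boundary handling assumed)."""
    a = abs(r)
    if a <= 0.1:
        return "trivial"
    if a < 0.3:
        return "small"
    if a < 0.5:
        return "moderate"
    if a < 0.7:
        return "large"
    if a < 0.9:
        return "very large"
    return "nearly perfect"
```

For example, `correlation_magnitude(pearson_r(fi_values, hrex_values))` would return the label for a pair of weekly series, where `fi_values` and `hrex_values` are placeholder names for data the reader supplies.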

Figure 4 — (a) Analysis of the trend (average value of the team) of the fitness index calculated through machine-learning model, HRex, and
(b) similarity of the 2 curves calculated for each player throughout the course of the season. Significant difference versus midseason (P < .05). #Significant
difference versus end-season. HRex indicates heart-rate responses during a submaximal run test.
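
The similarity values shown in Figure 4b come from a DTW distance between the FI and HRex curves. The sketch below is a textbook dynamic-programming version of DTW for two one-dimensional series, using the absolute difference as the point-to-point distance; it is offered only as an illustration of the technique, not as the implementation used in the study. Whether and how the two series were rescaled before comparison is not stated in the source, so any rescaling is left to the reader.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences.

    The local cost is the absolute difference between points (the Euclidean
    distance in one dimension). Lower values mean more similar curves.
    """
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")
    inf = float("inf")
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # repeat b[j-1] against a new point of a
                cost[i][j - 1],      # repeat a[i-1] against a new point of b
                cost[i - 1][j - 1],  # advance both series
            )
    return cost[n][m]
```

Because the alignment may stretch either series, two curves with the same shape but a time shift can still yield a distance of zero, which is the property that makes DTW suited to comparing seasonal trends.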


Table 3 Summary of Stepwise Linear Regression Performed in the 3 Periods of the Season (Preseason, Midseason, and End-Season) Between the Weekly Training Load Parameters (Total Distance, Distance >19.8 km/h, Distance >25.2 km/h, and Mechanical Load) and Target Values (FIPO-PH and HRex)
Period | Predictor | HRex B | HRex SE B | HRex β | HRex P | FIPO-PH B | FIPO-PH SE B | FIPO-PH β | FIPO-PH P
Preseason | Total distance | 0 | 0 | −0.240 | .021 | — | — | — | —
Preseason | Distance >19.8 km/h | — | — | — | — | — | — | — | —
Preseason | Distance >25.2 km/h | — | — | — | — | — | — | — | —
Preseason | Mechanical load | — | — | — | — | — | — | — | —
Midseason | Total distance | — | — | — | — | 0 | 0 | −0.380 | .001
Midseason | Distance >19.8 km/h | — | — | — | — | — | — | — | —
Midseason | Distance >25.2 km/h | — | — | — | — | 0.005 | 0.002 | 0.328 | .001
Midseason | Mechanical load | — | — | — | — | — | — | — | —
End-season | Total distance | — | — | — | — | — | — | — | —
End-season | Distance >19.8 km/h | — | — | — | — | −0.003 | 0.001 | −0.541 | .006
End-season | Distance >25.2 km/h | — | — | — | — | 0.011 | 0.003 | 0.670 | .001
End-season | Mechanical load | — | — | — | — | — | — | — | —
Abbreviations: β, standardized beta coefficient; B, unstandardized beta coefficient; FIPO-PH, fitness index calculated through machine-learning model; HRex, heart-rate response during submaximal run test; SE, standard error.
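
The predictors in Table 3 are weekly training loads, described in the Methods as rolling 7-day sums, with mechanical load defined as the number of accelerations plus the number of decelerations. The sketch below shows one possible way to build such predictors from daily totals; it covers only the construction of the independent variables, not the stepwise regression itself. The function names are hypothetical, and including the current day in the 7-day window is an assumption, since the source does not specify the exact window alignment.

```python
def mechanical_load(n_accelerations, n_decelerations):
    """Mechanical load as defined in the text: accelerations + decelerations (counts)."""
    return n_accelerations + n_decelerations


def rolling_7day_sum(daily_values, window=7):
    """Rolling sum over the last `window` days, current day included (assumed).

    daily_values: one value per calendar day, oldest first, with 0 on rest days.
    Returns a list of the same length; early entries sum the days available so far.
    """
    totals = []
    running = 0.0
    for i, value in enumerate(daily_values):
        running += value
        if i >= window:
            running -= daily_values[i - window]  # drop the day that left the window
        totals.append(running)
    return totals


# Made-up daily total distance (m) over ten days, including two rest days.
daily_distance = [5000, 6000, 0, 4500, 7000, 3000, 0, 5500, 6500, 4000]
weekly_distance = rolling_7day_sum(daily_distance)
```

The same helper could be applied to daily distance above 19.8 km/h, distance above 25.2 km/h, and daily mechanical load to obtain the four weekly metrics listed in the table.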

efficiency index, which also found RF outperformed baseline models.30 This highlights that this type of more complex model built using RF, which is capable of capturing nonlinear relationships in high-dimensional data,25 is more suitable for this type of analysis. Previous work has demonstrated a nonlinear dose–response relationship between training load and stress markers,31 and injury.32 Our findings support the application of more complex techniques, such as RF, to better capture these relationships compared with more traditional approaches.

Among the different variables, average speed and minutes since the start of the training session were recognized as the most important features to predict HR responses during training drills (Figure 2). In addition, work:rest ratio, distance covered above 7.2 km per hour and maximum speed, and mechanical parameters, namely PlayerLoad™ and number of decelerations (<−3 m/s2), were also identified as important features to predict HR. This is perhaps not surprising given the need to use a multivariate approach to measure training load.33 Studies have previously shown that a single training load variable is unable to capture a meaningful proportion of the variance provided by multiple internal and external load measures.34,35 This adds further weight to the findings by Lacome et al5 that HR during specific soccer drills can be affected by the mechanical demand, and should therefore be considered both when assessing and planning training load. Therefore, PlayerLoad18 together with the number of decelerations can be used to quantify the load imposed on players.36 Conversely, it is worth noting that PlayerLoad has recently been criticized due to several limitations: confusion with its calculation, high correlation with the total distance covered by the players, and inability to quantify the magnitude of accelerations.37 However, Mandorino et al30 found similar training load variables to be important to their ML model of locomotor efficiency. Taken together, these findings reinforce that a focus on total distance alone is too reductive to reflect the mechanical demands of soccer and support a multivariate approach. Finally, these results also underline the importance of taking into account the minutes since the beginning of the session, as HR responses at the end of the training session may be influenced by the phenomenon known as “cardiovascular drift.”38,39 To the best of our knowledge, this is the first study integrating this parameter in the construction of the models.

Relationship Between FI and HRex
Research suggests that a disassociation between internal and external load can indicate an athlete’s adaptation to training.1,2 Therefore, an FI may highlight when there is a difference in an athlete’s expected internal load response to their respective external load, in other words, a higher or lower physiological cost than expected for their training load.1 The FI was calculated as the difference between the real HRdrill value and the HRdrill predicted by the ML model, with a decrease in FI interpreted as an improvement in fitness status, and vice versa. Our results showed a large correlation (r = .7) between the FI and the HRex measured during an SMFT. Given that a recent systematic review and meta-analysis demonstrated SMFT HRex to be a valid and reliable proxy indicator of endurance performance in team sport athletes,6 the high correlation in our study supports the use of an FI calculated using ML techniques to estimate fitness status. Particularly, PO and PH were identified as the most suitable drills to calculate the FI, as combined they exhibited the highest correlation (r = .70, P < .01) with HRex values (Figure 3). This is perhaps not surprising given GS drills likely require the greatest levels of soccer-specific mechanical demands.
Figure 5 — Relationship between FI and HRex throughout the season. FI indicates fitness index estimated through machine learning; HRex, heart rate
responses collected during a submaximal run test.
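
The quadrant reading of Figure 5 (generic fitness from HRex, soccer-specific fitness from the FI) can be expressed as a small classification helper. This is a hedged sketch: the reference values used to split each axis (for example, a team or individual average for HRex, and zero for the FI) are assumptions chosen for illustration, as is the function name. It relies only on the interpretation given in the text that a lower HR response indicates better fitness on both measures.

```python
def fitness_quadrant(hrex_pct, fi, hrex_reference, fi_reference=0.0):
    """Classify a player into one of 4 generic/specific fitness quadrants.

    hrex_pct: HR response to the submaximal run test (% HRmax).
    fi: fitness index (actual minus predicted drill HR).
    hrex_reference, fi_reference: cut-offs for each axis (hypothetical values,
    e.g. a team mean for HRex and 0 for the FI).
    Lower values than the reference are read as "fit" on that axis.
    """
    generic = "generic fit" if hrex_pct < hrex_reference else "generic unfit"
    specific = "specific fit" if fi < fi_reference else "specific unfit"
    return generic, specific


# Made-up example: low HRex but a positive FI.
label = fitness_quadrant(hrex_pct=80.0, fi=1.2, hrex_reference=85.0)
```

In this made-up example the player would fall in the "generic fit and specific unfit" quadrant, which under the authors' reading would suggest prioritizing soccer-specific conditioning.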


After DTW analysis, the 2 curves appeared very similar in the first period of the season (preseason), while they tended to dissociate in the mid- and end-season (Figure 4). Therefore, HRex could describe a general fitness status while FI a soccer-specific one. Such a demarcation has previously been described, with a general component mostly related to cardiopulmonary performance during generic types of exercise bouts, as with the SMFT, versus a soccer-specific fitness with a greater neuromuscular component, which relates to the ability to perform and repeat specific types of locomotor actions during soccer drills.5,40 This is further confirmed by the impact of weekly load on the 2 variables. Notably, HRex is influenced by cumulated total distance only at the beginning of the season, during which the greater gain in fitness is registered. The preseason training period exposes soccer players to sudden and severe increases in training load, at a time when their physical conditions are commonly lower than the remainder of the season following a period of detraining over the off-season.30,41 Conversely, FI continues to be influenced by the weekly training load over the course of the season except for the preseason period. This distinction between general and soccer-specific fitness was also hypothesized by Lacome et al.5

The correlation magnitude in the present study (moderate to large) was similar to that found by Lacome et al between a similar in situ FI and a standardized submaximal run HRex (r = .56–.70 vs r = .66, respectively). The authors suggest that a lack of perfect correlation is not a limitation but, in fact, a reflection that the 2 metrics represent distinct fitness qualities of general and soccer-specific fitness. Consequently, the authors propose these measures be visualized across 2 axes to create 4 quadrants.5 Starting from the top left quadrant and moving in a clockwise direction, these quadrants can represent (a) generic unfit and specific fit, (b) generic unfit and specific unfit, (c) generic fit and specific unfit, (d) generic fit and specific fit (Figure 5). Such analysis can guide practitioners as to whether conditioning for individual players should focus on generic running or more soccer-specific fitness. Furthermore, similarly to Lacome et al,5 this analysis can also be used to assess changes in fitness across different times of the season, most notably with the preseason conditioning period as players most commonly, in theory, shift from an unfit quadrant (b), to general fitness quadrant (c), to specific fitness quadrant (d). It is worth noting that this analysis remains hypothetical and may be limited as a simplified approach to understanding fitness but may offer a pragmatic approach to understanding player fitness with minimal additional testing.

Limitations
First, our FI was not corroborated against a soccer-specific fitness measure. As discussed, HRex is a proxy for aerobic fitness, rather than the intermittent and varied physical demands required in soccer, and therefore, our analysis could have benefitted from comparison to an intermittent fitness test, such as the 30-15 intermittent fitness test or Yo-Yo test. Second, while HR measures are important measures for internal load, erroneous data collection is common during team sport training due to compliance issues and errors caused by contact and/or the belts slipping. Moreover, the HR recorded during training drills may be influenced by additional confounding factors not accounted for in the current study. These factors, such as the noise resulting from

Practical Applications
Such analysis enables practitioners to combine HRex and the FI to pragmatically group athletes into quadrants of fitness that relate to generic (from the SMFT) and specific (from the FI) fitness, thereby inferring conditioning needs on an individual player basis. Furthermore, given the limitations associated with an SMFT, such as time demands and coach/athlete buy-in,8 consideration should be given as to whether an in situ FI alone is sufficient to track soccer-specific fitness via an invisible monitoring approach. This approach benefits from not requiring any formal testing, and it can be assessed at a much higher frequency.

Conclusions
The present study aimed to develop a fitness index (FI) based on the combination of internal and external load measures. We demonstrated that machine-learning techniques, which account for nonlinear relationships and can combine multiple variables, had better predictive ability than traditional linear metrics. The index developed is related to outcome measures from the submaximal fitness test (SMFT) (ie, heart-rate responses during the SMFT) and allows practitioners to gain insight into the fitness of their players in a less invasive manner than additional testing. However, we recommend using them in combination if possible, as they appear to provide slightly different information, specifically, soccer-specific fitness (FI) and more general fitness (SMFT).

References
1. Halson SL. Monitoring training load to understand fatigue in athletes. Sports Med. 2014;44(2):139–147. doi:10.1007/s40279-014-0253-z
2. Delaney JA, Duthie GM, Thornton HR, Pyne DB. Quantifying the relationship between internal and external work in team sports: development of a novel training efficiency index. Sci Med Footb. 2018;2(2):149–156. doi:10.1080/24733938.2018.1432885
3. Buchheit M, Cholley Y, Lambert P. Psychometric and physiological responses to a preseason competitive camp in the heat with a 6-hour time difference in elite soccer players. Int J Sports Physiol Perform. 2016;11(2):176–181. PubMed ID: 26182437 doi:10.1123/ijspp.2015-0135
4. Akubat I, Barrett S, Abt G. Integrating the internal and external training loads in soccer. Int J Sports Physiol Perform. 2014;9(3):457–462. PubMed ID: 23475154 doi:10.1123/ijspp.2012-0347
5. Lacome M, Simpson B, Broad N, Buchheit M. Monitoring players’ readiness using predicted heart-rate responses to soccer drills. Int J Sports Physiol Perform. 2018;13(10):1273–1280. PubMed ID: 29688115 doi:10.1123/ijspp.2018-0026
6. Shushan T, McLaren SJ, Buchheit M, Scott TJ, Barrett S, Lovell R. Submaximal fitness tests in team sports: a theoretical framework for evaluating physiological state. Sports Med. 2022;52(11):2605–2626. PubMed ID: 35817993 doi:10.1007/s40279-022-01712-0
7. Shushan T, Lovell R, Buchheit M, et al. Submaximal fitness test in team sports: a systematic review and meta-analysis of exercise heart rate measurement properties. Sports Med Open. 2023;9(1):21. PubMed ID: 36964427 doi:10.1186/s40798-023-00564-w
8. Shushan T, Norris D, McLaren SJ, et al. A worldwide survey on the practices and perceptions of submaximal fitness tests in team sports. Int J Sports Physiol Perform. 2023;18(7):765–779. doi:10.1123/
the presence or absence of the ball during training activities and the ijspp.2023-0004
overall stress levels of the players, have the potential to impact the 9. Carling C, Lacome M, McCall A, et al. Monitoring of post-match
assessment of the players’ fitness status. fatigue in professional soccer: welcome to the real world. Sports Med.

2018;48(12):2695–2702. PubMed ID: 29740792 doi:10.1007/s40279-018-0935-z
10. West SW, Clubb J, Torres-Ronda L, et al. More than a metric: how training load is used in elite sport for athlete management. Int J Sports Med. 2021;42(04):300–306. doi:10.1055/a-1268-8791
11. Mandorino M, Figueiredo AJ, Cima G, Tessitore A. Predictive analytic techniques to identify hidden relationships between training load, fatigue and muscle strains in young soccer players. Sports. 2021;10(1):3. PubMed ID: 35050968 doi:10.3390/sports10010003
12. Rossi A, Pappalardo L, Cintia P, Iaia FM, Fernández J, Medina D. Effective injury forecasting in soccer with GPS training data and machine learning. PLoS One. 2018;13(7):e0201264. PubMed ID: 30044858 doi:10.1371/journal.pone.0201264
13. Mandorino M, Figueiredo AJ, Cima G, Tessitore A. Analysis of relationship between training load and recovery status in adult soccer players: a machine learning approach. Int J Comput Sci Sport. 2022;21(2):1–16. doi:10.2478/ijcss-2022-0007
14. Rossi A, Perri E, Pappalardo L, Cintia P, Iaia FM. Relationship between external and internal workloads in elite soccer players: comparison between rate of perceived exertion and training load. Appl Sci. 2019;9(23):5174. doi:10.3390/app9235174
15. Rico-González M, Pino-Ortega J, Méndez A, Clemente F, Baca A. Machine learning application in soccer: a systematic review. Biol Sport. 2023;40(1):249–263. PubMed ID: 36636183 doi:10.5114/biolsport.2023.112970
16. Winter EM, Maughan RJ. Requirements for ethics approvals. J Sports Sci. 2009;27(10):985. doi:10.1080/02640410903178344
17. Helgerud J, Høydal K, Wang E, et al. Aerobic high-intensity intervals improve VO2max more than moderate training. Med Sci Sports Exerc. 2007;39(4):665–671. PubMed ID: 17414804 doi:10.1249/mss.0b013e3180304570
18. Gómez-Carmona CD, Pino-Ortega J, Sánchez-Ureña B, Ibáñez SJ, Rojas-Valverde D. Accelerometry-based external load indicators in sport: too many options, same practical outcome? Int J Environ Res Public Health. 2019;16(24):5101. PubMed ID: 31847248 doi:10.3390/ijerph16245101
19. Gómez-Carmona CD, Bastida-Castillo A, García-Rubio J, Ibáñez SJ, Pino-Ortega J. Static and dynamic reliability of WIMU PRO™ accelerometers according to anatomical placement. Proc Inst Mech Eng Part P J Sports Eng Technol. 2019;233(2):238–248. doi:10.1177/1754337118816922
20. Muñoz-López A, Granero-Gil P, Pino-Ortega J, De Hoyo M. The validity and reliability of a 5-hz GPS device for quantifying athletes’ sprints and movement demands specific to team sports. J Hum Sport Exerc. 2017;12(1):156–166. doi:10.14198/jhse.2017.121.13
21. Gomez-Carmona CD, Bastida-Castillo A, Gonzalez-Custodio A, Olcina G, Pino-Ortega J. Using an inertial device (WIMU PRO) to quantify neuromuscular load in running: reliability, convergent validity, and influence of type of surface and device location. J Strength Cond Res. 2020;34(2):365–373. PubMed ID: 31985715 doi:10.1519/JSC.0000000000003106
22. Buchheit M, Simpson BM, Lacome M. Monitoring cardiorespiratory fitness in professional soccer players: is it worth the prick? Int J Sports Physiol Perform. 2020;15(10):1437–1441. PubMed ID: 33004681 doi:10.1123/ijspp.2019-0911
23. Buchheit M. Monitoring training status with HR measures: do all roads lead to Rome? Front Physiol. 2014;5:73. PubMed ID: 24578692 doi:10.3389/fphys.2014.00073
24. Buchheit M, Simpson MB, Al Haddad H, Bourdon PC, Mendez-Villanueva A. Monitoring changes in physical performance with heart rate measures in young soccer players. Eur J Appl Physiol. 2012;112:711–723. PubMed ID: 21656232 doi:10.1007/s00421-011-2014-0
25. Kensert A, Alvarsson J, Norinder U, Spjuth O. Evaluating parameters for ligand-based modeling with random forest on sparse data sets. J Cheminformatics. 2018;10(1):1–10. doi:10.1186/s13321-018-0304-9
26. Mandorino M, Figueiredo AJ, Cima G, Tessitore A. A data mining approach to predict non-contact injuries in young soccer players. Int J Comput Sci Sport. 2021;20(2):147–163. doi:10.2478/ijcss-2021-0009
27. Hopkins W, Marshall S, Batterham A, Hanin J. Progressive statistics for studies in sports medicine and exercise science. Med Sci Sports Exerc. 2009;41(1):3. PubMed ID: 19092709 doi:10.1249/MSS.0b013e31818cb278
28. Keogh EJ, Pazzani MJ. Scaling up dynamic time warping to massive datasets. In: Principles of Data Mining and Knowledge Discovery: Third European Conference, PKDD’99, Prague, Czech Republic, September 15–18, 1999. Proceedings 3. Springer; 1999:1–11.
29. Cohen J. Statistical Power Analysis for the Behavioral Sciences. Academic Press; 2013.
30. Mandorino M, Tessitore A, Coustou S, Riboli A, Lacome M. A new approach to comparing small-sided games and soccer matches demands. Biol Sport. 2024;41(3):15–28. doi:10.5114/biolsport.2024.132989
31. Milanez VF, Ramos SP, Okuno NM, Boullosa DA, Nakamura FY. Evidence of a non-linear dose-response relationship between training load and stress markers in elite female futsal players. J Sports Sci Med. 2014;13(1):22–29. PubMed ID: 24570601
32. Bache-Mathiesen LK, Andersen TE, Dalen-Lorentsen T, Clarsen B, Fagerland MW. Not straightforward: modelling non-linearity in training load and injury research. BMJ Open Sport Exerc Med. 2021;7(3):e001119. PubMed ID: 34422292 doi:10.1136/bmjsem-2021-001119
33. Weaving D, Jones B, Till K, Abt G, Beggs C. The case for adopting a multivariate approach to optimize training load quantification in team sports. Front Physiol. 2017;8:1024. PubMed ID: 29311959 doi:10.3389/fphys.2017.01024
34. Vallance E, Sutton-Charani N, Imoussaten A, Montmain J, Perrey S. Combining internal and external-training-loads to predict non-contact injuries in soccer. Appl Sci. 2020;10(15):5261. doi:10.3390/app10155261
35. Weaving D, Jones B, Marshall P, Till K, Abt G. Multiple measures are needed to quantify training loads in professional rugby league. Int J Sports Med. 2017;38(10):735–740. PubMed ID: 28783849 doi:10.1055/s-0043-114007
36. Randers MB, Nielsen JJ, Bangsbo J, Krustrup P. Physiological response and activity profile in recreational small‐sided football: no effect of the number of players. Scand J Med Sci Sports. 2014;24:130–137. PubMed ID: 24944137 doi:10.1111/sms.12232
37. Bredt SGT, Chagas MH, Peixoto GH, Menzel HJ, de Andrade AGP. Understanding player load: meanings and limitations. J Hum Kinet. 2020;71:5. PubMed ID: 32148568
38. Coyle EF, Gonzalez-Alonso J. Cardiovascular drift during prolonged exercise: new perspectives. Exerc Sport Sci Rev. 2001;29(2):88–92. PubMed ID: 11337829
39. Zuccarelli L, Porcelli S, Rasica L, Marzorati M, Grassi B. Comparison between slow components of HR and VO2 kinetics: functional significance. Med Sci Sports Exerc. 2018;50(8):1649–1657. PubMed ID: 29570539 doi:10.1249/mss.0000000000001612
40. Verheijen R. Football periodisation. World Football Academy; 2014.
41. Malone JJ, Di Michele R, Morgans R, Burgess D, Morton JP, Drust B. Seasonal training-load quantification in elite English premier league soccer players. Int J Sports Physiol Perform. 2015;10(4):489–497. PubMed ID: 25393111 doi:10.1123/ijspp.2014-0352