WORLD CUP:
SEARCHING THE FORMULA FOR SUCCESS
Ionuț APAHIDEANU
ABSTRACT
INTRODUCTION
The research underlying this article started from a preliminary, superficial, curiosity-driven match-by-match incursion into the World Cup’s 64
games, employing 22 parameters classifiable as “ingredients of success” across match outcomes – wins, ties, and losses:
Whereas some of the results don’t really represent surprises (e.g. the crucial importance of
scoring first, especially in a final tournament, or the diminished relevance of possession), what firstly
draws attention is the low predictive power of the variables, with only four of them barely surpassing
the 50% winning threshold.
Secondly, and more importantly, the graph captures some quite intriguing facts, if not genuine
anomalies, relative to what one would normally expect when searching for explanations of success:
defensively, the team that was more active and/or more competent in tackling lost significantly more
often than it won (51.6 vs. 28.1%); physically, the team moving faster on the pitch (in both
measurements) lost more often than it won; most surprising though, tactically, indicators
of territorial domination and attack-orientation seem to have turned into negative factors. Making
more deliveries and solo runs into the attacking third per unit of own possession time brings 1.3 times
more losses than wins, whereas committing more offsides than the opposition has become the
strongest inhibitor of success (55% losses vs. 30% wins).
Corroborating such intriguing findings with other observations, either directly visual or drawn from the superficial metrics
posted during matches on the FIFA website, which likewise suggested a rather atypical tournament, prompted
us to undertake a further, in-depth investigation of the WC-participating teams as units of analysis, specifically
of how their performance is conditioned by various independent variables.
➢ physical:
1. total distance covered by team (km) standardized per conventional 90-minutes match;
2. speed zone score [hereinafter SZS] – an ordinally weighted sum of the % of time spent by the team
(weighted average, standardized per 90’ match) in five speed zones: Z1 from
0 to 7 km/h; Z2 from 7 to 15 km/h; Z3 from 15 to 20 km/h; Z4 from 20 to 25 km/h; Z5 over
25 km/h: SZS = (%TSZ1 * 1/5) + (%TSZ2 * 1/2.5) + (%TSZ3 * 1) + (%TSZ4 * 2.5) + (%TSZ5 * 5);
3. total number of sprints made by the team / standardized 90’;
4. average height of the squad;
5. % of aerial duels won (unweighted match average, for lack of data);
➢ positional:
6-10. weighted % of time spent by mathematically-modelled avg. field player (starters and subs,
weighted for all players, matches and durations, GKs excluded) in four areas of the pitch
and the derived general positional score [GPS]: a) in team’s own half (multiplier = -0.5);
b) in the opposition’s half, outside of the attacking third (multiplier = 1); c) in the attacking
third, outside of the box (multiplier = +2.5); d) in the opposition’s box;
1
With the exception of squad average height (taken from transfermarkt.de), respectively % of aerial duels won, % of
tackle success, and % of successful dribbles (match reports on skysports.com).
11. Offsides committed / std. 90’;
12. Offsides received / std. 90’;
13. Offsides committed / 10’ AOWP;
14. Offsides received / 10’ AOPP.
➢ disciplinary:
15. no. of fouls committed / 1’ AOPP;
16. severity of fouls / std. 90’ – number of yellow and red cards received per std. 90’ of match,
acc. to FIFA’s algorithm: -1 for YC; -3 for 2nd YC – indirect RC; -4 for direct RC;
17. disciplinary score [discS] = total number of fouls committed * total weighted number of
cards received / total minutes of AOPP;
➢ defensive:
18. defensive distance covered by all field players per standardized 30’ AOPP [DDT];
➢ passing-related:
28. w. avg. % possession (actual) [Poss];
29. avg. passing accuracy (%) [PAcc];
➢ offensive:
30. no. of shots delivered / std. 90’;
31. no. of shots on target [OTS] delivered / std. 90’;
32. goals for / std. 90’;
33. shot accuracy (% OTS / total shots delivered);
34. shot efficiency A (% goals / total shots delivered);
35. shot efficiency B (% goals / OTS delivered);
36. number of OTS delivered / 1’ AOWP;
37. set-pieces conversion rate [SPCR, %] - includes corner kicks and attempts towards goal from direct,
respectively indirect, free kicks (as hosted on fifa.com); working “set-pieces”
filter: ≤ 6 passes after a CK; ≤ 3 passes after a free kick2;
2
E.g. Kroos’ goal scored against SWE, considered by FIFA for some reason as originating from open play, is
subsumed here under goals from set-pieces (2 passes after the initial ball touch). The same goes for KOR’s 1st goal against DEU,
which in our opinion is still, indisputably and entirely, a corner-kick situation, etc.
38. goal-scoring diversity – Taylor-Hudson fragmentation index for three categories: a) set-
pieces + 11m; b) counterattack; c) open-play (own goals are distributed functionally and
logically across the three categories3; filter: at least 3 goals scored); this variable should
explore any relation between scoring polyvalence and success;
39. no. of shots delivered /10’ AOPP;
40. no. of deliveries into attacking third / 1’ AOWP;
41. no. of deliveries and solo runs into att. third / 1’ AOWP;
42. % of successful dribbles (not weighted by match duration, for lack of adequate data).
To the above, we added a non-metric variable of tactical type – the playing formation
dominantly used by each participating team in our own observation and understanding, and not
according to the official FIFA website.
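The formulas quoted in the variable list for SZS (variable 2), card severity and discS (variables 16 and 17), the shot ratios (variables 33 to 35) and goal-scoring diversity (variable 38) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the computation actually used for the paper: all function and parameter names are hypothetical, inputs are assumed to be already standardized as described above, and the fragmentation index is assumed to take the common form 1 - sum(p_i^2), which the text does not spell out.

```python
from typing import Dict, Mapping, Optional, Sequence

# Multipliers for speed zones Z1..Z5, as quoted for variable 2.
SZS_WEIGHTS = (1 / 5, 1 / 2.5, 1.0, 2.5, 5.0)

# Card weights as quoted for variable 16 (signs follow the text).
CARD_WEIGHTS = {"yellow": -1, "second_yellow": -3, "direct_red": -4}


def speed_zone_score(pct_time_in_zones: Sequence[float]) -> float:
    """Weighted sum of the % of time spent in speed zones Z1..Z5."""
    if len(pct_time_in_zones) != len(SZS_WEIGHTS):
        raise ValueError("expected five percentages, one per speed zone")
    return sum(p * w for p, w in zip(pct_time_in_zones, SZS_WEIGHTS))


def weighted_cards(card_counts: Mapping[str, int]) -> int:
    """Total weighted number of cards, using the weights quoted above."""
    return sum(CARD_WEIGHTS[kind] * n for kind, n in card_counts.items())


def disciplinary_score(fouls: int, card_counts: Mapping[str, int],
                       aopp_minutes: float) -> float:
    """discS = fouls committed * weighted cards / minutes of AOPP."""
    if aopp_minutes <= 0:
        raise ValueError("AOPP minutes must be positive")
    return fouls * weighted_cards(card_counts) / aopp_minutes


def shot_ratios(goals: int, on_target: int, shots: int) -> Dict[str, Optional[float]]:
    """Shot accuracy and the two efficiency measures, in %."""
    if shots <= 0:
        raise ValueError("at least one shot is required")
    return {
        "accuracy": 100 * on_target / shots,
        "efficiency_a": 100 * goals / shots,
        # Efficiency B is undefined when no shot was on target.
        "efficiency_b": 100 * goals / on_target if on_target else None,
    }


def fragmentation_index(goals_by_category: Mapping[str, int],
                        min_goals: int = 3) -> Optional[float]:
    """Assumed form 1 - sum(p_i^2); None below the 3-goal filter."""
    total = sum(goals_by_category.values())
    if total < min_goals:
        return None
    return 1 - sum((g / total) ** 2 for g in goals_by_category.values())
```

Under this assumed form, a team scoring all its goals in one category gets an index of 0, while a team splitting its goals evenly across the three categories gets the maximum of 2/3.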
Three other methodological remarks should be made at this point. Firstly, all data and results
are based on actual playing-time, ball-based, possession, not aggregate, time-based, the differences
between them being significant: on average, the actual playing time of the 64 WC matches was
55’45’’ per standardized 90’ (mode = 52’, nine matches); in minutes/minutes, actual possession
accounts for various % of total possession, from 47.5 (MOR, in the match against IRN), up to 85.8
(CRC against BRA); over a series of 64 matches * 2 opposing teams, the avg. |difference| between
the two counts of % possession is 2.74% (σ = 1.89), etc.
Secondly, variables [30-37] and 39 do not include penalty kicks, our rationale being that
shooting on target from a penalty (same for scoring) does not indicate attacking competence (nor,
conversely, defending incompetence), an old adage stating that penalty kicks are missed, and not
defended.
Thirdly, before an exploratory factor analysis, our preliminary correlation matrix for the 42
variables indicated that their selection was satisfactory. Thus, aside from the deliberate cases of
multicollinearity (e.g. time spent in different areas of the pitch – GPS, or DPS derived directly from
the latter), there were only a few noteworthy correlations, none of them surprising anyhow:
- teams sitting deeper on the pitch register lower Poss and PAcc, and also deliver fewer shots per match;
- tackling aggressiveness (in terms of intention) and competence correlate positively (players at this
level tend to tackle only if they know what they are doing);
- the two measures of efficiency correlate strongly with the set-piece conversion rate, respectively with goals scored per 90’;
- recoveries and clearances are in a negative relation with the total distance covered per match (also logical – the team
runs more exactly because/when it fails to recover the ball);
- dribbling success characterizes teams sitting more advanced on the pitch and attacking (TS – opp. box), therefore it correlates negatively
with the number of yellow and red cards, etc.
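A preliminary screening of this kind can be sketched as a pairwise Pearson correlation over the team-level variables, flagging the pairs whose absolute coefficient exceeds a chosen cut-off. The sketch below is illustrative only: the function names and the 0.7 threshold are assumptions, and the text does not state which coefficient or cut-off was actually used.

```python
from itertools import combinations
from math import sqrt
from typing import List, Mapping, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / sqrt(sxx * syy)


def noteworthy_pairs(variables: Mapping[str, Sequence[float]],
                     threshold: float = 0.7) -> List[Tuple[str, str, float]]:
    """Return (name_a, name_b, r) for every pair with |r| >= threshold.

    `variables` maps a variable name to its per-team values, one value
    per team and in the same team order for every variable.
    """
    flagged = []
    for (name_a, xs), (name_b, ys) in combinations(variables.items(), 2):
        r = pearson_r(xs, ys)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged
```

Applied to the 42 variables over the 32 teams, such a routine would return the short list of strongly related pairs to inspect, including the deliberately collinear ones.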
3
E.g. Mandzukic’s own goal in the final is considered here as a set-piece-originating goal for France.
4
Such as the series comprising a mere 32 pairs on a linear regression, or the very specifics of a final tournament not
only as opposed to a championship (in the former case, the mathematically strongest teams can meet in the early
knockout stages), but also in terms of the multitude of playing styles (e.g. some teams play the offside-trap, others
prefer to defend deep, some resort to counterattacks with a high occurrence of solo runs into the attacking third, others
prefer a combinative game with more passes/deliveries, etc.).
Factors conditioning success at the 2018 World Cup. First step approach: correlation and significance:
Employing a single-factor ANOVA test (for two, respectively four groups of teams arranged ordinally according to their performance) adds
another three noteworthy variables, with DPS and OTS delivered / AOWP surpassing the significance threshold that they only bordered in the case of
regression, respectively SZS as a new entry.
Finally, based on the 28 teams (and corresponding pairs of values) remaining after filtering
out the ones with fewer than 10 offensive set-pieces over all their matches, SPCR (p = .057, adj. R2 =
.098) also becomes a relevant factor when ANOVA-tested over two groups (p = .044; F 4.45 > F crit.
4.23)5.
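For readers wishing to retrace this step, the single-factor ANOVA statistic and the adjusted R2 can be computed as in the sketch below. It is a generic textbook implementation with hypothetical function names, not the authors' own procedure, and it leaves out the look-up of the critical value and p-value. With 28 teams split into two groups, the degrees of freedom are (1, 26), which is consistent with the critical value of 4.23 quoted above at the .05 level.

```python
from typing import Sequence, Tuple


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> Tuple[float, int, int]:
    """Return (F, df_between, df_within) for a single-factor ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n_total <= k:
        raise ValueError("need at least two non-empty groups and n > k")

    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if ss_within == 0:
        raise ValueError("F is undefined when there is no within-group variance")

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def adjusted_r2(r2: float, n_obs: int, n_predictors: int = 1) -> float:
    """Adjusted R^2 for a linear regression with an intercept."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("not enough observations for the adjustment")
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)
```

The variable is then deemed significant at the chosen level when the returned F exceeds the critical value of the F distribution for the returned degrees of freedom.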
Whereas the variables at the top of the table are quite evident both in statistical terms
and in their meaning in a football game, the following addresses five elements bordering statistical
significance or falling beneath it in the strict canonical linear approach, but undoubtedly quite relevant and
explanatory in modern football, to which we add some non-numerical elements, such as the playing
system used.
5
Expectable considering the high % of goals scored at the WC from set-pieces (29.8% in our definition, second only to
open-play – 38.6%, above counterattacks – 18.1% and penalty kicks – 13.5%, the latter higher than 7% at the previous
WC due to introduction of VAR).
6
Possession hasn’t been identified among predictors of success at any of the previous four World Cups (see for instance
Rumpf et al. 2017 and Castellano et al. 2012), except for the group stage at the 2014 edition (Liu et al. 2015). Hughes
and Frank (2005) make an interesting case on the negative relation between possession and the goals/shots ratio.
interval), our operating definition of attack-orientation comprising shots delivered, deliveries into
opp. third (adjusted by PAcc), and solo runs into the same area, all per AOWP time-unit:
The tournament displayed enough variety, yet with neither any discernible relation between
a certain playing system and performance (ANOVA), nor any massive shift of preferences towards
certain formations. The relative majority of teams continued to opt for the balanced, stable 4-2-3-1
formation. The 4-3-3 seems to be on a slight decline, in both its more offensive,
possession-based, 4-V-1 version, and its more defensive 4-5-1 counterpart, in favour of two broad
alternatives: 4-4-2/4-4-1-1, usually flat (except URU’s diamond in their last three matches),
respectively various 3 centre-back formations, e.g. 3-4-2-1 (BEL, POL (twice), NIG (once)),
3-5-2/3-5-1-1 (ENG, NIG (once)), or a deep-sitting 5-3-1-1 (CRC).
Neither of the last two evolutions, both anyhow incipient and incremental, should be
over-emphasized. Thus, regarding the last category, three remarks should be made before prematurely
considering the possibility of the 3 CBs’ “comeback”: firstly, only 24 (or 25, if including BEL vs
BRA) of all the 128 formations fielded at the tournament, so less than 1/5, have had three
centre-backs; secondly, only three teams have used the system constantly (BEL, ENG, and CRC), two others
in 2/3 of their matches (NIG, POL), and another four once each (RUS, ARG, PAN, and TNS);
thirdly, filtering out ENG and BEL, when facing teams with 4 defenders, the 3
CBs formations have won once (NIG > ICE), tied twice under particular circumstances (CRC-SUI,
RUS-ESP) and lost 7 times. As for the semi-finalists, there are again other factors we consider as
better explanations of their success than the playing formation (their accessible half of table, BEL’s
lethal counters, ENG’s ruthless set-pieces conversion and physical effort, etc.).
As for the 4-4-2, a system by definition fit for counterattacking, it has indeed registered a certain
success7, but in our opinion there were two other, more general factors explaining the results better
than the formation in itself – integrity, respectively position, of the defensive block8, both of them
discussed in the following.
7
Against different formations, it registered 52.9% wins vs 29.4% losses, second only to the 4-3-3 system (55.6% wins
vs 16.7% losses).
8
The combination of these two explains, for instance, why the flat 4-4-2 teams have fared better against three central-
midfielders formations (6W, 2D, 4L) – they compensated the numerical deficit in the midfield by compactness and an
intelligent occupation of the pitch.
• Integrity of the defensive block (“ignore the ball, keep the shape”)
Confirming the findings of the preliminary match-by-match analysis, our regressions captured
recoveries and clearances as irrelevant to success, regardless of how they are measured (basically
mathematical independence from success), whereas tackles (both aggressiveness and competence)
and offside traps are actually inhibitors of success (albeit beneath sig.). Instead, the crucial
factor seems to be maintaining the shape of the defence (its compactness, in simpler terms, regardless)
at the assumed cost of not trying to recover the ball as quickly as possible, so an apparent passivity,
no high or middle pressing, and no offside traps. Such a style should be, and indeed is, reflected in our
DDT indicator, with the mentioned 4-4-2 teams clearly standing out as a group among the WC teams:
Furthermore, this prioritization wasn’t specific to one formation type or another, but rather a
general rule of the entire tournament (France being actually the team that ran least in standardized
terms when defending!), and crucially so in the knockout stages:
9
The general fatigue level preventing successful/durable counter-pressing, not to speak of the training time required to
successfully implement it, the physical effort required on counters when defending too deep, the undesirable risk of
conceding a goal on the counter if too high up on the pitch, a risk exponentially increased in usually tactical final
tournaments, etc.
Defensive position and performance at the World Cup - the golden measure:
In conclusion, this compactness at a certain distance from own goal, forcing the opposition to
keep running in search of spaces, may explain why, over all WC matches, the team with higher
possession did indeed run less than its opponent on average, as expected, but in only 57% of the
matches (33/58 valid cases)10.
Thus constructed, strongly correlated in both its dimensions and significantly so, bearing an adjusted
linear R2 of .66 (at p < 0.001), our performance score is bettered only by the reference-serving GF/GA
ratio (.68):
10
As a side note, except for our engineered DDT, no other physical performance indicator related significantly to
success at the 2018 WC, similar to findings regarding the previous World Cup (see Rumpf et al. 2017), or those
showing that technical indicators predict success better than physical ones (Carling 2013).
11
Which, on a defined set of 19 secondarily-derived variables highlighted the existence of 7 components accounting for
80+% of the total variance explained and all elements > .63 extraction values (under an expectedly KMO of only 103
given the sample size, df = 171, approx. chi-square 303.6).
Integrated defensive and offensive performance assessment of the 2018 WC participating teams. The performance score – instrument:
Obviously, aside from possibilities of refining it further12, the score has certain limitations,
most of them assumed. Thus, not only is it based on the fundamental idea of efficiency (which is
why, for instance, Russia ranks second on the attack scale), but, among the important factors it
deliberately doesn’t take into account, one can immediately notice penalty kicks (with their share of
total tournament goals jumping from 7 to 13% following the introduction of the VAR system) or the
very fundamental goalkeeping competence (which is why, for instance, Spain ranks fifth in defensive
terms – we don’t think that de Gea conceding 6 goals out of 7 shots on target over Spain’s four WC
matches makes the team less defensively strong). Additionally, and this might explain
why, in a tournament generally perceived as defensive, the corresponding dimension
accounts for only roughly 40% of the performance scale, the instrument is clearly
not adjusted either to goals conceded when they no longer really matter (say ARG’s two goals
against FRA, or CRO’s late goal against the same opponent), or to certain tournament-specific matches
without a genuine stake (say FRA vs DEN, or the memorable last minutes of POL
vs JAP), which will obviously distort the statistics of a 64-match competition. Or, as a by no means
last example at hand, our performance score, or any other imaginable alternative for that
matter, will never reach a full correlation with the success registered, also because of the very specifics
of a tournament in which, unlike a national championship for example, the two mathematically best
teams can meet in an early knockout round, which, logically, doesn’t make the one eliminated less of
a good team.
Conclusions
Shot-prowess and associated measures of accuracy and more importantly efficiency (set-pieces
included as a sub-factor) remain the strongest predictors of success, as was the general case with
previous World Cups. Additionally however, aside from another nail in the coffin of possession for
possession’s sake, which we replaced with an upgraded and clearly more explanatory form of
“meaningful possession” oriented towards attack (in a well-tempered manner however), we have
also made a series of other clear, statistically significant, additions to the list of “ingredients of
success”, from the percentage of successful dribbles to a non-linear defensive position on the pitch,
or from the imperative of keeping the defensive shape in that position to the admittedly cynical idea
of breaking the fair play rules once in a while, whenever a key-situation demands it.
In relation to these findings, aside from a set of methodological adjustments throughout the
paper (from operating with actual, effective, possession, to functional and logical separations of set-
pieces from open-play), we constructed as a methodological instrument a “performance score” as a
considerably strong predictor of performance at the 2018 FIFA World Cup. Aside from its obvious
and exemplified limitations, we plead the following as its merits: it salvages a number of clearly
relevant variables among those that, for certain and logical reasons, are not (always) correlated
individually with performance in linear approaches; it offers an encompassing integration of physical,
positional-tactical, disciplinary, passing-related, defensive and offensive variables; it offers a
satisfactory single-number expression of performance, being bettered as a predictor only by the
no-brainer GF/GA ratio, in an associated equation that explains 2/3 of the performance scale at the World
Cup; and it opens up new research paths, with the possibility of extending its applicability to both
longitudinal (i.e. WC evolutions in time) and transversal (i.e. cross-tournament) comparisons, and of
proceeding to opportune competition-specific adjustments.
Finally, from a more encompassing and not necessarily scientific perspective, we might
conclude that it has indeed been a somewhat bizarre tournament, with intriguing features such as:
the defensive aspect of the game has been extremely important, not in the sense of the
classic parking-the-bus manner, but rather as an intelligent positioning on the pitch when defending;
the percentage of goals scored from set-pieces plus penalty kicks was higher than that from
open-play; the world champion is the team that ran the least of all 32, in standardized terms, in the
defensive phase, and also deliberately ceded possession in three of their four knockout matches; the
challenger’s players have fouled their opponents the most in terms of frequency * severity; semi-finalist Belgium
committed more fouls than the opposition in five of seven matches; while the other semi-finalist, ENG,
scored three quarters of their goals from set-pieces plus penalty kicks.
12
Match-weighted versions of the avg. dribble success and aerial duels won first come to mind, as well as xG elements,
constructing a measure of compactness, considering the season’s physical burden at tournament start, etc.
Bibliography
Sources of data:
fifa.com
transfermarkt.de
skysports.com
Articles: