Research Paper
Examining the Relationship between Positional Spending in the NFL and Success Metrics
Introduction
position is the “most valuable position” in the sport (Hughes, Koedel, & Price, 2015), and it is
often assumed to be the most valuable position in professional team sports. Beyond the
quarterback discussion, however, there is considerable debate about which positions teams should
value most highly. The topic is especially relevant now that the salary cap has dropped to
$182.5 million because of the pandemic, down from $198.2 million (Belson, 2021), although we
expect it to rise as the industry recovers economically. Teams in the NFL are beginning to adopt
more analytical thought processes after seeing the success of “Moneyball” teams in baseball like
the Tampa Bay Rays and the Oakland Athletics. Personnel departments should use modern analytical
tools to examine all of the available data. Determining the value of different positions in
relation to team success would show which position groups are most valuable.
As Patrick Schilling shared in his article, “The Art of Positional Spending in the NFL”,
“If a team is unable to balance their pay throughout their roster to excel in all three phases of the
game, it’s highly unlikely that they will be able to lift the Lombardi Trophy at the end of the
season.” He then goes on to mention the 2017 Philadelphia Eagles, Super Bowl LII champions,
noting how they ranked only 26th in the NFL at quarterback spending. According to Schilling,
the Eagles did a good job that season of spreading their limited cap space across position groups,
allowing them to be strong throughout the roster rather than having the cap space heavily spent
The purpose of this research is to test for statistically significant relationships between NFL
positional group salaries and different metrics of success. This will be achieved through regression model
analysis and correlation studies to see whether there is a statistically significant link between
The main stakeholders in this research are NFL personnel departments and owners.
Personnel departments are the primary stakeholders, as they are best placed to use the
information that comes from this study to value players based on position. Owners are
secondary stakeholders, as their main desire should be to get as much value as possible
from the salary spent on players. Another potential stakeholder is the gambling community,
as both the oddsmakers and the bettors
themselves use every advantage possible, including analytical data, in order to guarantee the
The main research question of this study is: “What is
the relationship between positional spending in the NFL and both yards per play and points per
game?” These two metrics of success will give this study the best chance to project whether
Method
For my research into the relationship between positional spending in the NFL and success
on the field, I will use two main types of data: salary data and on-field performance data.
Salary data will be collected from Spotrac.com, which has tracked positional spending by NFL
franchises back to the 2015 season. I had initially thought that I would be able to find data
further back in time, but I feel that the 6 seasons of data from 2015 to 2020 will give my
research a large enough sample size. To compute yards per play from scrimmage, one of my
metrics for success, I will have to use two different sources, NFL.com and ESPN.com, because
each site publishes only one of the two figures the calculation requires: NFL.com publishes
plays from scrimmage and ESPN provides total yardage. When it comes to position groups, I will
split them by major group as Spotrac already does. This will leave me with 9 position groups
to analyze: quarterbacks, running backs, wide receivers, tight ends, offensive linemen,
defensive linemen, linebackers, defensive backs, and specialists. Ideally I
would like to split these groups even further, but for the case of this research it will be sufficient,
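The yards-per-play calculation described above can be sketched as follows. This is only an illustration: the team labels and totals are made up, and the dictionary layout is an assumption rather than the format in which either site publishes its data.

```python
# Hypothetical season totals keyed by (team, season); all values are made up.
plays_from_scrimmage = {("TEAM_A", 2020): 1000, ("TEAM_B", 2020): 950}  # plays source
total_yards = {("TEAM_A", 2020): 6000, ("TEAM_B", 2020): 4750}          # yardage source


def yards_per_play(yards, plays):
    """Join the two sources on (team, season) and divide total yards by plays."""
    result = {}
    for key in yards.keys() & plays.keys():  # only team-seasons present in both sources
        if plays[key] > 0:                   # skip rows that would divide by zero
            result[key] = yards[key] / plays[key]
    return result


ypp = yards_per_play(total_yards, plays_from_scrimmage)
for key in sorted(ypp):
    print(key, round(ypp[key], 2))
```

Joining on the team and season together keeps a team's totals from one year from being paired with its plays from another.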
Prior to running the correlation analysis, I will check the normality of the DVs for my
research by creating a histogram of each of the DVs that will be used in this study (yards per
play and points per game). This histogram will show us the distribution curve of the DVs,
To determine whether there is a statistically significant link between NFL positional spending
and team success, I will run a correlation analysis between yards per play and points per
game, which will be the dependent variables for this test, and NFL team spending by
position group, which will be my independent variables. All of these variables are continuous
and, more specifically, ratio variables. Running a correlation analysis before moving on to a
regression model will show whether there is a significant link between these variables and
will give an early indication of what the regression model will show. For the correlation
analysis, the null hypothesis will be that there is no statistically significant link between
yards per play/points per game and positional spending by NFL teams. The alternative
hypothesis will therefore be that there is a statistically significant link between yards per
play/points per game and positional spending. The correlation analysis will also be run to
check for multicollinearity among the independent variables, by examining whether any of the
independent variables are significantly correlated with one another.
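As a rough illustration of the correlation step, the sketch below computes a Pearson correlation, the t-statistic commonly used to test whether a correlation differs from zero, and the pairwise correlations among the independent variables as a multicollinearity check. The spending figures (in millions of dollars) and points-per-game values are made up, and the function names are hypothetical. The choice of Pearson's r is an assumption, since the paper does not name the coefficient used.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t-statistic for H0: no correlation, with n - 2 degrees of freedom."""
    if abs(r) >= 1.0:
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1 - r * r))


def pairwise_correlations(columns):
    """Correlation for every pair of independent variables (multicollinearity check)."""
    names = list(columns)
    return {
        (a, b): pearson_r(columns[a], columns[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Made-up spending (in $ millions) and points per game for six team-seasons.
spending = {
    "QB": [10.0, 25.0, 18.0, 30.0, 8.0, 22.0],
    "OL": [35.0, 28.0, 40.0, 33.0, 30.0, 45.0],
    "WR": [20.0, 15.0, 25.0, 18.0, 22.0, 12.0],
}
points_per_game = [19.5, 24.0, 23.5, 27.0, 18.0, 26.5]

dv_correlations = {pos: pearson_r(vals, points_per_game) for pos, vals in spending.items()}
iv_correlations = pairwise_correlations(spending)
```

The t-statistic would then be compared against a t distribution with n - 2 degrees of freedom to obtain a p-value. That lookup is left out here because the standard library does not provide it.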
After running the correlation analysis to see whether there is a link between the variables,
a multiple linear regression model will be run to see what the links are and how the
independent variables affect the dependent variables. The first metric examined after
running the regression will be the R-square, which tells us the share of the total variance in
the DVs that is explained by the IVs. Running a regression model is imperative to finding a
solid answer to my research question because, assuming we reject the null hypothesis of the
correlation analysis, the coefficients provided by the regression model will show the size and
direction of the link between success and positional spending. Finally, we will check the
normality of the residuals from our regression models by creating a histogram of them.
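The regression step can be sketched in a few lines of plain Python. This is a minimal ordinary least squares fit through the normal equations, not the software used for this study. The data are synthetic and were built from known coefficients (intercept 2, slopes 3 and -1) so that the output can be checked by eye. The function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no perfectly collinear predictors).
    """
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(predictors, y):
    """Return (coefficients with intercept first, R-square, residuals)."""
    rows = [[1.0] + list(r) for r in predictors]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_residual = sum(e * e for e in residuals)
    return beta, 1.0 - ss_residual / ss_total, residuals


# Synthetic data: y = 2 + 3 * x1 - 1 * x2 exactly, so R-square should be 1.
predictors = [(1.0, 2.0), (2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 8.0)]
outcome = [3.0, 7.0, 6.0, 11.0, 9.0]

coefficients, r_square, residuals = fit_ols(predictors, outcome)
print(coefficients, r_square)
```

With real data the residuals would not be zero, and binning them into a histogram is the normality check described above.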
I will first report major descriptive statistics such as the mean, median, standard deviation, and
variance in my results section. When the multiple linear regression model is run, I will use the
p-values to determine whether the results of my regression model are statistically
significant. In the results section, I will explain the significance of the regression results.
Using the coefficients of the model in tandem with the descriptive statistics gathered from
the dataset, I hope to determine which position groups are more valuable for NFL teams to
invest in and, perhaps more importantly, to identify where NFL teams should spend in order to
get greater value out of their spending.
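The descriptive statistics named above can be computed with Python's standard `statistics` module, as in the short sketch below. The points-per-game values are made up. Using the sample (n - 1) versions of the standard deviation and variance is an assumption, since the paper does not say which version was reported.

```python
import statistics

# Made-up points-per-game values for five team-seasons.
points_per_game = [18.0, 21.0, 23.0, 26.0, 27.0]


def describe(values):
    """Mean, median, sample standard deviation, and sample variance."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),      # sample standard deviation (n - 1)
        "variance": statistics.variance(values),  # sample variance (n - 1)
    }


summary = describe(points_per_game)
print(summary)
```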
Results
First we looked at basic descriptive statistics for our independent and
dependent variables. For our dependent variables (see Tables 1-2), we found that the average for
points per game, both scored and allowed, was 23.05, with a standard deviation of 4.26 for points
scored and 3.54 for points allowed. For yards per play, we see an average of 5.48 yards both
gained and allowed per play, with a standard deviation of 0.47 for yards gained and 0.41 for
yards allowed. These differences in standard deviation suggest that NFL offenses may vary more
in their success metrics than their counterparts on defense, but more research would be needed
to determine that. When looking at the independent variables of positional spending (Tables 3-4),
we see large variance and
standard deviation numbers for all position groups, showing that there is a wide range of figures
Table 1
Table 3
Table 4
(charts 5-8), with the histogram showing normal distribution for both variables on offense and
defense. The concern shown from these histograms is in the histogram of points per game, in
Chart 5
Chart 6
Chart 7
Chart 8
After checking the normality of the dependent variables, a correlation analysis was run
to determine the relationship between our independent variables and our dependent
variables (Tables 9-10). In our correlation analysis of NFL offenses, we see that both of our
dependent variables have weak positive relationships with QB spending and OL spending, with
the other position groups having negligible correlation. On the basis of these weak
relationships, we reject the null hypothesis that there is no statistically significant
relationship between positional spending and yards per play/points per game and accept the
alternative hypothesis. On defense, however, we see negligible negative correlations between all
of our dependent variables and independent variables. For both offense and defense we do not
After the correlation analysis we moved on to the multiple linear regression models, with
two models each for offense and defense, one using yards per play as the DV and one using points
per game (see Tables 11-14). All four of our regression models show a significance value for
the F-statistic of less than 0.05, so all of our models as a whole are statistically
significant, and we reject the null hypothesis that there is no link between the dependent
variable and positional spending by NFL teams and accept the alternative hypothesis. The
R-square values for both offensive models show that approximately 21% of the variance in the
dependent variables is explained by positional spending. On defense, however, positional
spending explains only 10% of the variance in points per game and 5% of the variance in yards
per play allowed. The individual p-values in our models show that RB and TE spending are not
statistically significant in our yards per play model on offense, and TE spending is likewise
not statistically significant in our offensive PPG regression. On defense, DL and DB spending
are not statistically significant in our YPP model. All other independent variables have a
p-value of less than 0.05 and are statistically significant in our models. The coefficients of
each regression model show how much we would expect the dependent variable to change with
positional spending in each of our independent
variables. For example, in our regression model of offensive PPG, we expect to see an additional
Table 11 (Offense YPP Regression)
R Square    0.2170326
            Coefficients    P-value
Intercept   4.7706655       4.758E-93
QB          1.379E-08       8.038E-05
RB/FB       7.521E-09       0.3825386
WR          9.685E-09       0.0169592
TE          -4.45E-09       0.5884548
OL          1.641E-08       1.196E-05
Table 12 (Offense PPG Regression)
R Square    0.2183195
            Coefficients    P-value
Intercept   15.898148       1.075E-32
QB          1.216E-07       0.0001289
RB/FB       1.62E-07        0.0394717
WR          8.7E-08         0.0183193
TE          7.013E-08       0.3487209
OL          1.3E-07         0.0001245
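To make the coefficients in Table 12 easier to read, the sketch below converts each one into the expected change in points per game for a hypothetical $10 million increase in spending at that position, with spending at the other positions held fixed. It assumes that spending was entered in raw dollars, which the size of the coefficients suggests but the tables do not state. The $10 million figure is chosen only for illustration.

```python
# Coefficients copied from Table 12 (offense PPG regression).
# Assumption: spending is measured in raw dollars, so each value is points per game per dollar.
ppg_coefficients = {
    "QB": 1.216e-07,
    "RB/FB": 1.62e-07,
    "WR": 8.7e-08,
    "TE": 7.013e-08,  # not statistically significant at the 0.05 level in Table 12
    "OL": 1.3e-07,
}


def expected_change(coefficient, extra_spending):
    """Predicted change in the dependent variable for a given change in spending."""
    return coefficient * extra_spending


extra = 10_000_000  # hypothetical $10 million increase at one position group
for position, coefficient in ppg_coefficients.items():
    print(f"{position}: {expected_change(coefficient, extra):+.2f} points per game")
```

Under that assumption, the QB coefficient of 1.216E-07 multiplied by $10,000,000 gives roughly 1.22 additional points per game.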
For a final check of our regression models, we looked at the distribution of the
residuals from our models (see Charts 15-18). In each of these histograms we see a normal
distribution curve, which is what we would expect to see. This is important because it is a
validity test for our regression models. If we were to see an abnormal distribution, it would tell us
Chart 16
Chart 17
Chart 18
Discussion
After going through the results of my regression models, one of the clearest findings was the
strength of spending on QB and OL in both the Points per Game and Yards per Play models,
especially in comparison to the other offensive positions. This finding supports the common
opinion that these are the most important positions to spend on (Schilling). Across the
league, NFL teams on average follow the guidance that this model would suggest, with QB and OL
having the two highest average salaries spent per position group on offense. Our model also
suggests that WR is the most significant skill position group to spend on, more so than TE or
RB/FB. I would hypothesize that this is due to the increased reliance of NFL offenses on
passing during the years included in the model.
When we move over to defenses, one of the more eye-catching results from the two
regression models was the lack of statistical significance of the DL and DB position groups
in the defensive YPP model. I was surprised by this result, as many teams in the NFL are
putting less focus on the LB position, which was the only position group found to be
statistically significant in that same defensive YPP model. Fundamentally, however, we saw
only a minimal link between positional spending by NFL teams on defense and metrics of team
success. I would hypothesize that this is because defenses are fundamentally less
individually focused than offenses, which build around individual players like QBs.
In my eyes, one of the major practical takeaways from this research is that positional
spending does not explain a larger share of the variance. While positional spending as a
whole explained approximately 21% of the variance in our dependent variables, not an
insignificant value, the vast majority of this comes from the QB position along with the OL.
The minimal impact of the skill positions on the variance of our dependent variables was my
most surprising result, as we intuitively think of skill positions as
having major impacts on the game, although overall the good and bad may even out. This value
of positions is something I would encourage other researchers to investigate, looking the total
spending is incredibly marginal, even in a league like the NFL where these types of margins
matter. We also saw a large spread of our residuals from our regression models, especially our
PPG model, which aligns well with the marginal impact of our independent variables on our
dependent variables. Overall, I would say that there is minimal reason for NFL teams to spend
resources to optimize positional spending, as my opinion after undertaking this research is that
Overall, I would say that this research has changed my thoughts on positional spending in
some ways while solidifying them in others. Finding that QB spending had a sizable impact on
success wasn’t surprising to me, but finding that spending on a position like TE did not was
not what I had expected. Going forward, one topic I would like to see explored is whether
individual salaries at these positions make a difference, and whether most big-money
contracts equate to value and win shares on the field. One major limitation of my study was that
it only encompassed the 5 years from 2016-2020. While I feel that this was a large enough
sample size to draw conclusions from the data, especially about the current NFL landscape, I
would encourage researchers to look at how position value has changed over time, especially
as the NFL has transitioned from a run-first league to a pass-first one. I would also
suggest that future researchers examine whether the significance that the
regression models found at the WR position has held over time, or whether in the past a
position like RB was more significant, especially further in the past when NFL offenses were
Citations:
Hughes, A., Koedel, C., & Price, J. A. (2015). Positional WAR in the National Football League.
Belson, K. (2021, March 10). N.F.L. salary cap drops to $182.5 million for 2021. The New York
https://www.nytimes.com/2021/03/10/sports/football/nfl-salary-cap.html
Schilling, P. (n.d.). It's not all about the quarterback. Samford University. Retrieved October 5,
Spending-Part-1.
Assunção, R., & Pelechrinisis, K. (n.d.). Sports analytics in the era of big data ... -
https://www.liebertpub.com/doi/full/10.1089/big.2018.29028.edi.
ESPN Internet Ventures. (n.d.). 2021 NFL Team Total Offense Stats. ESPN. Retrieved October
19, 2021, from https://www.espn.com/nfl/stats/team.
NFL 2021 Reg - Offense Passing Stats. NFL.com. (n.d.). Retrieved October 19, 2021, from
https://www.nfl.com/stats/team-stats/.
NFL positional payrolls. Spotrac.com. (n.d.). Retrieved October 19, 2021, from
https://www.spotrac.com/nfl/positional/breakdown/.
SPAD 637 Sport Analytics Project Rubric

Introduction (20 Points). Grade: 16/20
Poor: Difficult to understand the issue being investigated. No clear purpose statement provided. Minimal discussion of relevant stakeholder groups. No RQs provided.
Satisfactory: An issue is presented, but no context provided. No indication of study purpose. Mentions some stakeholder groups. RQs either not provided or not worded appropriately.
Good: Explains the issue but does not place it within a larger context. Purpose statement provided. Stakeholder groups listed. RQs present, but wording needs improvement.
Excellent: Clearly explains the issue within context. Well-written purpose statement. Thorough analysis of relevant stakeholder groups. Appropriately worded RQs.

Method (20 Points). Grade: 18/20
Poor: Does not mention IVs and DVs used for analysis. Very surface level explanation of data collection strategy. No indication of data analysis techniques/strategy.
Satisfactory: Identifies IVs and DVs, but does not explain what types of variables they are. Data collection strategy is not explained in detail. No coherent indication of data analysis techniques/strategy.
Good: Fully explains main IVs and DVs. Good explanation of data collection strategy. Indicates use of descriptive stats and inferential tests but does not explain analysis technique/strategy in detail.
Excellent: Fully explains main IVs and DVs. Detailed explanation of data collection strategy. Appropriate identification of descriptive statistics, inferential tests, and assumption checks to perform on the data.

Results (20 Points). Grade: 17/20
Poor: No indication of descriptive stats or inferential test results. No use of tables and figures to represent data.
Satisfactory: Does not provide descriptive stats. Inferential tests do not match what was outlined in Method. Very little use of tables and figures to represent data.
Good: Provides appropriate descriptive stats and inferential test results as outlined in Method section. Some use of tables and figures to represent data.
Excellent: Provides appropriate descriptive stats and inferential test results as outlined in Method section. Excellent use of tables and figures to represent data.

Discussion (20 Points). Grade: 18/20
Poor: Almost no actual discussion of results. No practical implications offered. Missing several of Limitations, Future Research, and/or Conclusion sections.
Satisfactory: Limited discussion of results. Maybe one practical implication provided. Possibly missing Limitations, Future Research, and/or Conclusion sections.
Good: Good discussion of results. Some practical implications provided. Well-written Limitations, Future Research, and Conclusion sections.
Excellent: Strong discussion of results. Excellent description of practical implications. Well-written Limitations, Future Research, and Conclusion sections.

Grammar, Writing, and Formatting (20 Points). Grade: 18/20
Poor: Unprofessional document presentation. No clear and consistent formatting. Many spelling and grammar errors. Very little thought given to APA formatting.
Satisfactory: Formatting lacks consistency. Obvious grammar or spelling errors. Consistent issues with APA formatting throughout.
Good: Clean, consistent formatting throughout. Some minor grammar or spelling errors. Some issues with APA formatting.
Excellent: Professional presentation. Clean, consistent formatting. No or very minor grammar or spelling errors. Appropriate APA formatting throughout.

Total: 100 points