Applications of Trigonometry in Real Life
REAL LIFE
A dissertation submitted to
Submitted by
ASNA P A 200021032321
CBCS SEMESTER VI
DEPARTMENT OF MATHEMATICS ,
MARCH 2023
ALPHONSA COLLEGE,PALA
ARUNAPURAM
CERTIFICATE
Forwarded:
Dr.Sr.Sonia K Thomas
First and foremost, we thank GOD ALMIGHTY for all the blessings he bestowed
on us during the course of our work. We praise him for his great wisdom and guidance
throughout the endeavour. We also thank our principal.
We express our sincere gratitude to Ms Theresa J Puzhakkara, our guide, who took
utmost care in every phase of this project and helped us with valuable guidance and
support in finishing our project successfully.
We thank our parents for their support in doing our project. We are grateful to all
the members of the staff in the library of Alphonsa College, who helped us a lot in collecting
materials for the project.
We express our heartfelt thanks to all our friends and well-wishers for their keen
interest and encouragement.
ASNA P A
BIJI JOSEKUTTY
DECLARATION
ASNA P A
BIJI JOSEKUTTY
CONTENTS
• INTRODUCTION
Basketball
• CONCLUSION
• BIBLIOGRAPHY
INTRODUCTION
People often think of mathematics as being applied only in the fields of science and
engineering. Yet mathematics plays a large role in the efficiency of sports. Within every
sport, there are multiple mathematical concepts which allow athletes to compete and be
successful. Whether it is through discussing statistics or talking
tactics, deciding where players are going to be positioned on the pitch, or through the way in
which the game is scored, there is mathematics involved. Behind every shot, tackle, sprint,
kick, hit or throw, there is a mathematical idea which helps the athlete
decide why they are carrying out the skill the way they are.
Coaches constantly try to find ways to get the most out of their athletes, and sometimes
they turn to mathematics for help. This may include finding the best batting order for a team to
maximize the number of runs. Bridge, whist, chess, baseball,
football, basketball, soccer and cricket are some of the games and sports that use mathematics.
Calculus allows coaches to improve their players in training. In addition, calculus can
be used to calculate the projectile motion of a baseball. This can be used in
baseball to optimise a pitcher's throwing mechanics to maximise efficiency. Calculus can
also be used in basketball to find the arc length of a shot from the shooter's release to the
net. By finding the percentage of shots made at a certain angle, one can
find out which player will score the most baskets. In this project we discuss the
above topics in detail. From this project one will gain an awareness of the mathematics in
sports, of projectile motion and calculus, and of their applications.
CHAPTER-1
MATHEMATICS IN SPORTS
Let’s begin by looking at the throwing of a basketball. Now, we can use the equation
$$f(x) = \left[\frac{-16}{v^2 \cos^2\alpha}\right] x^2 + (\tan\alpha)\,x + h$$
It is helpful in finding the velocity at which a basketball player must throw the
ball in order for it to land perfectly in the basket. When shooting a basketball, the ball should
enter the basket at as close to a right angle as possible. For this reason, most players attempt to
shoot the ball at a 45° angle. To find the velocity at which a player would need to throw the
ball in order to make the basket, the range of the ball is determined when it is thrown at
a 45° angle:
$$\text{Range} = \frac{v^2 \sin(2\alpha)}{32}$$
Now, if a player is shooting a 3 point shot, then he is approximately 25 feet from the
basket. The graph of the range function indicates an idea of how hard the player must throw
the ball in order to make a 3 point shot.
fig 1.1
So, by solving the formula knowing that the range of the shot must be 25 feet and that
sin(2 · 45°) = 1, we have
$$25 = \frac{v^2}{32}$$
$$v^2 = 800$$
$$v \approx 28.2843$$
So in order to make the 3 point shot, the player must throw the ball at approximately
28 feet per second, 19 mph.
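The same calculation can be sketched in a few lines of Python. The helper name `launch_speed_for_range` is an illustrative assumption, not part of the original work; it simply inverts the range formula with the constant 32 ft/s² and converts the result to miles per hour.

```python
import math

def launch_speed_for_range(range_ft, alpha_deg):
    """Invert Range = v^2 * sin(2a) / 32 for v, in feet per second."""
    return math.sqrt(32 * range_ft / math.sin(math.radians(2 * alpha_deg)))

v = launch_speed_for_range(25, 45)   # 3 point shot, about 25 ft from the basket
mph = v * 3600 / 5280                # 1 mile = 5280 ft, 1 hour = 3600 s
print(f"{v:.4f} ft/s, {mph:.1f} mph")
```

Calling the helper with other distances or angles shows how quickly the required speed grows once the release angle moves away from 45°.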
When throwing and hitting a baseball, the pitcher wants to throw the ball so that he
will strike out the batter. If his throw is too high or too low then it is a ball, and the batter still has
at least three more opportunities to hit the ball. Similarly, when the batter hits the ball, he
wants to hit it so that it will be as far away from any of the other players as possible, if
not outside of the ball field itself. The players must take into consideration the speed and
height of the ball to ensure that they will throw or hit it properly. Here is the equation for
the projectile motion of a baseball:
$$f(x) = \left[\frac{-16}{v^2 \cos^2\alpha}\right] x^2 + (\tan\alpha)\,x + h$$
where all distances are measured in feet, h is the height from which the ball is
thrown, α is the angle at which the ball is thrown, v is the speed at which the ball is thrown,
and x is the distance that the ball travels. The distance that the ball will travel can be found by
using
$$y = \frac{v^2 \sin(2\alpha)}{32}$$
Now, a batter would be more concerned with the range of the ball, wanting it to
travel far enough to allow him to at least make it to first base safely. Several graphs of the
range with different α's and a fixed v and h are shown in fig 1.2.
fig 1.2
The black graph is when α = 30°, the blue graph when α = 45°, and the red graph
when α = 60°. It can be inferred from the graph that an angle of 45° will send the ball the
furthest. So, a batter would want to hit the ball as close to a 45° angle as possible, while a
pitcher, who is more concerned about the ball veering off path, would want to throw the ball
so that it would travel as close to a straight line as possible. Now, it is
approximately 420 feet from home plate to the edge of a baseball field. The batter wants to
hit the ball hard enough so that it will travel out of the field, over the approximately 7 foot
wall at the back of the outfield. If the batter hits the ball at a 40° angle and the ball is
approximately 5 feet in the air when struck, how hard must he hit the ball in order to hit a
home run?
𝑣 ≈ 118.863𝑓𝑡/𝑠𝑒𝑐
Therefore, the batter must hit the ball at approximately 118 feet per second, which is
approximately 81 mph, in order to hit a home run when he hits the ball at an angle of 40°.
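A calculation of this kind can be sketched by solving the trajectory equation for v, requiring the ball to be at wall height when x equals the distance to the wall. The function names below are illustrative assumptions, not the authors' own working, and the result depends on the sign conventions and rounding used, so it need not reproduce the quoted figure exactly; for the inputs above this sketch gives a value near 117 ft/s.

```python
import math

def required_speed(x, y, alpha_deg, h):
    """Speed v (ft/s) such that f(x) = y, where
    f(x) = -16 x^2 / (v^2 cos^2 a) + x tan a + h."""
    a = math.radians(alpha_deg)
    drop = x * math.tan(a) + h - y   # how far below the straight launch line the target lies
    if drop <= 0:
        raise ValueError("target lies on or above the straight launch line")
    return math.sqrt(16 * x**2 / (math.cos(a)**2 * drop))

def height_at(x, v, alpha_deg, h):
    """Height of the ball (ft) after travelling x feet horizontally."""
    a = math.radians(alpha_deg)
    return -16 * x**2 / (v**2 * math.cos(a)**2) + math.tan(a) * x + h

v = required_speed(x=420, y=7, alpha_deg=40, h=5)
print(f"{v:.1f} ft/s, {v * 3600 / 5280:.1f} mph")
```

Substituting the returned speed back into `height_at` recovers the 7 foot wall height, which is a quick way to check the algebra.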
Many people consider bowling to be quite simplistic. However, you must consider
the angle of the ball and the velocity with which the ball is thrown when trying to get a strike.
The path of a bowling ball, thrown in a straight line, can be represented by the following
equation:
$$f(t) = \left[\frac{v}{r}\right]\left(1 - e^{-(r \cdot t)}\right)$$
where v is the velocity of the ball, t is the time in seconds that the ball travels, r is a
constant that represents the friction, and f(t) is the distance in feet that the ball travels after t
seconds.
Now, the length of a bowling lane is approximately 60 feet. Let's say that the
friction constant for the bowling ball on the slick surface of the bowling lane is approximately
0.3 and the ball is rolled at approximately 15 mph, or 22 feet per second. The equation can be
graphed as
fig 1.3
From the graph, it is understood that the bowling ball, if thrown at 15 mph, should
make it all the way down the bowling lane.
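The conclusion can also be checked numerically. The sketch below evaluates f(t) = (v/r)(1 − e^(−rt)) with the values from the text (v = 22 ft/s, r = 0.3); the function names are illustrative assumptions. Because f(t) approaches v/r as t grows, the ball reaches the pins only if v/r exceeds the lane length.

```python
import math

def distance(t, v=22.0, r=0.3):
    """Distance in feet after t seconds: f(t) = (v / r) * (1 - e^(-r t))."""
    return (v / r) * (1 - math.exp(-r * t))

def time_to_reach(d, v=22.0, r=0.3):
    """Time in seconds needed to cover d feet, or None if the ball stops short."""
    if d >= v / r:                 # v / r is the limiting distance for large t
        return None
    return -math.log(1 - d * r / v) / r

limit = 22.0 / 0.3
t_pins = time_to_reach(60.0)
print(f"limiting distance = {limit:.1f} ft, time to 60 ft = {t_pins:.2f} s")
```

Under this model the limiting distance is a little over 73 feet, so a 60 foot lane is within reach, which agrees with the reading of the graph.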
♦♦♦
CHAPTER-2
Mathematics plays an important role in the field of sports. Coaches, athletes, trainers
often use mathematics to gain a competitive advantage over their counterparts. With statistics
of games, statistics of players, and probabilities of winning or losing games, mathematics is
everywhere. Applications of calculus in sports are endless.
According to this theory, to win a running race under 291 meters, the optimum
strategy is to sprint at 100% acceleration for the entire 291 meters. Races above 291 meters
require a different strategy to optimize performance.
$$\frac{dv}{dt} + \frac{v}{\tau} = f(t)$$
$$\frac{dE}{dt} = \sigma - f v$$
where E represents the runner’s energy supply, which has a finite initial value E0, and
is replenished at a constant rate σ. In spite of this replenishment, the energy supply reaches
zero at the end of the race. τ, σ, E0 and F are found by comparing the optimal race times.
CALCULUS IN BASEBALL
In baseball, calculus can be used to optimize the pitcher’s throw to achieve maximum
efficiency. Also, calculus can be used to calculate the projectile motion of baseball’s
trajectory and to predict if runners can make it to the next base on time given their running
speed and the speed of a hit ball.
The work done W on a moving ball from a position s0 to s1 is equal to the change
in the ball’s kinetic energy. The kinetic energy K of a baseball of mass m and velocity v is given
by
$$K = \frac{1}{2} m v^2$$
2.2.2 FINDING THE AVERAGE FORCE ON THE BAT DURING THE COLLISION
The collision of ball and bat is quite complex, and models of it are discussed in
detail in Robert Adair's book, The Physics of Baseball.
fig 2.3
The above image shows an overhead view of the position of a baseball bat, shown
every fiftieth of a second during a typical swing. We can calculate the average force on the
bat during this collision by first calculating the change in the ball’s momentum.
It is known that the momentum p of an object is the product of its mass m and its
velocity v, that is, 𝑝 = 𝑚𝑣. Suppose an object, moving along a straight line, is acted on by a
force F = F(t) which is a continuous function of time t.
$$p(t_1) - p(t_0) = \int_{t_0}^{t_1} F(t)\,dt$$
Using the above formula, one can find the average force on the bat during the
collision, F = ma, where a = ∆v / t.
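A short numerical sketch may make this concrete. The mass, speeds and contact time below are hypothetical illustration values (they are not taken from the original text), chosen only to show how the change in momentum divided by the contact time gives the average force.

```python
def average_force(mass, v_before, v_after, contact_time):
    """Average force in newtons: change in momentum divided by contact time."""
    delta_p = mass * (v_after - v_before)   # p = m v, so delta p = m * delta v
    return delta_p / contact_time

# Hypothetical values: a 0.145 kg ball arrives at 40 m/s (taken as the negative
# direction), leaves at 50 m/s in the opposite direction, after 1 ms of contact.
f_avg = average_force(mass=0.145, v_before=-40.0, v_after=50.0, contact_time=0.001)
print(f"average force on the ball: {f_avg:.0f} N")
```

The value computed is the force on the ball; by Newton's third law the bat experiences a force of the same size in the opposite direction.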
CALCULUS IN BASKETBALL
Calculus can be used in basketball to find the exact arc length of a shot from the
shooter’s hands to the basket. The moment the basketball is released from the shooter’s
hands, its travelling path creates an arc all the way to the net.
Using the angle of release and strength of the release, one can mathematically
predict the travelling path and the length of the arc. While the ball is in the air, it is affected
by only one force, which is gravity.
The travel path of a basketball can be divided into two components, the horizontal
(x) direction and the vertical (y) direction. These two components can be represented by the
following parametric equations:
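For horizontal, $$x(t) = x_0 + v_0 \cos(\theta)\, t$$
For vertical, $$y(t) = y_0 + v_0 \sin(\theta)\, t + \frac{1}{2} g t^2$$
where x0 and y0 give the position from which the ball is released, v0 is the initial speed,
θ is the angle of release, t is the time, and g is the acceleration due to gravity
(−9.81 m/s², negative because it acts downward).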
The derivatives of x(t) and y(t) with respect to time t are:
$$\frac{dx}{dt} = v_0 \cos(\theta)$$
$$\frac{dy}{dt} = v_0 \sin(\theta) - 9.81\, t$$
Now, the distance travelled by the basketball can be found using the arc length
equation
$$L = \int_{\alpha}^{\beta} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\; dt, \quad \alpha \le t \le \beta$$
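Now, by inserting the derivatives of x(t) and y(t) in the arc length equation:
$$L = \int_{\alpha}^{\beta} \sqrt{(v_0 \cos(\theta))^2 + (v_0 \sin(\theta) - 9.81\, t)^2}\; dt$$
$$L = \int_{\alpha}^{\beta} \sqrt{v_0^2 \cos^2(\theta) + v_0^2 \sin^2(\theta) - 19.62\, t\, v_0 \sin(\theta) + 96.24\, t^2}\; dt$$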
Example: If the average velocity of a basketball throw is 2.24 m/s, the angle of release is 45°,
and the time t required for the ball to travel is about 2 seconds, then the arc length
can be calculated using the above formula:
$$L = \int_0^2 \sqrt{(2.24)^2 - 19.62 \cdot t \cdot 2.24 \sin(45^\circ) + 96.24\, t^2}\; dt = 17.34 \text{ m}$$
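The integral in this example has no convenient elementary shortcut, so it is natural to evaluate it numerically. The sketch below uses composite Simpson's rule; the function name `arc_length` and the number of subintervals are illustrative choices, not part of the original work.

```python
import math

def arc_length(v0, theta_deg, t_end, g=9.81, n=1000):
    """Approximate the integral of sqrt((dx/dt)^2 + (dy/dt)^2) over [0, t_end]
    with composite Simpson's rule (n must be even)."""
    theta = math.radians(theta_deg)
    vx = v0 * math.cos(theta)            # horizontal speed is constant

    def speed(t):
        vy = v0 * math.sin(theta) - g * t
        return math.hypot(vx, vy)

    h = t_end / n
    total = speed(0.0) + speed(t_end)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * speed(i * h)
    return total * h / 3

L = arc_length(v0=2.24, theta_deg=45, t_end=2.0)
print(f"{L:.2f} m")
```

With the values of the example the result agrees with the 17.34 m quoted above to two decimal places.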
fig 2.5
The above figure shows different angles and entry points of a basketball into a
basketball hoop. The diameter of the hoop ring is 18 inches. As the basketball is
smaller than the hoop ring, there is always a hoop margin. Hoop margin is the
amount of space left in the hoop ring after the basketball enters it.
Free throws, jump shots, and three-pointers enter at an angle that gives an oval
entrance to the hoop. This changes the given hoop margin. Apparent hoop size is the apparent
opening of the hoop to the ball. So, the flatter the arc of the throw, the smaller the ellipse of the hoop
ring.
The apparent hoop margin is the apparent hoop size minus the basketball’s
diameter. A basketball can be thrown in different ways and at different angles. So, the apparent
hoop margin varies with each shot.
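The idea of an apparent hoop margin can be illustrated with a small sketch. Two assumptions are made here that do not come from the original text: the apparent hoop size is modelled as the hoop diameter multiplied by the sine of the entry angle (the short axis of the ellipse seen along the ball's path), and the ball diameter is taken as 9.4 inches.

```python
import math

HOOP_DIAMETER_IN = 18.0   # from the text
BALL_DIAMETER_IN = 9.4    # assumed value, for illustration only

def apparent_hoop_size(entry_angle_deg, hoop=HOOP_DIAMETER_IN):
    """Short axis of the ellipse seen by the ball: hoop diameter * sin(entry angle)."""
    return hoop * math.sin(math.radians(entry_angle_deg))

def apparent_hoop_margin(entry_angle_deg, ball=BALL_DIAMETER_IN):
    """Apparent hoop size minus the ball's diameter (negative means no clean entry)."""
    return apparent_hoop_size(entry_angle_deg) - ball

for angle in (90, 60, 45, 30):
    print(angle, round(apparent_hoop_size(angle), 2), round(apparent_hoop_margin(angle), 2))
```

Under these assumptions the margin shrinks as the entry angle flattens and becomes negative for very flat shots, which matches the qualitative statement that a flatter arc gives a smaller ellipse.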
$$f(x) = \left[\frac{-16}{v_0^2 \cos^2\alpha}\right] x^2 + (\tan\alpha)\,x + h$$
where x is the distance the ball travels.
$$\text{Range} = \frac{v_0^2 \sin(2\alpha)}{32}$$
Once the range and the angle of throw α are known, the velocity required for
the throw can be calculated using the above formulae.
The application of calculus in sports does not end with running, baseball and
basketball. Calculus can be applied to any physical sport to optimize performance.
♦♦♦
CHAPTER-3
MODELS
Predictive factors of sports injuries are biological variables and the relations
between them that can be indicators for creating a health profile or diagnosis. For example,
weight can be a predictive factor of diabetes, arteriosclerosis, and other metabolic illnesses. It
is even more useful when associated with height, BMI, and waist-hip ratio since it can then
be used in predicting hypertension, myocardial infarction, diabetes, and strokes. In order to
effectively predict health complications, the WHO recommends using anthropometry to
monitor risk factors of chronic diseases and to perform studies that define the association
between the aforementioned factors and specific outcomes, such as arterial hypertension.
Predicting factors of sports injuries can be grouped into two types of factors: Intrinsic factors
and extrinsic factors.
Extrinsic factors
Sports injuries are most commonly caused by poor training methods; structural
abnormalities; weakness in muscles, tendons, ligaments; and unsafe exercising environments.
The most common cause of injury is poor training. For example, muscles need 48 hours to
recover after a workout. Increasing exercise intensity too quickly and not stopping when pain
develops while exercising also causes injury.
Intrinsic factors
Everyone’s bone architecture is a little different, and almost all of us have one or
two weak points where the arrangement of bone and muscle leaves us prone to injury. There
is an increase in the occurrence of injuries to the locomotor system of children and adolescents
when they try to perform more ambitiously in hopes of improving their short-term
performance. As age and competition level increase, so does the risk of injury.
Common predisposing factors in injuries to the ankles, legs, knees, and hips include:
bilateral weight and structural symmetry, quadriceps and calf girth, patella alta (a kneecap
that’s higher than usual), Q-angle of the knee (high Q angle: kneecap displaced to one side, as
with knock knees), forefoot varus, rear foot valgus, true and apparent leg length, uneven leg
length, excessive pronation (flat feet), cavus foot (over-high arches), and bowlegged or
knock-knee alignment.
(a) Uneven leg length may lead to awkward running and increases the chance of injury,
but many people with equal-length legs suffer the same effects by running on tilted running
tracks or along the side of a road that is higher in the centre. The hip of the leg that strikes the
higher surface will suffer more strain.
(b) Pronation is the inward rolling of the foot after the heel strikes the ground, before
the weight is shifted forward to the ball of the foot. By rolling inwards, the foot spreads the
shock of impact with the ground. If it rolls too easily, however, it can place uneven stress on
muscles and ligaments higher in the leg.
While an overly flexible ankle and foot can cause excessive pronation, a too-
rigid ankle will cause the effects of cavus foot. Although the arch of the foot itself may be
normal, it appears very high because the foot doesn’t flatten inward when weight is placed
on it.
Such feet are poor shock absorbers and increase the risk of fractures higher in the legs.
Bowlegs or knock knees add extra stress through knees and ankles over time, and may make
ankle sprains more likely. Other structural conditions that make sports injuries more common
include lumbar lordosis. Overuse injuries are caused by repeated, microscopic injuries to a
part of the body. Many long distance runners experience overuse injuries even after years of
running. For road runners, the surface is hard and sometimes uneven, and the running
movements are repetitive. In addition, there are usually both up- and downhill elements, and
these increase the stress on tendons and muscles in the lower leg. These conditions can lead to running
injuries, so use footwear that doesn’t allow side-to-side movement of the heel and that
adequately cushions the foot.
1) To estimate the relation between two variables while taking the presence of other factors
into account
2) To construct a model that allows for the prediction of the value of the dependent variable
(in logistic regression, the probability of success) for specific values of a predicted group of
variables
The benefit of logistic regression no doubt comes from its capacity to analyse
clinical and epidemiological research data. The primary objective of this technique
is to model how the presence or absence of diverse factors, and their values,
influence the probability of the occurrence of an event, which is typically dichotomic. This technique
can also be used to estimate the probability of the occurrence of an event with more than two
categories. These sorts of situations are approached using regression techniques. Nonetheless,
linear regression methodology is not applicable since the outcome variable only takes two
values, such as the presence/absence of a knee sprain, or the presence/absence of injury. If we
classify the value of the outcome variable as 0 when the event does not occur (the absence
of a knee sprain) and as 1 when it does occur (the athlete sprains his or her knee), and we look to
calculate the possible relation between the occurrence of a sprained knee and, for example,
the difference in the thickness of both thighs (considered a possible risk factor), this relation could be
estimated using a linear regression:
$$\text{knee sprain} = a + b\,(\text{difference in thigh thickness})$$
We could then, based on the data, estimate the coefficients a and b of the equation through the
usual procedure of least squares. However, although this is mathematically possible, we
arrive at nonsensical results; upon calculating the resulting equation for different values of
thigh thickness, we will obtain results that generally differ from 0 and 1, while the only
results actually possible in this case are 0 and 1. Since this restriction is not imposed in linear
regression, the outcome can theoretically take on any value. If p is taken as the
probability that an athlete suffers a knee sprain, the following quantity can be built:
$$\ln\frac{p}{1-p}$$
As this is a variable that can take any value, a traditional regression equation can be
proposed in order to find that value:
$$\ln\frac{p}{1-p} = a + b\,(\text{difference in thigh thickness})$$
Solving for p gives
$$\text{Injury probability} = \frac{1}{1 + \exp[-a - b\,(\text{difference in thigh thickness})]}$$
And this is exactly the kind of equation known as a logistic model, where the
number of factors can be greater than one. Therefore, in the denominator exponent,
One of the factors that make logistic regression so interesting is the relation that
logistic model coefficients have with a risk quantification parameter known in the field as
an “odds ratio”. The odds associated with an event are the quotient of the probability that it
occurs and the probability that it does not occur:
$$\text{Odds} = \frac{p}{1-p}$$
with p being the probability of occurrence. Therefore, we can calculate the odds of an injury
occurrence when the difference in thigh thickness is equal to or greater than a specific
quantity, which indicates how much more probable it is that an injury occurs than that it does
not occur in this situation. Likewise, we can calculate the odds of an injury occurrence when the
difference in thigh thickness is less than that same figure. Dividing the first odds by the second
gives an odds quotient, or odds ratio, which quantifies how much more probable
the occurrence of an injury is when the difference in thickness is greater than a specific figure
(first odds) relative to when the difference in thickness is less. The notion being measured is
similar to what we find in the relative risk, which corresponds to the quotient of the probability that an
injury occurs when a specific factor is present (difference in thickness) and the probability when it
is not. In fact, when the prevalence of the event is low (<20 %), the odds
ratio and the relative risk are very similar; but such is not the case when the occurrence of the
event is quite common, a fact that is often ignored.
If there is a dichotomic factor in the regression equation, for example whether or not the subject
is a jumper, the b coefficient of the equation for this factor is directly related to the odds
ratio OR of being a jumper compared to not being one:
$$OR = \exp(b)$$
where exp(b) is a measurement that quantifies the risk presented when the corresponding
factor is present compared to when it is not, assuming that the rest of the model’s variables
remain constant.
When the variable is numerical, for example, age or body mass index, it is a
measurement that quantifies the change in risk when a variable changes its value while the
rest of the variables remain constant. Thus, the odds ratio for moving from age X1
to age X2, with b being the coefficient that corresponds to age in the logistic model, is:
$$OR = \exp[b\,(X_2 - X_1)]$$
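In a logistic model the odds ratio between two values X1 and X2 of a numerical variable equals exp[b(X2 − X1)]. The sketch below checks this numerically by comparing that expression with the ratio of the two odds computed from the model's probabilities. The intercept and age coefficient are hypothetical, chosen only for illustration.

```python
import math

def odds(p):
    """Odds of an event with probability p."""
    return p / (1.0 - p)

def probability(x, a, b):
    """Logistic model probability for predictor value x."""
    return 1.0 / (1.0 + math.exp(-(a + b * x)))

# Hypothetical intercept and age coefficient, for illustration only.
a, b = -3.0, 0.05
x1, x2 = 20, 30

or_from_coefficient = math.exp(b * (x2 - x1))
or_from_odds = odds(probability(x2, a, b)) / odds(probability(x1, a, b))
print(round(or_from_coefficient, 4), round(or_from_odds, 4))
```

Both routes give the same number, and the intercept a cancels out, which is why the coefficient alone carries the risk information for that variable.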
Given that the methodology employed for calculations with the logistic model is
based on using quantitative variables, the same way as in any other regression process, it is
incorrect to use qualitative variables directly in regression processes, whether they are nominal or
ordinal. Assigning a number to each category does not solve the problem. Suppose, for example, that the
physical exercise variable has three possible answers: sedentary, sporadically performing
exercise, frequently performing exercise; and we assign the values 0, 1, 2, respectively, to
these answers. Then performing frequent exercise has twice the value of performing
exercise sporadically, which makes little sense. The situation is worse still when the variable,
for example civil status, does not have any ordering relation among the outputs.
The solution to this problem is to create as many dichotomic variables as the
number of outputs minus one.
These new variables, artificially created, are called “dummy”, or indicator, internal,
or design variables. Therefore, if the variable in question produces exposure data with the
following outputs: Never ran, Ex-runner, Runs less than 10 kilometers per day, Runs 10 or
more kilometers per day, we have 4 possible answers from which we will construct 3
dichotomic internal variables (values 0,1) with different possibilities for codification that lead
to different interpretations. The most frequent of which is the following:
                               I1   I2   I3
Never ran                       0    0    0
Ex-runner                       1    0    0
Runs less than 10 km per day    0    1    0
Runs 10 or more km per day      0    0    1
In this type of codification the regression equation’s coefficient for each design
variable (always transformed with the exponential function), corresponds to the odds ratio for
this category given the reference level (the first output). In our example, it quantifies how the
risk changes given the situation of never having run. There are other possibilities, among
which we will highlight an example with a qualitative variable and three outputs:
            I1   I2
Output 1     0    0
Output 2     1    0
Output 3     1    1

            I1   I2
Output 1    -1   -1
Output 2     1    0
Output 3     0    1
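Returning to the four running categories, the reference-level coding with 0/1 indicator variables can be generated programmatically. This is a minimal sketch in plain Python; the function name `dummy_code` is an illustrative assumption, and the first category is treated as the reference level, so a variable with k categories yields k − 1 indicators.

```python
CATEGORIES = [
    "Never ran",                      # reference level: all indicators are 0
    "Ex-runner",
    "Runs less than 10 km per day",
    "Runs 10 or more km per day",
]

def dummy_code(value, categories=CATEGORIES):
    """Return k-1 indicator values (0 or 1) for a variable with k categories."""
    if value not in categories:
        raise ValueError(f"unknown category: {value!r}")
    return [1 if value == level else 0 for level in categories[1:]]

for level in CATEGORIES:
    print(f"{level:30s} {dummy_code(level)}")
```

Each non-reference category switches on exactly one indicator, so the exponential of that indicator's coefficient is read as the odds ratio of the category relative to never having run.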
Term   Coefficient   Standard Error   Chi²   P   Interpretation
Variable   Odds ratio   OR< 95%   OR >95%
Goodness of fit
In the case of logistic regression, a rather intuitive idea is to calculate the probability
of an event, the occurrence of an injury or knee sprain in our case, for all athletes in the
sample. If the goodness of fit is acceptable, one would expect a high probability value to be
associated with the presence of an injury, and vice versa: if the calculated probability value is
low, one would likewise expect the absence of injury. This intuitive idea is formally realized
through the Hosmer-Lemeshow test, which basically consists of dividing the range of
probability into deciles of risk (which would be injury probability 0.1, 0.2, and so forth up to 1) and
calculating the distribution of both injured and uninjured athletes, as
predicted by the equation and as actually observed. These distributions, both calculated and
observed, are compared with each other through a chi² test. In the final presentation of logistic
regression data, a goodness of fit test should be included as well as a commented conclusion
drawn from the same test. The Hosmer-Lemeshow test would then be more illustrative
than the mere distribution values obtained.
Despite the fact that accidents are unavoidable in sports, injury prediction and
prevention is a practical aspect of sports medicine considered to be the best treatment.
Regression models encompass mathematical techniques that deal with measuring the relation
between an outcome variable and predictive variables. When the outcome variable is
continuous, the preferred model is linear regression. However, when the outcome variable
is dichotomic (injured/not injured) and the object of study is the relation between this and one
or more predictive variables (right Q angle, left Q angle, the difference in thigh thickness,
lower limb dissymmetry, age, sex, hours of training, kilometers run, etc...) the chosen
regression model is a simple logistic regression model (for one factor) or a multiple logistic
regression model (for more than one factor). Therefore, the logistic regression analysis
technique is used when it is suspected that one of the values of specific categorical variables
depends on a series of predictive or independent variables, along with the goal of finding a
mathematical function that expresses such a relation.
When the goal is to calculate the relation or association between two variables, the
regression models allow for the consideration that there may be other factors that affect this
relation. So, if the possible relation between lower limb dissymmetry and the probability of
suffering a knee injury is being studied as a risk factor, that relation can be different if other
variables are taken into account such as age, sex, or body mass index. Because of this, these
factors could be included in a logistic regression model as independent variables in addition
to dissymmetry. The other variables, in addition to the factor of interest (in this example age,
sex and BMI), are called by several names: control variables, external variables, covariates, or
confounding variables.
Interaction
When the relation between the factor being studied and the dependent variable is
modified by the value of a third variable, we are then dealing with interaction. In our
example, we assume that the probability of suffering a sports injury increases with age when
there is lower limb dissymmetry.
In this case there is an interaction between the variables Age and
Dissymmetry.
One of the first considerations we must take into account is that the relation
between the independent variable and the event probability must not change direction. If it
does, the logistic model doesn’t work for us. A very clear example of this situation arises
when we evaluate the probability of an athlete’s sports injuries in relation to the age when he
or she first began sports competitions. Up to a certain age, the probability of injury can increase the
earlier the athlete began competing. And beyond a mature age, the
likelihood of injury also increases the older the athlete is when he or she begins competing. In
this case, a logistic model would be inadequate.
Collinearity
Another problem that may arise in regression models, and not only logistic models,
is that the variables involved may be correlated, which would lead us to a nonsensical model
and therefore to some values of the coefficients that cannot be interpreted. This situation,
with correlated independent variables, is called collinearity.
In order to understand it, consider an extreme case in which the same variable is
introduced in the model twice. Then the exponential term of the model becomes
exp(-b0 - b1 * X - b2 * X), or
exp[-b0 - (b1 + b2) * X],
where the sum b1 + b2 can be split into two addends in infinitely many ways,
and therefore the values obtained for b1 and b2 don’t
make sense.
An example of this situation could be given if we include variables such as the
length of the lower limbs and the length of the calves in the equation, two variables that are
closely correlated.
Sample size
Model selection
When talking about models that can be multivariable, an interesting topic is how to
choose the best set of independent variables to include in the model. The definition of the
“best” model depends on the type and objective of the study. In a case where something will
be predicted, the best model would be one that produces the most reliable predictions. And in
a case where the relation between two variables is being calculated (correcting the effect of
other variables), the best model will be one that obtains the most precise calculation of the
coefficient of the variable in question.
Types of differences
Number of variables
One must consider the maximum model, or the maximum number of independent
variables that can be included in the equation, while taking their interactions into account
when appropriate. Although there are different processes for choosing a model, there are only
three basic mechanisms for doing so. The first is to start with only one independent variable and, one by
one, add more according to pre-established criteria (forward process). The second is to
start with the maximum model and eliminate the variables one by one according to pre-established
criteria (backward process).
The third method, called “stepwise”, combines the two previous mechanisms: in
each step, a variable already present in the equation can be eliminated or another can be
added. In the case of logistic regression, the criterion for deciding at each step whether we should choose a new
model or stay with the current one is established by the logarithm of the models’
likelihood ratio.
It is useful to use control techniques to evaluate the fit of the results obtained with
the mathematical equations defined in the logistic regression analysis. The results should be
analysed in all studied subjects, for the studied group of athletes in question, and for a control
group of both sexes, differentiating the success rate by sex.
Sensitivity
Proportion of injured subjects that the equation predicted would
be injured.
$$\text{Sensitivity}\,(Sn) = \frac{\text{True positive}}{\text{True positive} + \text{False negative}}$$
$$Sn = P[T+ \text{ if } D+]$$
$$Sn = \frac{TP}{TP + FN}$$
Specificity
Think of specificity as 1 minus the false positive rate. Notice that the denominator for
specificity is the number of healthy players. Using conditional probabilities, we can also
define specificity as:
Sp = P[Test is negative if Patient is healthy]
$$Sp = P[T- \text{ if } I-]$$
$$Sp = P[T- \text{ if } D-]$$
$$Sp = \frac{TN}{TN + FP}$$
False positives
False negatives
Proportion of injured subjects that the equation predicted would
not be injured.
In order to know the probability of whether or not a subject injures him or herself
in relation to the outcome injury ratio, we must know the positive predictive values (PPV)
and the negative predictive values (NPV) that should be defined as the following:
Positive predictive values: The probability of an athlete injuring him or herself
when injury is predicted by the equation. To calculate this we use the equation:
$$PPV = \frac{S \cdot PL}{(S \cdot PL) + (FL \cdot PNL)}$$
$$PPV = P[I+ \text{ if } T+]$$
$$PPV = \frac{TP}{TP + FP}$$
Negative predictive values: The probability that the athlete does not injure him or herself
when the model has predicted a situation of non-injury. To calculate this we use the
equation:
$$NPV = \frac{E \cdot PNL}{(E \cdot PNL) + (FN \cdot PL)}$$
POSITIVE TEST (T+) NEGATIVE TEST (T-)
𝑁𝑃𝑉 = 𝑃 [𝐼 − 𝑖𝑓 𝑇−]
𝑇𝑁
𝑁𝑃𝑉 =
(𝑇𝑁 + 𝐹𝑁)
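As an illustration, the four measures defined above can be computed directly from the counts of true and false positives and negatives. The following Python sketch is our own and is not taken from the cited studies; the counts are hypothetical, and the second function assumes that S denotes the sensitivity, E the specificity and PL the prevalence of injury, which the formulas above do not state explicitly.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from the four counts."""
    return {
        "sensitivity": tp / (tp + fn),  # Sn = TP / (TP + FN)
        "specificity": tn / (tn + fp),  # Sp = TN / (TN + FP)
        "ppv": tp / (tp + fp),          # PPV = TP / (TP + FP)
        "npv": tn / (tn + fn),          # NPV = TN / (TN + FN)
    }


def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """Bayes form of the PPV, assuming the false positive rate is 1 - specificity."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical group of 200 athletes, 50 of whom were injured.
metrics = diagnostic_metrics(tp=30, fp=10, fn=20, tn=140)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With the same hypothetical data, both forms of the PPV agree, since the prevalence is 50/200.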
The simplest model expresses the probability as a linear function of the predictive variable:
$$P = \beta_0 + \beta_1 X$$
where β0 and β1 are the model parameters and X is the predictive variable. The probability (P) is
equal to the constant β0 plus the product of the other constant β1 and the value of the
predictive variable X. The coefficient β0 is the independent or constant term: it is the average
value of the outcome variable when X = 0. The coefficient β1 is the regression coefficient and it is
interpreted as the change in the outcome variable’s average per unit increase in the
predictive variable. The change is an increase if the regression coefficient is
positive and a decrease if it is negative.
Once the model parameters have been calculated, substituting some values of the
predictive variable may yield values that are not allowed for a probability (below 0 or above 1).
This is why one should apply a transformation to the probability of showing
the characteristic in question.
This logit transformation, which consists of the logarithm of the odds $\frac{p}{1-p}$ that a
characteristic will present itself, is modelled by the following formula:
$$\log\left[\frac{p}{1-p}\right] = \beta_0 + \beta_1 X$$
The quantity $\log\left[\frac{p}{1-p}\right]$ is called logit(P).
In the logistic regression model, the coefficient β1 is the logarithm of the odds ratio
between two individuals who differ by one unit in the predictive variable, so that
O.R. = $e^{\beta_1}$. If β1 = 0, then O.R. = $e^0$ = 1, which indicates that the two variables are
independent and there is no relation between them. The estimate of β1 is called the logistic
regression coefficient.
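The relation between the logit, the predicted probability and the odds ratio can be checked numerically. The sketch below is illustrative only; the function names and the coefficient values are our own assumptions and do not come from any fitted injury model.

```python
import math


def logit(p):
    """Logarithm of the odds p / (1 - p), for 0 < p < 1."""
    return math.log(p / (1 - p))


def predicted_probability(beta0, beta1, x):
    """Invert the logit: p = 1 / (1 + exp(-(beta0 + beta1 * x)))."""
    return 1 / (1 + math.exp(-(beta0 + beta1 * x)))


def odds_ratio(beta1):
    """Odds ratio between two individuals one unit apart in x."""
    return math.exp(beta1)


# Hypothetical coefficients, chosen only to show the behaviour.
beta0, beta1 = -2.0, 0.8
for x in (0, 1, 2, 3):
    p = predicted_probability(beta0, beta1, x)
    print(f"x = {x}: p = {p:.3f}, logit(p) = {logit(p):.3f}")
print(f"odds ratio per unit of x: {odds_ratio(beta1):.3f}")
```

Whatever value of x is substituted, the predicted probability stays strictly between 0 and 1, which is the purpose of the transformation.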
If we have several predictive variables and we want to study the relation between the
outcome variable and the whole set of predictive variables simultaneously, a multiple logistic
regression model is used:
$$\log\left[\frac{p}{1-p}\right] = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p$$
where P is again the probability of presenting the characteristic in question.
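For completeness, the model can be solved for the probability itself. Writing z for the linear combination of the predictive variables (the symbol z is our own shorthand and does not appear in the sources), the steps are:

```latex
\begin{aligned}
\log\left[\frac{P}{1-P}\right] &= z
  && \text{with } z = \beta_0 + \beta_1 X_1 + \dots \text{ (all predictive terms)} \\
\frac{P}{1-P} &= e^{z} \\
P &= \frac{e^{z}}{1 + e^{z}} = \frac{1}{1 + e^{-z}}
\end{aligned}
```

Since $e^{z}$ is always positive, the resulting P lies strictly between 0 and 1 for any values of the predictive variables.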
♦♦♦
CHAPTER-4
It is not always realised that mathematics has a crucial role in most sports.
From discussing a player’s statistics, to coaches formulating tactics and drafting players, to
judges scoring a particular athlete, mathematics is always at work. For that matter, even the
possibility of a team or an athlete winning is a matter of probability. From something as
simple as using a matrix to applying formulas that help determine a player’s or team’s
statistics, mathematics is a part of the system.
Basketball –
At first glance, basketball and math seemingly have little in common. However, a
closer look at the sport reveals that there is a considerable amount of math in basketball.
Geometry in basketball
Whether they realize it or not, basketball players make use of many geometric
concepts while playing a game. The most basic of these ideas is in the dimensions of the
basketball court. The diameter of the hoop (18 in), the diameter of the ball (9.4 in), the width
of the court (50 ft.) and the length from the three point line to the hoop (19 ft.) are all
standard measures that must be adhered to in any basketball court.
The path the basketball will take once it’s shot comes down to the angle at which it
is shot, the force applied and the height of the player’s arms. When shooting from behind the
free throw line, a smaller angle is necessary to get the ball through the hoop. However, when
making a field throw, a larger angle is called for. Understanding arcs will help determine how
best to shoot the ball.
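The dependence of the shot on angle, force and release height can be sketched with the standard projectile equations. The code below is a simplified illustration of our own: it ignores air resistance and spin, treats the ball as a point, and the release height (2.1 m) and horizontal distance (4.2 m) are assumed values, not measurements; only the hoop height of 3.05 m (10 ft) is a standard figure.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def required_speed(distance, release_height, hoop_height, angle_deg):
    """Launch speed (m/s) for the ball to pass through the hoop centre.

    Solves y(x) = h0 + x*tan(a) - g*x^2 / (2*v^2*cos(a)^2) for v at x = distance.
    """
    theta = math.radians(angle_deg)
    rise = hoop_height - release_height
    denom = 2 * math.cos(theta) ** 2 * (distance * math.tan(theta) - rise)
    if denom <= 0:
        raise ValueError("angle too flat to reach the hoop")
    return math.sqrt(G * distance ** 2 / denom)


def height_at(x, speed, release_height, angle_deg):
    """Height of the ball (m) after travelling x metres horizontally."""
    theta = math.radians(angle_deg)
    drop = G * x ** 2 / (2 * speed ** 2 * math.cos(theta) ** 2)
    return release_height + x * math.tan(theta) - drop


# Assumed shot: released 2.1 m high, 4.2 m from the hoop centre.
for angle in (40, 45, 50, 55, 60):
    v = required_speed(4.2, 2.1, 3.05, angle)
    print(f"angle {angle} deg -> launch speed {v:.2f} m/s")
```

Each angle that is steep enough has its own required speed, which is why the arc and the force applied have to be chosen together.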
Basketball players understand that throwing the ball right at the basket will not help
it go into the hoop. On the other hand, shooting the ball in an arc will increase its chances of
falling through the hoop. Getting the arc right is important to ensure that the ball does not fall
in the wrong place. The best height to dribble can also be determined mathematically.
Understanding geometry is also important for good defense. This will help predict the
player’s moves, and also determine how to face the player. Mathematics can also be used to
decide how to stand while going on defense: the more you bend your knees, the quicker you
can move. Through geometry, mathematics plays a crucial role in the actual playing of
the sport.
Statistics in basketball
fig 4.1
Statistics is essential for analysing a game of basketball. For players, statistics can
be used to determine individual strengths and weaknesses. For spectators, statistics is used to
determine the value of players and analyse the performance of an individual or the entire
team. Percentages are a common way of comparing players’ performances. They are used to get
values like the rebound rate, which is the percentage of missed shots a player rebounds while
on the court. Statistics is also used to rank a player based on the number of shots, steals and
assists made during a game. Averages are used to get values like the points per game average, and
ratios are used to get values like the turnover to assist ratio.
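The three kinds of value mentioned above (a percentage, an average and a ratio) reduce to simple divisions. The sketch below uses our own function names and invented season totals, purely to show the arithmetic.

```python
def points_per_game(total_points, games_played):
    """Average: total points divided by games played."""
    return total_points / games_played


def rebound_rate(rebounds, missed_shots_on_court):
    """Percentage of the missed shots a player rebounds while on the court."""
    return 100 * rebounds / missed_shots_on_court


def turnover_to_assist_ratio(turnovers, assists):
    """Ratio of turnovers to assists."""
    return turnovers / assists


# Invented totals for one player.
print(points_per_game(450, 30))
print(rebound_rate(12, 80))
print(turnover_to_assist_ratio(40, 100))
```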
Baseball –
fig 4.2
Baseball is a game that lends itself very well to all kinds of statistics and math
calculations. Here are a few important baseball numbers and how they are calculated.
Batting Average (BA)
This is perhaps the most commonly calculated statistic. You might hear more about
home runs or strike-outs, but those are really just totals. You will hear and see the
Batting Average for every player that comes up to the plate. Basically, the Average is the
number of times a batter hits the ball safely (reaching base – not the number of
non-injury plays) divided by the number of times he comes to the plate. Sounds easy, right?
Just divide the total safe hits by the number of plate appearances and you have the Batting
Average.
$$\text{Batting average} = \frac{\text{Hits}}{\text{At-bats}}$$
Well, yes, but not quite. That is basically true, but there are a couple of things that
change those two basic numbers. The number of plate appearances isn’t exactly the number
used, and you have to be sure you understand what counts as a safe hit. The number used as
the number of plate appearances to be used in the Batting Average calculation is called an At-
bat. Take a walk, for example, (no don’t leave – just use the ‘walk’ as an example), if the
batter walks, it does not count as an At-bat. Also, if a batter gets hit by a pitch, it is not
counted as an At-bat. If the batter hits the ball and gets out, but advances a baserunner, it is
called a Sacrifice and is not counted as an At-bat. Other instances not counted in the At-bat
total are when a batter gets to go to 1st base because of an obstruction or interference call, if
the batter is still batting and a baserunner is called out for some reason to end the inning, or if
he gets replaced during
his plate appearance (there are specific rules for 2 strikes, but that is too much detail for
us…). An At-bat is counted whenever the batter hits the ball and reaches safely, or is safe due
to an error on the play, or when the batter is called out for any reason after the ball is put into
play, or on a fielder’s choice (where the batter is safe but another baserunner is called out,
like a force-out at second). Out of all of those possible outcomes, only actual hits (singles,
doubles, triples, home runs) are counted in the Hits total – even if the batter makes it safely
on base for some other reason. So, now that you know all of that, the calculation is easy…
Batting Average = Hits/At-bats, and is rounded to the third decimal place. The
number is said as if multiplied by 1000, so a hitter that had 30 hits out of 100 At-bats
(30/100 = .300) is said to be “batting three-hundred”, which is quite good at the Major League
level. The closer you get to .300 and above, the more likely you are a well-known star. Hitters
near .200 are not doing so well. The highest BA ever through an entire season was .406 (Ted
Williams of the Red Sox in 1941) and the highest career average is .366 (Ty Cobb from
1905-1928). Among active players, Joe Mauer has the highest career BA with .323.
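The calculation just described can be sketched in a few lines of Python. The function names are our own, and the At-bat count is deliberately simplified: it subtracts only walks, hit-by-pitch and sacrifices from plate appearances, ignoring the rarer exclusions listed above.

```python
def at_bats(plate_appearances, walks, hit_by_pitch, sacrifices):
    """Simplified At-bat count: plate appearances minus the common exclusions."""
    return plate_appearances - walks - hit_by_pitch - sacrifices


def batting_average(hits, at_bat_count):
    """Hits divided by At-bats, rounded to the third decimal place."""
    return round(hits / at_bat_count, 3)


def format_average(value):
    """Write the average in the usual baseball style, e.g. '.300'."""
    return f"{value:.3f}".lstrip("0")


ab = at_bats(plate_appearances=120, walks=15, hit_by_pitch=2, sacrifices=3)
ba = batting_average(30, ab)
print(ab, format_average(ba))
```

Here the 30 hits in 100 At-bats from the example above give the familiar ".300".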
On-base percentage (OBP) is similar to the Batting Average, but includes more. This
number gives a better idea of how often a hitter reaches base – which is a useful statistic for
deciding who the lead-off batter should be, since you want them on base as much as possible.
The OBP includes not only hits (H), but walks (or Base on Balls, BB) and the number of times the
batter is hit by a pitch (HBP). The sum of these is divided by the total At-bats (AB) plus BB plus
HBP plus Sacrifice Flies (SF).
$$\text{On-Base Percentage} = \frac{\text{Hits} + BB + HBP}{AB + BB + HBP + SF}$$
The career leader in OBP is Ted Williams with an OBP of .4817 and the highest
single season OBP was Barry Bonds with .6094 in 2004. Any hitter with OBP of .350 or
more is doing pretty well.
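As a quick check of the OBP formula, the following sketch evaluates it for an invented season line. The function name and the numbers are ours and describe no real player.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    times_on_base = hits + walks + hit_by_pitch
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    return times_on_base / opportunities


# Invented season line: 30 hits, 12 walks, 2 HBP, 100 At-bats, 1 sacrifice fly.
obp = on_base_percentage(30, 12, 2, 100, 1)
print(f"{obp:.3f}")
```

The same hitter has a Batting Average of .300, so the OBP is higher because walks and hit-by-pitch are credited as well.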
Slugging Percentage (SLG)
The Slugging Percentage of a hitter tells you how many bases the hitter generates
per At-bat. It is simply the total of the bases gained by way of hits, divided by the number
of At-bats (see above to know what counts as an At-bat). The total bases count is pretty
straightforward. A single is 1 base, a double is 2, a triple is 3, and a home run is 4. Multiply
each of these by the number a hitter has of each, add them all together, and divide by the AB.
This measurement is useful for power hitters since they may not have as high of a
batting average, but the hits they get are usually for extra bases. This number is a good way to
compare their production, even if their BA is lower than desired.
The theoretical maximum is 4.000, if a hitter hits a home run every At-bat – which
may happen the first time they ever come to the plate as a Major Leaguer, but it doesn’t take
long to drop from there. The highest for a season is .8634 by Barry Bonds in 2001. The
highest over a career (at least more than a single AB, that is) is Babe Ruth with .6897. A
really good power hitter would have a SLG of .500 or more.
Using the above numbers, baseball analysts determined that the OBP and the
SLG combine to give a pretty good idea of the hitter’s overall production in a way that
neither of the values does individually. The simplest combination turned out to be adding
the two values together, and this became known as On Base Plus Slugging, or OPS.
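Both the SLG and the OPS can be expressed compactly in code. The sketch below follows the description above; the function names and the sample totals are our own.

```python
def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases (1, 2, 3 and 4 per kind of hit) divided by At-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def on_base_plus_slugging(obp, slg):
    """OPS is the plain sum of the two percentages."""
    return obp + slg


# Invented totals: 20 singles, 6 doubles, 1 triple, 3 home runs in 100 At-bats.
slg = slugging_percentage(20, 6, 1, 3, 100)
print(f"SLG {slg:.3f}, OPS {on_base_plus_slugging(0.350, slg):.3f}")
```

A hitter with a single home run in a single At-bat reaches the theoretical maximum of 4.000.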
For pitchers, the most frequently referenced calculated statistic is the Earned Run Average
(ERA). This is the average number of earned runs allowed per 9 innings. Any run that is the
result of a defensive error is not included in the ERA, so the basic calculation is the number
of earned runs divided by the number of innings pitched, multiplied by 9.
$$\text{Earned Run Average} = \frac{\text{Earned Runs}}{\text{Innings Pitched}} \times 9$$
This shows the average runs a pitcher would give up if they pitched the entire 9
innings and there were no errors on defense. Most pitchers throw far fewer than 9 innings
per outing, and so this is a good way to compare them despite their different inning counts.
Really good pitchers will have an ERA below 4.00. While career ERA is important, the ERA
is usually more meaningful for the season as it shows how well the pitcher is currently
keeping other teams from scoring.
The active pitcher with the lowest career ERA is Mariano Rivera with 2.215.
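The ERA formula is easy to evaluate once innings are expressed as a number. Because an inning consists of three outs, the sketch below (our own, with invented figures) converts outs to innings before applying the formula.

```python
def innings_from_outs(outs_recorded):
    """Innings pitched as a number: three outs make one inning."""
    return outs_recorded / 3


def earned_run_average(earned_runs, innings_pitched):
    """Earned runs allowed per 9 innings."""
    return 9 * earned_runs / innings_pitched


# Invented outing: 2 earned runs while recording 18 outs (6 innings).
print(earned_run_average(2, innings_from_outs(18)))
# Invented season: 20 earned runs in 60 innings.
print(earned_run_average(20, 60))
```

Scaling to 9 innings is what makes a short relief outing and a long start comparable.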
$$\text{Walks and Hits per Innings Pitched (WHIP)} = \frac{BB + H}{\text{Innings Pitched}}$$
Mariano Rivera is also the active leader in WHIP at 1.0026 and the top 50 active
pitchers are under 1.35. This metric has much smaller differences between pitchers, but if you
think about what it represents, they can be a big deal. A difference of a third of a runner
every inning means a runner every 3 innings, or 3 runners per 9-inning game. Even though the
differences in the average are small, those extra runners could make a big difference in the
outcome of a game.
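The effect of a small WHIP difference over a full game can be worked out as follows. This is an illustrative sketch with our own function names and made-up values.

```python
def whip(walks, hits, innings_pitched):
    """Walks plus hits allowed, per inning pitched."""
    return (walks + hits) / innings_pitched


def extra_runners_per_game(whip_a, whip_b, innings=9):
    """Additional baserunners pitcher A allows compared with B over a game."""
    return (whip_a - whip_b) * innings


# Made-up pitchers: one allows 4/3 runners per inning, the other exactly 1.
print(whip(20, 50, 70))
print(extra_runners_per_game(4 / 3, 1.0))
```

A gap of one third of a runner per inning thus amounts to about 3 extra runners in a 9-inning game.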
Wins (W)
So the number of Wins is a simple total, but it gets really confusing when you try to
understand what counts as a win. Only a single pitcher is credited with a win in any given
game, but many pitchers on the winning team could contribute to the victory – so how do you
decide? To get the W, you have to be the pitcher at the time your team takes a lead that it
does not give up for the remainder of the game. So a starting pitcher may pitch beautifully,
but if his reliever gives up the lead, the starter will not
get the W even if his team retakes the lead. If the starter doesn’t complete 5 innings, he
cannot get the W no matter what, and the official scorer determines which reliever was most
effective and he is given the W.
The overall leader in Wins is Cy Young with 511. Pitchers pitch much less often
than they did when Cy Young played, so his Win total may never be surpassed. The current
active Win leaders are Andy Pettitte with 255 and Tim Hudson with 205. Winning 300 games in a
career is a very big deal, and 20 wins in a single season is a fantastic mark.
We can’t ignore the defense – there aren’t a lot of things to measure, but fielders do
have Fielding Percentage. This is the number of putouts and assists divided by the total
number of chances. A putout is making an out – like catching a fly ball or touching a base on
a force out. An assist is throwing the ball to another player who gets the putout. If the player
makes a mistake doing either of these two things, they can be assigned an error. The errors
are part of the total number of chances.
$$\text{Fielding Percentage} = \frac{\text{Putouts} + \text{Assists}}{\text{Putouts} + \text{Assists} + \text{Errors}}$$
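The Fielding Percentage formula can be evaluated in the same way. The function name and the totals below are our own and are only meant to show the arithmetic.

```python
def fielding_percentage(putouts, assists, errors):
    """Successful chances (putouts + assists) divided by total chances."""
    total_chances = putouts + assists + errors
    return (putouts + assists) / total_chances


# Invented season: 200 putouts, 90 assists and 10 errors.
print(f"{fielding_percentage(200, 90, 10):.3f}")
```

A fielder with no errors has a Fielding Percentage of exactly 1.000.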
♦♦♦
CONCLUSION
As has been observed, there really exists a very close relationship between sports
and maths: the playing of all sports has been found to apply mathematical principles
such as calculus, arithmetic, geometry and percentages. When researching the math behind sports,
we found that there is a multitude of formulas behind the simple actions in sports
such as basketball and baseball. To be successful in these sports, one must make baskets
and hit the ball in a certain way. The point of this exploration was to delve into the math behind
these sports, and see what formulas occur during a game. We chose this topic because we
love sports.
Calculus is a part of mathematics with various real-life applications, and we have
observed numerous applications of calculus in different types of sports, where it plays an
important role. Baseball is one sport in which calculus is applied.
Athletes, trainers, and coaches often use calculus to gain an advantage over their
counterparts. Calculus can be used to calculate the projectile motion of a baseball’s
trajectory and the speed of the baseball when hit, and to predict whether runners can make it to
the next base in time, given their running speed. Sports injuries affecting the lower extremities
in high-impact sports, such as athletics or basketball, can be predicted by means of logistic
regression equations. The first injury score was described by Shambaugh in 1991, using imbalance in
bilateral weight and deviation of the Q-angle of the quadriceps as predictive variables.
Salazar (2000) developed a mathematical equation to predict lesions based on Shambaugh’s
score and constructed through logistic regression analysis, while Fernández (2004) introduced
thigh thickness as a relevant variable in the prediction of injuries, leading to a more
precise equation. From these investigations, we observed that logistic regression analysis can
be a valid method for discriminating among anthropometric parameters related to sports
injuries, providing a simple and reliable method that could be used in the routine practice of
sports medicine.
Sport and maths are very different activities, but some aspects of the mindset
required to be successful in maths or sport can certainly help us to achieve success in the
other. Mathematics plays an essential role in sport at all levels, whether it be through human
intelligence or through the use of technology to monitor working levels. As technology and
techniques continue to evolve, the data available and performance analysis can only improve
further. Mathematics is everywhere, from daily life to sports. When we sit down to watch
our favourite sports star or team, we should recognize the behind-the-scenes role that
mathematics plays in bringing these events to us and making it possible to have fair,
competitive and efficient sports events. This project gives us a brief insight into the world of
mathematics and how it influences the world of sports.
BIBLIOGRAPHY
1) https://digitash.com/engineering/mathematics/how-to-apply-calculus-in-sports/
2) http://jwilson.coe.uga.edu/EMAT6680/Huffman/Mathematics%20in%20Sports/MathematicsSports.html
4) https://hillarydoshi.blogspot.com/2021/03/application-of-mathematics-in-sports.html?m=1
5) http://www.makemathagame.com/everyday_math/baseball-math/