
INTRODUCTION

Mathematics is a fascinating subject, and India has long been regarded as a country of great
mathematicians. Mathematics helps us run errands, manage our money and measure progress, and
it also helps us in sport. In recent years, with the development of technology, maths has played
an increasingly important role in sport. As technology to measure and improve performance gains
momentum, even sport cannot escape maths. From amateur athletic training to high-level
competition, similar technology is used to give athletes feedback.

People often think of mathematics as being applied only in the fields of science and
engineering. Yet mathematics plays a large role in the efficiency of sports. Within every sport,
there are multiple mathematical concepts which allow athletes to compete and be successful.
Whether it is through discussing statistics or talking tactics, deciding where players are going
to be positioned on the pitch, or through the way in which the game is scored, there is
mathematics involved. Behind every shot, tackle, sprint, kick, hit or throw, there is a
mathematical idea which helps the athlete decide why they carry out the skill the way they do.

Coaches constantly try to find ways to get the most out of their athletes, and sometimes
they turn to mathematics for help. This may include finding the best batting order for a team to
maximize the number of runs. Bridge, whist, chess, baseball, football, basketball, soccer and
cricket are some of the sports and games that use maths. Calculus can be used to improve players
in training. In addition, calculus can be used to calculate the projectile motion of a baseball.
This can be used in baseball to optimise a pitcher's throwing mechanics and maximise efficiency.
Calculus can also be used in basketball to find the arc length of the shot from the shooter's
release to the net. By finding the release angle that gives the most consistent percentage of
shots made, one can find out which player will score the most baskets. This project discusses
the above topics in detail. From it, the reader will gain an awareness of the mathematics of
sports, in particular projectile motion and calculus, and its applications.

CHAPTER-1

MATHEMATICS IN SPORTS

Mathematics plays a very important role in sports. Whether discussing a player's
statistics, a coach's formula for drafting certain players, or even a judge's score for a
particular athlete, mathematics is involved. Even concepts such as the likelihood of a particular
athlete or team winning, a mere case of probability, and the maintenance of equipment are
mathematical in nature.

Let’s begin by looking at the throwing of a basketball. Now, we can use the equation

$$f(x) = \left[\frac{-16}{v^2 \cos^2\alpha}\right] x^2 + (\tan\alpha)\,x + h$$

It is helpful in finding the velocity at which a basketball player must throw the
ball in order for it to land perfectly in the basket. When shooting a basketball, the ball should
enter the basket at as close to a right angle as possible. For this reason, most players attempt
to shoot the ball at a 45° angle. To find the velocity at which a player would need to throw the
ball in order to make the basket, the range of the ball is determined for a throw at a 45°
angle.

The formula for the range of the ball is

$$\text{Range} = \frac{v^2 \sin(2\alpha)}{32}$$

But since the angle at which the ball is thrown is 45°,

$$\text{Range} = \frac{v^2 \sin(2\alpha)}{32} = \frac{v^2 \sin(2 \cdot 45^\circ)}{32} = \frac{v^2}{32}$$

Now, if a player is shooting a 3 point shot, then he is approximately 25 feet from the
basket. The graph of the range function indicates an idea of how hard the player must throw
the ball in order to make a 3 point shot.

fig 1.1

So, by solving the formula, knowing that the range of the shot must be 25 feet, we
have

$$25 = \frac{v^2}{32}$$

$$v^2 = 800$$

$$v \approx 28.2843$$

So in order to make the 3 point shot, the player must throw the ball at approximately
28 feet per second, 19 mph.
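
The calculation above can be reproduced with a short Python sketch. It assumes the simplified range formula used in this chapter (feet and seconds, with release and landing at the same height); the function name `launch_speed_for_range` is illustrative and does not come from the source.

```python
import math

FT_PER_S_TO_MPH = 3600.0 / 5280.0  # 1 ft/s expressed in miles per hour


def launch_speed_for_range(range_ft, angle_deg):
    """Speed in ft/s needed to cover range_ft, from Range = v^2 * sin(2*alpha) / 32."""
    return math.sqrt(32.0 * range_ft / math.sin(math.radians(2 * angle_deg)))


speed = launch_speed_for_range(25, 45)
print(f"{speed:.4f} ft/s, {speed * FT_PER_S_TO_MPH:.1f} mph")
```

The same function can be called with other distances or angles, for example a shorter free throw, to see how the required speed changes.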

When throwing and hitting a baseball, the pitcher wants to throw the ball so that he
will strike out the batter. If his throw is too high or too low then it is a ball, and the batter
still has at least three more opportunities to hit the ball. Similarly, when the batter hits the
ball, he wants to hit it so that it will be as far away from any of the other players as possible,
if not outside of the ball field itself. The players must take into consideration the speed and
height of the ball to ensure that they will throw or hit it properly. Here is the equation for the
projectile motion of a baseball:

$$f(x) = \left[\frac{-16}{v^2 \cos^2\alpha}\right] x^2 + (\tan\alpha)\,x + h$$

where all distances are measured in feet, h is the height from which the ball is thrown,
α is the angle at which the ball is thrown, v is the speed at which the ball is thrown, and x is the
distance that the ball travels. The distance that the ball will travel can be found by using

$$\text{Range} = \frac{v^2 \sin(2\alpha)}{32}$$

Now, a batter would be more concerned with the range of the ball, wanting it to travel
far enough to allow him to at least make it to first base safely. Several graphs of the range with
different α's and a fixed v and h are shown in fig 1.2.

fig 1.2

The black graph is when α = 30°, the blue graph when α = 45°, and the red graph when
α = 60°. It can be inferred from the graph that an angle of 45° will send the ball the furthest.
So, a batter would want to hit the ball as close to a 45° angle as possible, while a pitcher, who
is more concerned about the ball veering off path, would want to throw the ball so that it
travels as close to a straight line as possible. Now, it is approximately 420 feet
from home plate to the edge of a baseball field. The batter wants to hit the ball hard enough so
that it will travel out of the field, over the approximately 7 foot wall at the back of the outfield.
If the batter hits the ball at a 40° angle and the ball is approximately 5 feet in the air when
struck, how hard must he hit the ball in order to have a home run?

In the projection equation, f(x) is the height of the ball, so

$$7 = \left[\frac{-16}{v^2 \cos^2 40^\circ}\right] 420^2 + (\tan 40^\circ)\,420 + 5$$

$$v^2 = \frac{-16 \cdot 420^2}{\left(7 - \tan(40^\circ) \cdot 420 - 5\right)\cos^2 40^\circ}$$

$$v^2 \approx 13725.2$$

$$v \approx 117.15 \text{ ft/sec}$$

Therefore, the batter must hit the ball at approximately 117 feet per second, which is
approximately 80 mph, in order to hit a home run when he hits the ball at an angle of 40°.
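
As a check on this kind of calculation, the sketch below solves the projection equation for v given the distance to the wall, the wall height, the launch angle and the contact height. The function name `required_speed` is hypothetical, and the model is the chapter's simplified one (feet and seconds, no air resistance). The second call uses a taller, 17 foot wall purely for comparison, since the result is sensitive to the height that must be cleared.

```python
import math


def required_speed(distance_ft, wall_ft, angle_deg, contact_height_ft):
    """Solve wall = -16 x^2 / (v^2 cos^2 a) + x tan a + h for the speed v (ft/s)."""
    a = math.radians(angle_deg)
    # Height still to be lost at the wall relative to the straight launch line.
    drop = wall_ft - contact_height_ft - distance_ft * math.tan(a)
    if drop >= 0:
        raise ValueError("the launch line passes below the top of the wall")
    v_squared = -16.0 * distance_ft ** 2 / (drop * math.cos(a) ** 2)
    return math.sqrt(v_squared)


for wall in (7, 17):
    v = required_speed(420, wall, 40, 5)
    print(f"wall {wall} ft: v = {v:.2f} ft/s ({v * 3600 / 5280:.1f} mph)")
```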

Many people consider bowling to be quite simplistic. However, you must consider
the angle of the ball and the velocity with which the ball is thrown when trying to get a strike.
The path of a bowling ball, thrown in a straight line, can be represented by the following
equation:

$$f(t) = \left[\frac{v}{r}\right]\left(1 - e^{-rt}\right)$$

where v is the velocity of the ball in feet per second, t is the time in seconds that the ball
travels, r is a constant that represents the friction, and f(t) is the distance in feet that the
ball travels after t seconds.

Now, the length of a bowling lane is approximately 60 feet. Let's say that the friction
constant for the bowling ball on the slick surface of the bowling lane is approximately 0.3 and
the ball is rolled at approximately 15 mph, or 22 feet per second. The equation can be graphed
as

fig 1.3
From the graph, it is understood that the bowling ball, if thrown at 15 mph, should
make it all the way down the bowling lane.
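
The conclusion read from the graph can also be checked numerically. The following sketch evaluates the distance formula with the values given above (v = 22 ft/s, r = 0.3, lane length 60 ft); the helper names are illustrative assumptions. As t grows, the distance approaches v/r, so the ball reaches the pins only if v/r exceeds the lane length.

```python
import math


def distance_rolled(v, r, t):
    """Distance in feet after t seconds: f(t) = (v / r) * (1 - exp(-r * t))."""
    return (v / r) * (1.0 - math.exp(-r * t))


def time_to_reach(distance, v, r):
    """Time in seconds to cover `distance`, or None if the ball stops short."""
    if distance >= v / r:  # v / r is the limiting distance as t grows
        return None
    return -math.log(1.0 - distance * r / v) / r


print("limiting distance:", round(22 / 0.3, 2), "ft")
print("time to travel 60 ft:", round(time_to_reach(60, 22, 0.3), 2), "s")
```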

Mathematics is also used in ranking players and determining playoff scenarios. From
something as simple as using a matrix to the formulas used to determine a player's or team's
statistics, mathematics is an integral part of this system. For example, in the Olympics, most
sports have players draw numbers to see who they will be competing against. If the number of
contestants n is a power of two, $n = 2^k$, then all athletes participate in the first round of
play; if not, then some of the participants enter during the second round of play. The number of
athletes entering during the second round of play will be $2^k - n$, where n is the number of
contestants and $2^k$ is the smallest power of two that is at least n. Rankings are also
an important aspect of sports. In sports such as tennis, when rating athletes, an integral
estimator is used which is based on a player's performance in a series of matches over a certain
period of time. Even horse racing uses mathematics to rank the horses based on how well they
have performed in previous races, and these rankings go into determining the value of a
horse when a bet is placed. Mathematics is very prevalent in sports, from the most complex of
formulas to the simplest ideas such as betting.
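
The rule for second-round entrants can be illustrated in a few lines of Python. This is a minimal sketch that assumes a single-elimination draw in which the bracket size is the smallest power of two that is at least the number of contestants; the function name is hypothetical.

```python
def second_round_entrants(n):
    """Number of athletes who skip round one: 2**k - n for the smallest 2**k >= n."""
    if n < 1:
        raise ValueError("need at least one contestant")
    bracket = 1
    while bracket < n:
        bracket *= 2
    return bracket - n


for contestants in (8, 12, 5):
    print(contestants, "contestants ->", second_round_entrants(contestants), "enter in round two")
```

With 12 contestants, for instance, 8 athletes play in the first round, and the 4 winners join the 4 late entrants to give a second round of 8.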

♦♦♦

CHAPTER-2

CALCULUS IN SPORTS TO IMPROVE PERFORMANCE

Mathematics plays an important role in the field of sports. Coaches, athletes and trainers
often use mathematics to gain a competitive advantage over their counterparts. With statistics
of games, statistics of players, and probabilities of winning or losing games, mathematics is
everywhere. The applications of calculus in sports are numerous.

CALCULUS IN SPORTS: RUNNING RACES

Mathematics is involved in running: to optimize the run, runners must keep
themselves at the right speed in order to finish in the shortest time possible. According to Joseph
Keller's A Theory of Competitive Running, the physiological running capacity of a human
body can be described using a set of differential equations.

According to this theory, to win a running race shorter than 291 meters, the optimum
strategy is to sprint with maximum effort for the entire distance. Races longer than 291 meters
require a different strategy to optimize performance.

fig 2.1: Running Races

FINDING THE OPTIMAL VELOCITY FOR THE RUN WHILE CONSERVING ENERGY

Keller’s theory, which is based on Newton’s second law and the calculus
of variations, provides an optimum strategy for running one-lap and half-lap
races. Keller wrote the equation of motion as:

$$\frac{dv}{dt} + \frac{v}{\tau} = f(t)$$

where v is the runner's speed as a function of time t, τ is a constant characterizing the
resistance to running (the resistance is assumed to be proportional to running speed), and
f(t) ≤ F is the propulsive force per unit mass.

Empirical knowledge of human exercise physiology is expressed in the assumed
relation between propulsive force and energy supply,

$$\frac{dE}{dt} = \sigma - f v$$

where E represents the runner's energy supply, which has a finite initial value $E_0$ and
is replenished at a constant rate σ. In spite of this replenishment, the energy supply reaches
zero at the end of the race. The constants τ, σ, $E_0$ and F are found by comparing the predicted
optimal race times with recorded race times.
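
To make the two equations concrete, the sketch below integrates them with a simple forward Euler scheme for the short-race strategy, in which the runner applies the maximum force f(t) = F throughout. The parameter values (F = 12.0, τ = 0.9, σ = 40.0, E0 = 2400.0) are illustrative assumptions chosen only to exercise the code; they are not Keller's fitted constants. For constant f = F the speed equation has the closed-form solution v(t) = Fτ(1 − e^(−t/τ)), which the code uses as a cross-check.

```python
import math


def simulate_sprint(F, tau, sigma, E0, duration, dt=0.001):
    """Forward Euler for dv/dt = F - v/tau and dE/dt = sigma - F*v, starting from rest."""
    v, E, x = 0.0, E0, 0.0
    for _ in range(int(round(duration / dt))):
        dv = (F - v / tau) * dt
        dE = (sigma - F * v) * dt
        x += v * dt  # distance covered in this step
        v += dv
        E += dE
    return v, E, x


def closed_form_speed(F, tau, t):
    """Exact speed for constant propulsive force F: v(t) = F*tau*(1 - exp(-t/tau))."""
    return F * tau * (1.0 - math.exp(-t / tau))


# Hypothetical parameter values, for illustration only.
v_end, E_end, x_end = simulate_sprint(F=12.0, tau=0.9, sigma=40.0, E0=2400.0, duration=5.0)
print("speed after 5 s:", round(v_end, 3), "vs closed form", round(closed_form_speed(12.0, 0.9, 5.0), 3))
print("energy remaining:", round(E_end, 1), "distance covered:", round(x_end, 1))
```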

CALCULUS IN BASEBALL

In baseball, calculus can be used to optimize the pitcher’s throw to achieve maximum
efficiency. Also, calculus can be used to calculate the projectile motion of baseball’s trajectory
and to predict if runners can make it to the next base on time given their running speed and
the speed of a hit ball.

FINDING THE WORK REQUIRED TO THROW THE BASEBALL

The work done W on a moving ball from a position $s_0$ to $s_1$ is equal to the change in the
ball's kinetic energy. The kinetic energy K of a baseball of mass m and velocity v is given by

$$K = \frac{1}{2} m v^2$$

fig 2.2 : Baseball Field

$$W = \int_{s_0}^{s_1} F(s)\,ds = \frac{1}{2} m v_1^2 - \frac{1}{2} m v_0^2$$

where $v_0$ and $v_1$ are the initial and final velocities. Using this, baseball players can
estimate how much force they need to exert on the ball to reach the place where they want the
ball to go.
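
A small numerical illustration of the work-energy relation follows. The ball mass (0.145 kg), release speed (40 m/s) and the 2 m distance over which the hand accelerates the ball are assumed values chosen for the example; none of them comes from the source. Dividing the work by the distance gives the average force along the throw.

```python
def kinetic_energy(mass_kg, speed_m_s):
    """K = (1/2) m v^2, in joules."""
    return 0.5 * mass_kg * speed_m_s ** 2


def work_done(mass_kg, v0, v1):
    """Work equals the change in kinetic energy between speeds v0 and v1."""
    return kinetic_energy(mass_kg, v1) - kinetic_energy(mass_kg, v0)


def average_force(work_j, distance_m):
    """Average force over the distance s1 - s0, from W = F_avg * distance."""
    return work_j / distance_m


w = work_done(0.145, 0.0, 40.0)  # assumed mass and release speed
print("work:", round(w, 1), "J")
print("average force over 2 m:", round(average_force(w, 2.0), 1), "N")
```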

FINDING THE AVERAGE FORCE ON THE BAT DURING THE COLLISION

The collisions of ball and bat are quite complex, and models of them are discussed in
detail in The Physics of Baseball by Robert Adair.

fig 2.3

The above image shows an overhead view of the position of a baseball bat, shown
every fiftieth of a second during a typical swing. We can calculate the average force on the bat
during this collision by first calculating the change in the ball’s momentum.

It is known that the momentum p of an object is the product of its mass m and its
velocity v, that is, $p = mv$. Suppose an object, moving along a straight line, is acted on by a
force F = F(t) which is a continuous function of time t. Then the change in momentum between
times $t_0$ and $t_1$ equals the impulse of the force:

$$p(t_1) - p(t_0) = \int_{t_0}^{t_1} F(t)\,dt$$

Using the above formula, one can find the average force on the bat during the
collision, $F = ma$, where $a = \Delta v / \Delta t$ and $\Delta t$ is the duration of the collision.

The application of calculus in sports does not end with running
and baseball; it can be applied in basketball too.
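
As an illustration of the impulse-momentum relation, the sketch below computes the average force from the change in the ball's momentum. All numbers are assumptions made for the example: a 0.145 kg ball arriving at 40 m/s, leaving at 50 m/s in the opposite direction, with the ball and bat in contact for 0.001 s.

```python
def momentum(mass_kg, velocity_m_s):
    """p = m v, with the sign of v giving the direction of travel."""
    return mass_kg * velocity_m_s


def average_force(mass_kg, v_before, v_after, contact_time_s):
    """Average force = change in momentum / duration of the collision."""
    return (momentum(mass_kg, v_after) - momentum(mass_kg, v_before)) / contact_time_s


# Assumed values: the pitch travels in the negative direction, the hit in the positive one.
force = average_force(0.145, -40.0, 50.0, 0.001)
print("average force on the ball:", round(force), "N")
```

By Newton's third law, the bat experiences a force of the same magnitude in the opposite direction.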

CALCULUS IN BASKETBALL

Calculus can be used in basketball to find the exact arc length of a shot from the
shooter’s hands to the basket. The moment the basketball is released from the shooter’s hands,
its travelling path creates an arc all the way to the net.

fig 2.4 : Basketball throw

Using the angle of release and strength of the release, one can mathematically predict
the travelling path and the length of the arc. While the ball is in the air, it is affected by only
one force, which is gravity.

FINDING THE ARC LENGTH OF A BASKETBALL THROW

The travel path of a basketball can be divided into two components, the horizontal
(x) direction and the vertical (y) direction. These two components can be represented by the
following parametric equations:
For horizontal,

$$x(t) = x_0 + v_0 \cos(\theta)\,t$$

For vertical,

$$y(t) = y_0 + v_0 \sin(\theta)\,t + \frac{1}{2} g t^2$$

where,

𝑥0 is the initial horizontal position of the basketball.

𝑦0 is the initial vertical position of the basketball.

𝑣0 is the initial velocity of the basketball.

𝜃 is the angle the ball is projected with respect to the x-axis.

g is the acceleration due to gravity, -9.81 m/s^2

t is the time travelled.

The derivatives of x(t) and y(t) with respect to time t are:

$$\frac{dx}{dt} = v_0 \cos(\theta)$$

$$\frac{dy}{dt} = v_0 \sin(\theta) - 9.81\,t$$

Now, the distance travelled by the basketball can be found using the arc length
equation

$$L = \int_{\alpha}^{\beta} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\,dt, \quad \alpha \le t \le \beta$$

Now, by inserting the derivatives of x(t) and y(t) in the arc length equation:

$$L = \int_{\alpha}^{\beta} \sqrt{(v_0 \cos\theta)^2 + (v_0 \sin\theta - 9.81\,t)^2}\,dt$$

This equation can be expanded using $(a - b)^2 = a^2 - 2ab + b^2$:

$$L = \int_{\alpha}^{\beta} \sqrt{v_0^2 \cos^2\theta + v_0^2 \sin^2\theta - 19.62\,t\,v_0 \sin\theta + 96.24\,t^2}\,dt$$

Since $\cos^2\theta + \sin^2\theta = 1$, the formula becomes

$$L = \int_{\alpha}^{\beta} \sqrt{v_0^2 - 19.62\,t\,v_0 \sin\theta + 96.24\,t^2}\,dt$$

Example: If the initial velocity of a basketball throw is 2.24 m/s, the angle of release is
45°, and the time t required for the ball to travel is about 2 seconds, then the arc length
can be calculated using the above formula:

$$L = \int_{0}^{2} \sqrt{(2.24)^2 - 19.62 \cdot t \cdot 2.24 \sin(45^\circ) + 96.24\,t^2}\,dt \approx 17.34 \text{ m}$$
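
The integral in this example has no convenient elementary shortcut for hand calculation, so it is natural to evaluate it numerically. The sketch below is one way to do so, using composite Simpson's rule on the speed of the ball; the function names and the choice of Simpson's rule are assumptions of this example rather than the method used in the source.

```python
import math

G = 9.81  # magnitude of the acceleration due to gravity, m/s^2


def speed(t, v0, theta_deg):
    """Magnitude of the velocity at time t: sqrt((dx/dt)^2 + (dy/dt)^2)."""
    theta = math.radians(theta_deg)
    vx = v0 * math.cos(theta)
    vy = v0 * math.sin(theta) - G * t
    return math.hypot(vx, vy)


def arc_length(v0, theta_deg, t_start, t_end, n=1000):
    """Integrate the speed over [t_start, t_end] with composite Simpson's rule."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of intervals
    h = (t_end - t_start) / n
    total = speed(t_start, v0, theta_deg) + speed(t_end, v0, theta_deg)
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * speed(t_start + i * h, v0, theta_deg)
    return total * h / 3.0


print(round(arc_length(2.24, 45, 0.0, 2.0), 2), "m")
```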

fig 2.5
The above figure shows different angles and entry points of a basketball into a
basketball hoop. The diameter of the hoop ring is 18 inches. As the basketball is
smaller than the hoop ring, there is always a hoop margin. The hoop margin is the
amount of space left in the hoop ring after the basketball enters it.
Free throws, jump shots, and three-pointers enter at an angle, so the hoop presents an oval
(elliptical) opening to the ball. This changes the hoop margin. The apparent hoop size is the
apparent opening of the hoop to the ball. So, the flatter the arc of the throw, the smaller the
ellipse of the hoop ring.

The apparent hoop margin is the apparent hoop size minus the basketball's diameter.
A basketball can be thrown in different ways and at different angles, so the apparent hoop
margin varies with each shot.
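
The idea of an apparent hoop margin can be sketched numerically. The code below rests on two assumptions that are not stated in the source: that the apparent opening seen by the ball is the 18 inch diameter multiplied by the sine of the entry angle (measured from the horizontal), and that the ball's diameter is about 9.4 inches.

```python
import math

HOOP_DIAMETER_IN = 18.0  # from the text
BALL_DIAMETER_IN = 9.4   # assumed value, not given in the text


def apparent_hoop_size(entry_angle_deg):
    """Width of the opening seen along the ball's path (assumed sine projection)."""
    return HOOP_DIAMETER_IN * math.sin(math.radians(entry_angle_deg))


def apparent_hoop_margin(entry_angle_deg):
    """Apparent hoop size minus the basketball's diameter, in inches."""
    return apparent_hoop_size(entry_angle_deg) - BALL_DIAMETER_IN


def minimum_entry_angle():
    """Smallest entry angle (degrees) at which the margin is still non-negative."""
    return math.degrees(math.asin(BALL_DIAMETER_IN / HOOP_DIAMETER_IN))


for angle in (90, 60, 45, 30):
    print(angle, "degrees:", round(apparent_hoop_margin(angle), 2), "in")
print("minimum entry angle:", round(minimum_entry_angle(), 1), "degrees")
```

Under these assumptions a ball dropping straight down has the largest margin, and a very flat shot has a negative margin, meaning it cannot pass cleanly through the ring.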

FINDING THE VELOCITY REQUIRED FOR THE BASKETBALL TO ENTER THE BASKET

The velocity required for the basketball shot, given the height of the player's throw
and the distance from the hoop, can also be found. This is the equation of the path the ball must
follow in order to enter the basket perfectly:

$$f(x) = \left[\frac{-16}{v^2 \cos^2\alpha}\right] x^2 + (\tan\alpha)\,x + h$$

where,

h is the height from which the ball is thrown

𝛼 is the angle at which the ball is thrown.

𝑣 is the speed at which the ball is thrown

x is the distance the ball travels.

And the formula for the range of a basketball trajectory is

$$\text{Range} = \frac{v_0^2 \sin(2\alpha)}{32}$$

Once the range and the angle of throw α are known, the velocity required for
the throw can be calculated using the above formulae.
The application of calculus in sports does not end with running, baseball and
basketball. Calculus can be applied to any physical sport to optimize performance.

♦♦♦

CHAPTER-3

PREDICTION OF SPORTS INJURIES BY MATHEMATICAL MODELS

A number of different methodological approaches have been used to describe the
inciting event for sports injuries. These include interviews of injured athletes, analysis of video
recordings of actual injuries, and clinical studies. Sports injuries can affect any part of the
body, depending on the particular repetitive movement performed, just like any repetitive
motion injury. While there are factors that raise the risk of injury, there are also elements that
predispose athletes to sports injuries. Rehabilitation and preventative efforts should be centered
on a thorough knowledge of risk factor etiology as well as knowledge of how such factors
contribute to sports injuries.

Predictive factors of sports injuries

Predictive factors of sports injuries are biological variables and the relations between
them that can be indicators for creating a health profile or diagnosis. For example, weight can
be a predictive factor of diabetes, arteriosclerosis, and other metabolic illnesses. It is even more
useful when associated with height, BMI, and waist-hip ratio since it can then be used in
predicting hypertension, myocardial infarction, diabetes, and strokes. In order to effectively
predict health complications, the WHO recommends using anthropometry to monitor risk
factors of chronic diseases and to perform studies that define the association between the
aforementioned factors and specific outcomes, such as arterial hypertension. Predictive factors
of sports injuries can be grouped into two types: intrinsic factors and extrinsic factors.

Extrinsic factors

Sports injuries are most commonly caused by poor training methods; structural
abnormalities; weakness in muscles, tendons, ligaments; and unsafe exercising environments.
The most common cause of injury is poor training. For example, muscles need 48 hours to
recover after a workout. Increasing exercise intensity too quickly and not stopping when pain
develops while exercising also cause injury.

Intrinsic factors

Everyone’s bone architecture is a little different, and almost all of us have one or two
weak points where the arrangement of bone and muscle leaves us prone to injury. Injuries to
the locomotor system of children and adolescents become more frequent when they train more
ambitiously in the hope of improving their short-term performance. As age and competition
level increase, so does the risk of injury.

Predictive factors of injuries

When an injury occurs, biomechanical, kinematic, and body composition analyses
tend to provide more predictive information than analyses focused on training intensity,
resistance, muscle tone, agility, physical maturity, previous injuries or training methods.
Unevenness in the length of the lower limbs, misalignments, anatomical abnormalities, club foot,
genu valgum, support type, or posture defects are typically cited as injury predictors.
Footprints have also been examined: the average arch, the foot’s plantar flexion and
dorsiflexion, excessive pronation, as well as the quadriceps’ Q angle.

The relationship between lower limb structure and sports injuries

Common predisposing factors in injuries to the ankles, legs, knees, and hips include:
bilateral weight and structural symmetry; quadriceps and calf girth; patella alta (a kneecap
that is higher than usual); the Q-angle of the knee (a high Q angle means the kneecap is
displaced to one side, as with knock knees); forefoot varus; rear foot valgus; true and apparent
leg length; uneven leg length; excessive pronation (flat feet); cavus foot (over-high arches);
and bowlegged or knock-knee alignment.

(a) Uneven leg length may lead to awkward running and increases the chance of injury,
but many people with equal-length legs suffer the same effects by running on tilted running
tracks or along the side of a road that is higher in the centre. The hip of the leg that strikes the
higher surface will suffer more strain.

(b) Pronation is the inward rolling of the foot after the heel strikes the ground, before the
weight is shifted forward to the ball of the foot. By rolling inwards, the foot spreads the shock
of impact with the ground. If it rolls too easily, however, it can place uneven stress on muscles
and ligaments higher in the leg.

While an overly flexible ankle and foot can cause excessive pronation, a too-rigid
ankle will cause the effects of cavus foot. Although the arch of the foot itself may be normal,
it appears very high because the foot doesn’t flatten inward when weight is placed on it.
Such feet are poor shock absorbers and increase the risk of fractures higher in the legs.
Bowlegs or knock knees add extra stress through the knees and ankles over time, and may make
ankle sprains more likely. Other structural conditions that make sports injuries more common
include lumbar lordosis. Overuse injuries are caused by repeated, microscopic injuries to a part
of the body. Many long distance runners experience overuse injuries even after years of
running. For road runners, the surface is hard and sometimes uneven, and the running
movements are repetitive. In addition, there are usually both uphill and downhill elements, and
these increase the stress on tendons and muscles in the lower leg. These conditions can lead to
running injuries, so runners should use footwear that doesn’t allow side-to-side movement of
the heel and that adequately cushions the foot.

Logistic regression equations

The purpose of regression techniques is two-fold:

1) To estimate the relation between two variables while taking the presence of other factors into
account

2) To construct a model that allows for the prediction of the value of the dependent variable (in
logistic regression, the probability of success) for specific values of a group of predictor
variables

The concept of logistic regression

The benefit of logistic regression comes from its capacity to analyse clinical
and epidemiological research data. The primary objective of this technique is to model how
the presence or absence of diverse factors, and their values, influence the probability of the
occurrence of an event, which is typically dichotomous. The technique can also be
used to estimate the probability of an event with more than two categories.
These sorts of situations are approached using regression techniques. Nonetheless, linear
regression methodology is not applicable, since the outcome variable only takes two values,
such as the presence/absence of a knee sprain or the presence/absence of injury. Suppose we
code the outcome variable as 0 when the event does not occur (the absence of a knee
sprain) and as 1 when it does occur (the athlete sprains his or her knee), and we want to
calculate the possible relation between the occurrence of a sprained knee and, for example, the
difference in the thickness of both thighs (considered a possible risk factor). We could try to
determine this relation using a linear regression:

𝐾𝑛𝑒𝑒 𝑠𝑝𝑟𝑎𝑖𝑛 = 𝑎 + 𝑏 ∗ [𝑑𝑖𝑓𝑓𝑒𝑟𝑒𝑛𝑐𝑒 𝑖𝑛 𝑡ℎ𝑖𝑔ℎ 𝑡ℎ𝑖𝑐𝑘𝑛𝑒𝑠𝑠]

Based on the data, the coefficients a and b of the equation can be estimated through the
usual least squares procedure. However, although this is mathematically possible, it leads to
nonsensical results: upon evaluating the resulting equation for different values of the
difference in thigh thickness, we obtain results that generally differ from 0 and 1, while the
only results actually possible in this case are 0 and 1. Since this restriction is not imposed in
linear regression, the outcome can theoretically take on any value. If we instead take as the
dependent variable the probability p that an athlete suffers a knee sprain, we can build the
quantity

$$\ln\frac{p}{1-p}$$

As this is a variable that can take any value, a traditional regression equation can be proposed
for it:

$$\ln\frac{p}{1-p} = a + b\,(\text{difference in thigh thickness})$$

which, with a slight algebraic manipulation, can be turned into

$$\text{Injury probability} = \frac{1}{1 + \exp\left[-a - b\,(\text{difference in thigh thickness})\right]}$$

And this is exactly the kind of equation known as a logistic model, where the number
of factors can be greater than one. In that case, the single term
$b\,(\text{difference in thigh thickness})$ in the exponent of the denominator is replaced by a
sum such as

$$b_1 \cdot \text{difference in thickness} + b_2 \cdot \text{age} + b_3 \cdot \text{sex} + b_4 \cdot \text{height}$$
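
The logistic model can be written as a short function. In the sketch below the intercept a = -2.0 and the coefficient b = 0.8 are hypothetical values chosen only for illustration (they are not estimates from any data set), and the function names are assumptions of this example. The second function recovers the log-odds from a probability, which shows that the two forms of the model given above are equivalent.

```python
import math


def injury_probability(difference, a, b):
    """Logistic model: p = 1 / (1 + exp(-(a + b * difference)))."""
    return 1.0 / (1.0 + math.exp(-(a + b * difference)))


def log_odds(p):
    """ln(p / (1 - p)), the quantity modelled as a + b * difference."""
    return math.log(p / (1.0 - p))


A, B = -2.0, 0.8  # hypothetical coefficients, for illustration only

for diff_cm in (0.0, 1.0, 2.0, 4.0):
    p = injury_probability(diff_cm, A, B)
    print(f"difference {diff_cm:.1f}: p = {p:.3f}, log-odds = {log_odds(p):.2f}")
```

Unlike the linear regression, the predicted value always lies strictly between 0 and 1, whatever the value of the predictor.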

Logistic model coefficients as risk quantifiers

One of the factors that make logistic regression so interesting is the relation that
logistic model coefficients preserve with a risk quantification parameter known in the field as
an “odds ratio”. The odds associated with an event is the quotient of the probability that it
occurs and the probability that it does not occur:

$$\text{Odds} = \frac{p}{1-p}$$

with p being the probability of occurrence. We can therefore calculate the odds of an injury
occurring when the difference in thigh thickness is equal to or greater than a specific quantity,
which expresses how much more probable it is that an injury occurs than that it does not
occur in this situation. Likewise, we can calculate the odds of an injury occurring when the
difference in thigh thickness is less than that same figure. Dividing the first odds by the second
gives an odds quotient, or odds ratio, which quantifies how much more likely an injury is when
the difference in thickness is greater than the specific figure (first odds) relative to when the
difference in thickness is less. The notion being measured is similar to the relative risk, which
is the quotient of the probability that an injury occurs when a specific factor is present
(difference in thickness) and the probability that it occurs when the factor is not present. In
fact, when the prevalence of the event is low (<20 %), the odds ratio and the
relative risk are very similar; but such is not the case when the event is quite
common, a fact that is often ignored.

$$\text{Relative risk} = \frac{\text{Probability of injury in the presence of the risk factor}}{\text{Probability of injury in the absence of the risk factor}}$$

$$\text{Absolute risk increase} = (\text{post-test probability if risk factor is present}) - (\text{post-test probability if risk factor is absent})$$
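
The claim that the odds ratio approximates the relative risk only for rare events is easy to verify numerically. In the sketch below the injury probabilities are made-up values used for illustration; in both scenarios the relative risk is exactly 2.

```python
def odds(p):
    """Odds of an event with probability p."""
    return p / (1.0 - p)


def odds_ratio(p_exposed, p_unexposed):
    """Odds with the risk factor divided by odds without it."""
    return odds(p_exposed) / odds(p_unexposed)


def relative_risk(p_exposed, p_unexposed):
    """Probability with the risk factor divided by probability without it."""
    return p_exposed / p_unexposed


# Hypothetical probabilities: a rare event and a common one.
for p1, p0 in ((0.02, 0.01), (0.60, 0.30)):
    print(f"p1={p1}, p0={p0}: RR = {relative_risk(p1, p0):.2f}, OR = {odds_ratio(p1, p0):.2f}")
```

For the rare event the two measures nearly coincide, while for the common event the odds ratio is clearly larger than the relative risk.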

If there is a dichotomous factor in the regression equation, for example whether or not the
subject is a jumper, the b coefficient of the equation for this factor is directly related to the
odds ratio OR of being a jumper compared to not being one:

OR = exp (b)

where exp(b) is a measurement that quantifies the risk presented when the corresponding factor
is present compared to when it is not, assuming that the rest of the model’s variables remain
constant.

When the variable is numerical, for example age or body mass index, exp(b) is a
measurement that quantifies the change in risk when the variable increases by one unit while
the rest of the variables remain constant. Accordingly, the odds ratio for moving from age X1
to age X2, with b being the coefficient that corresponds to age in the logistic model, is:

OR = exp [b * (X2 - X1)]

This is a model in which the logarithm of the odds ratio is proportional to the change
in the factor's value. In other words, it depends on the difference between the
two values, but not on the starting point, meaning that the change in risk, in the logistic model,
is the same when we move from 20 years old to 30 years old as when we move from 40 to 50.
When the variable’s coefficient b is positive, we obtain an odds ratio greater than 1, which
therefore corresponds to a risk factor. On the other hand, if b is negative the odds ratio will be
less than 1 and will correspond to a protective factor.
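
The property that only the difference X2 - X1 matters can be checked directly. The sketch below uses the age coefficient reported in Table 3.4 (b = -0.0465) as an example input; the function name is an assumption of this example.

```python
import math


def odds_ratio_for_change(b, x1, x2):
    """OR = exp(b * (x2 - x1)) for a numerical variable with coefficient b."""
    return math.exp(b * (x2 - x1))


B_AGE = -0.0465  # age coefficient as listed in Table 3.4

print("20 -> 30 years:", round(odds_ratio_for_change(B_AGE, 20, 30), 3))
print("40 -> 50 years:", round(odds_ratio_for_change(B_AGE, 40, 50), 3))
print("one year:", round(odds_ratio_for_change(B_AGE, 0, 1), 3))
```

Both ten-year steps give the same odds ratio, and because the coefficient is negative the value is below 1.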

$$\text{Pre-test odds} = \frac{\text{pre-test probability of injury}}{1 - \text{pre-test probability of injury}}$$

$$\text{Post-test odds} = \text{pre-test odds} \times \text{likelihood ratio (positive or negative)}$$

where

$$\text{Positive likelihood ratio} = \frac{\text{sensitivity}}{1 - \text{specificity}}$$

$$\text{Negative likelihood ratio} = \frac{1 - \text{sensitivity}}{\text{specificity}}$$

$$\text{Post-test probability} = \frac{\text{post-test odds}}{\text{post-test odds} + 1}$$
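
The chain from pre-test probability to post-test probability can be sketched as follows, using the standard definitions of the positive likelihood ratio, sensitivity / (1 - specificity), and the negative likelihood ratio, (1 - sensitivity) / specificity. The pre-test probability, sensitivity and specificity in the example are made-up values, and the function name is hypothetical.

```python
def post_test_probability(pre_test_prob, sensitivity, specificity, test_positive=True):
    """Convert a pre-test probability into a post-test probability via odds."""
    pre_test_odds = pre_test_prob / (1.0 - pre_test_prob)
    if test_positive:
        likelihood_ratio = sensitivity / (1.0 - specificity)
    else:
        likelihood_ratio = (1.0 - sensitivity) / specificity
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (post_test_odds + 1.0)


# Hypothetical screening test: 20% pre-test probability, 80% sensitivity, 90% specificity.
print("positive result:", round(post_test_probability(0.2, 0.8, 0.9, True), 3))
print("negative result:", round(post_test_probability(0.2, 0.8, 0.9, False), 3))
```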

Qualitative variables in the logistic model

Given that the methodology employed for calculations with the logistic model is
based on using quantitative variables, the same way as in any other regression process, it is
incorrect to use qualitative variables directly in regression processes, whether they are nominal
or ordinal. Assigning a number to each category does not solve the problem. Suppose the physical
exercise variable has three possible answers: sedentary, sporadically performing exercise,
frequently performing exercise; and we assign the values 0, 1, 2, respectively, to these
categories. Then performing frequent exercise has twice the value of performing exercise
sporadically, which makes little sense. Other variables, for example civil status, do not have any
ordering relation among their outputs at all.

The solution to this problem is to create as many dichotomous variables as the number
of outputs minus one.

These new, artificially created variables are called “dummy”, indicator, internal,
or design variables. For example, if the variable in question records exposure data with the
following outputs: Never ran, Ex-runner, Runs less than 10 kilometers per day, Runs 10 or
more kilometers per day, we have 4 possible answers from which we construct 3
dichotomous internal variables (values 0, 1), with different possibilities for codification that
lead to different interpretations. The most frequent of these is the following:

                                 I1   I2   I3

Never ran                         0    0    0

Ex-runner                         1    0    0

Runs less than 10 km per day      0    1    0

Runs 10 or more km per day        0    0    1

Table 3.1: Design variables.
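
The coding in Table 3.1 can be generated mechanically. The sketch below is a minimal illustration in which the first category in the list is treated as the reference level and every other category receives its own 0/1 indicator; the function name `reference_coding` is hypothetical.

```python
def reference_coding(category, categories):
    """Indicator values for one category; categories[0] is the reference level."""
    if category not in categories:
        raise ValueError(f"unknown category: {category!r}")
    # One indicator per non-reference level, equal to 1 only for a matching category.
    return [1 if category == level else 0 for level in categories[1:]]


RUNNING_LEVELS = [
    "Never ran",
    "Ex-runner",
    "Runs less than 10 km per day",
    "Runs 10 or more km per day",
]

for level in RUNNING_LEVELS:
    print(f"{level:30s}", reference_coding(level, RUNNING_LEVELS))
```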

In this type of codification the regression equation’s coefficient for each design
variable (always transformed with the exponential function) corresponds to the odds ratio for
this category relative to the reference level (the first output). In our example, it quantifies how
the risk changes relative to never having run. There are other possibilities, among which
we will highlight an example with a qualitative variable and three outputs:

             I1   I2

Output 1      0    0

Output 2      1    0

Output 3      1    1

Table 3.2: Qualitative variable and three outputs.

With this codification, each coefficient is interpreted as an average of the change in
risk upon moving from one category to the next. In the event that a category cannot naturally
be considered a reference level, for example blood group, a possible classification system is:

             I1   I2

Output 1     -1   -1

Output 2      1    0

Output 3      0    1

Table 3.3: Classification system for a variable with no natural reference category.

where each coefficient of the indicator variables has a direct interpretation as a change
in risk regarding the average of the three outputs.

Representation of logistic regression results

It is common to present logistic regression results in a table in which each variable
is shown with its coefficient value, its standard error, a statistic (labeled Wald chi²)
that allows us to check whether the coefficient is significantly different from 0, and the
corresponding p value. The odds ratio of each variable is also shown, together with
its 95% confidence interval.

Term               Coefficient   Standard error   Chi²      P        Interpretation

Independent term   -1.2168       0.9557           1.621     0.2029   NO

Age                -0.0465       0.0374           1.545     0.2138   NO

Race*                                             *5.684    0.0583   Almost (p<0.1)

Race 1              1.0735       0.5151           4.343     0.0372   P<0.05

Race 2              0.8154       0.4453           3.353     0.0671   Almost (p<0.1)

Runner              0.8072       0.4044           3.983     0.0460   P<0.05

Injury              1.4352       0.6483           4.902     0.0268   P<0.05

Dissymmetry         0.6576       0.4666           1.986     0.1587   NO

Q Angle             0.8421       0.4055           4.312     0.0379   P<0.05

Thigh Thickness     1.2817       0.4621           7.692     0.0055   P<0.01

Table 3.4: Example of Logistic Regression Presentation.

Variable           Odds ratio   OR lower 95% limit   OR upper 95% limit

Age                0.95         0.89                 1.03

Race 1             2.93         1.07                 8.03

Race 2             2.26         0.94                 5.41

Runner             2.24         1.01                 4.95

Injury             4.20         1.18                 14.97

Dissymmetry        1.93         0.77                 4.82

Q Angle            2.32         1.05                 5.14

Thigh Thickness    3.60         1.46                 8.91

Table 3.5: Odds Ratio.
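
The entries of Table 3.5 can be reproduced from the coefficients and standard errors in Table 3.4. The sketch below assumes that the limits are the usual normal-approximation (Wald) interval, exp(b ± 1.96 · SE), and that the chi² column is (b / SE)²; the source does not state this explicitly, but the values agree to the precision shown.

```python
import math


def odds_ratio_with_ci(coefficient, standard_error, z=1.96):
    """Return (odds ratio, lower limit, upper limit) for a 95% Wald interval."""
    return (
        math.exp(coefficient),
        math.exp(coefficient - z * standard_error),
        math.exp(coefficient + z * standard_error),
    )


def wald_chi2(coefficient, standard_error):
    """Wald statistic for testing whether the coefficient differs from 0."""
    return (coefficient / standard_error) ** 2


# Rows taken from Table 3.4: (term, coefficient, standard error).
ROWS = [("Age", -0.0465, 0.0374), ("Race 1", 1.0735, 0.5151), ("Thigh Thickness", 1.2817, 0.4621)]

for term, b, se in ROWS:
    oratio, low, high = odds_ratio_with_ci(b, se)
    print(f"{term:16s} OR = {oratio:.2f} ({low:.2f} to {high:.2f}), chi2 = {wald_chi2(b, se):.3f}")
```

An interval that contains 1, as for Age, corresponds to a coefficient that is not significantly different from 0.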

Goodness of fit

Whenever we deal with a regression model, it is fundamental to check that the model
fits the data used in the calculation adequately before drawing conclusions
(Bender, 1996).

In the case of logistic regression, a rather intuitive idea is to calculate the probability
of an event, the occurrence of an injury or knee sprain in our case, for all athletes from the
sampling. If the goodness of fit is acceptable, one would expect a high probability value to be
associated with the presence of an injury, and vice-versa, if the calculated probability value is
low, one would likewise expect the absence of injury. This intuitive idea is formally realized
through the HosmerLemeshow test , that basically consists in dividing the range of probability

23
in deciles of risk (which would be injury probability 0.1, 0.2, and so forth up to 1) and
calculating the distribution of both injured athletes as well as uninjured athletes that are
calculated in the equation and actually observed. These distributions, both calculated and
observed, contrast with each other through a chi² test. In the final presentation of logistic
regression data, a goodness of fit test should be included as well as a commented conclusion
drawn from the same test. With these, the HosmerLemeshow test would be more illustrative
than the mere obtained distribution values.
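
The procedure described above can be sketched as follows. This is an illustrative implementation under stated assumptions, not the exact routine of any statistics package: risk is split into fixed-width intervals as in the text, the function name `hosmer_lemeshow` is hypothetical, and the data are made up. The resulting statistic is commonly compared with a chi² distribution with (number of groups − 2) degrees of freedom.

```python
def hosmer_lemeshow(probabilities, outcomes, bins=10):
    """Return the Hosmer-Lemeshow statistic and the number of non-empty bins.

    probabilities: predicted injury probabilities (0..1)
    outcomes: observed results, 1 = injured, 0 = not injured
    """
    counts = [0] * bins
    observed = [0] * bins
    expected = [0.0] * bins
    for p, y in zip(probabilities, outcomes):
        index = min(int(p * bins), bins - 1)  # p == 1.0 falls in the last bin
        counts[index] += 1
        observed[index] += y
        expected[index] += p

    statistic = 0.0
    used_bins = 0
    for n, o, e in zip(counts, observed, expected):
        if n == 0:
            continue
        used_bins += 1
        mean_p = e / n
        variance = n * mean_p * (1 - mean_p)
        if variance > 0:
            statistic += (o - e) ** 2 / variance
    return statistic, used_bins


# Hypothetical predictions for eight athletes.
probs = [0.25] * 4 + [0.75] * 4
well_calibrated = [1, 0, 0, 0, 1, 1, 1, 0]   # 1 of 4 and 3 of 4 injured
badly_calibrated = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 of 4 and 4 of 4 injured
print(hosmer_lemeshow(probs, well_calibrated))
print(hosmer_lemeshow(probs, badly_calibrated))
```

A statistic close to zero means that observed and predicted counts agree in every risk group; larger values indicate a poorer fit.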

Logistic regression analysis

Although accidents are unavoidable in sports, injury prediction and prevention is a
practical aspect of sports medicine and is considered to be the best treatment.
Regression models encompass mathematical techniques that deal with measuring the relation
between an outcome variable and predictive variables. When the outcome variable is
continuous, the preferred model is linear regression. However, when the outcome variable is
dichotomous (injured/not injured) and the object of study is the relation between this and one or
more predictive variables (right Q angle, left Q angle, the difference in thigh thickness, lower
limb dissymmetry, age, sex, hours of training, kilometers run, etc.), the chosen regression
model is a simple logistic regression model (for one factor) or a multiple logistic regression
model (for more than one factor). Therefore, logistic regression analysis is used
when it is suspected that the value of a categorical variable depends on a series
of predictive or independent variables, with the goal of finding a mathematical function
that expresses this relation.

When the goal is to calculate the relation or association between two variables,
regression models allow for the possibility that other factors affect this
relation. So, if the possible relation between lower limb dissymmetry and the probability of
suffering a knee injury is being studied as a risk factor, that relation can be different if other
variables are taken into account, such as age, sex, or body mass index. Because of this, these
factors could be included in a logistic regression model as independent variables in addition to
dissymmetry. The variables other than the factor of interest (in this example AGE, SEX,
BMI) are called by several names: control variables, external variables, covariates, or
confounding variables.

Interaction

When the relation between the factor being studied and the dependent variable is
modified by the value of a third variable, we are dealing with interaction. In our example,
we assume that the probability of suffering a sports injury increases with age when there is
lower limb dissymmetry. In this case there is an interaction between the variables Age and
Dissymmetry.

Independent variable and probability direction

One of the first considerations we must take into account is whether the relation between
the independent variable and the event probability changes direction. If it does, the
logistic model doesn’t work for us. A very clear example of this situation arises when we
evaluate the probability of an athlete’s sports injuries in relation to the age at which he or she
first began sports competitions. Up to a certain age, the probability can increase the earlier
the athlete began competing. Starting from a mature age, however, the likelihood of injury
also increases the older the athlete was when he or she began competing. In this case, a
logistic model would be inadequate.

Collinearity

Another problem that may arise in regression models, and not only logistic models,
is that the variables involved may be correlated, which would lead us to a nonsensical model
and therefore to some values of the coefficients that cannot be interpreted. This situation, with
correlated independent variables, is called collinearity.

To understand it, consider an extreme case in which the same variable is
introduced in the model twice. Then,
exp (-b0 – b1 * X – b2 * X) or
exp [-b0 – (b1 + b2) * X ]

where only the sum b1 + b2 is determined by the data. That sum can be split into two
addends in infinitely many ways, and therefore the individual values obtained for b1 and b2
do not make sense.

An example of this situation could be given if we include variables such as the length
of the lower limbs and the length of the calves in the equation, two variables that are closely
correlated.

Sample size

As a basic rule, it is necessary to have at least 10 × (k + 1) cases to
estimate a model with k independent variables; in other words, at least 10 cases for each
variable in the model, counting the dependent variable (the probability of the event) as well.
It is useful to point out that a qualitative variable with c categories appears as c – 1
variables in the model, once the corresponding indicator variables have been constructed
from it.

Model selection

When talking about models that can be multivariable, an interesting topic is how to
choose the best set of independent variables to include in the model. The definition of the
“best” model depends on the type and objective of the study. In a case where something will
be predicted, the best model would be one that produces the most reliable predictions. And in
a case where the relation between two variables is being calculated (correcting the effect of
other variables), the best model will be one that obtains the most precise calculation of the
coefficient of the variable in question.

Types of differences

Whenever data is analysed, it is important to distinguish between numerical
differences, statistically significant differences, and clinically relevant differences. These three
concepts do not always coincide.

Number of variables

One must consider the maximum model, that is, the maximum number of independent
variables that can be included in the equation, taking their interactions into account when
appropriate. Although there are different procedures for choosing a model, there are only three
basic mechanisms for doing so. The first starts with only one independent variable and adds
more, one by one, according to pre-established criteria (forward selection). The second starts
with the maximum model and eliminates variables one by one according to pre-established
criteria (backward elimination). The third, called “stepwise”, combines the two previous
mechanisms: in each step, a variable already present in the equation can be eliminated or
another can be added. In the case of logistic regression, the criterion for deciding at each step
whether to adopt a new model or stay with the current one is the logarithm of the likelihood
ratio of the models.

The likelihood equation

A model’s likelihood is a measure of how compatible the model is with
the actual outcome data. If, upon adding a new variable to the model, the likelihood does not
increase in a statistically significant way, then that variable will not be included in the equation.
To evaluate the statistical significance of a particular variable within the model, we focus
on the Wald chi² value corresponding to the variable’s coefficient and on its p value.

To develop this equation it is necessary to monitor beforehand a
representative group of athletes, taking into account their age, sex, and sport, during a
sufficiently long observation period that could be called a season. During this period it is
crucial to divide the subjects into two groups: injured and non-injured. The
relation between the different measured variables and the final outcome of injury or no injury
is then established. In order to determine the predictive variables, we should identify those that
show significant differences between the two groups, thus establishing the relation between the
injury/no injury dependent variable and the distinct anthropometric and sports variables
(activity time, training time, team position, etc.).

Sensitivity, specificity, positive predictive value and negative predictive value

It is useful to use control techniques to evaluate how well the outcomes predicted by
the mathematical equations defined in the logistic regression analysis fit the observed results.
The results should be analysed for all studied subjects, for the studied group of athletes in
question, and for a control group of both sexes, differentiating the success rate by sex.

Sensitivity

Proportion of injured subjects that the equation predicted would be injured.

\[ \text{Sensitivity (Sn)} = \frac{\text{True positive}}{\text{True positive} + \text{False negative}} \]

The following table summarizes these calculations:

                        POSITIVE TEST (T+)     NEGATIVE TEST (T-)
INJURY PRESENT (I+)     TRUE POSITIVE (TP)     FALSE NEGATIVE (FN)
INJURY ABSENT (I-)      FALSE POSITIVE (FP)    TRUE NEGATIVE (TN)

Table 3.6: Sensitivity.

\[ Sn = P[T^{+} \text{ if } I^{+}] = \frac{TP}{TP + FN} \]

Specificity

Proportion of uninjured subjects that the equation predicted would not be injured.

\[ \text{Specificity (Sp)} = \frac{\text{True negative}}{\text{True negative} + \text{False positive}} \]

Think of specificity as 1 minus the false positive rate. Notice that the denominator for
specificity is the number of healthy (uninjured) players. Using conditional probabilities, we can
also define specificity as:

Sp = P[Test is negative if Patient is healthy]

\[ Sp = P[T^{-} \text{ if } I^{-}] \]

The following table summarizes these calculations:

                        POSITIVE TEST (T+)     NEGATIVE TEST (T-)
INJURY PRESENT (I+)     TRUE POSITIVE (TP)     FALSE NEGATIVE (FN)
INJURY ABSENT (I-)      FALSE POSITIVE (FP)    TRUE NEGATIVE (TN)

Table 3.7: Specificity

\[ Sp = \frac{TN}{TN + FP} \]

False positives

Proportion of uninjured subjects that the equation predicted would be injured.

False negatives

Proportion of injured subjects that the equation predicted would not be injured.

In order to know the probability that a subject is or is not injured given the
outcome predicted by the equation, we must know the positive predictive values (PPV) and
the negative predictive values (NPV), defined as follows:

Positive predictive values: The probability of an athlete injuring him or herself when
an injury is predicted by the equation. To calculate this we use the equation:

\[ PPV = \frac{S \cdot PL}{(S \cdot PL) + (FP \cdot PNL)} \]

where S = Sensitivity; PL = Probability of injury; FP = False Positives (the proportion defined
above); PNL = Probability of non-injury.

The following table summarizes these calculations:

                        POSITIVE TEST (T+)     NEGATIVE TEST (T-)
INJURY PRESENT (I+)     TRUE POSITIVE (TP)     FALSE NEGATIVE (FN)
INJURY ABSENT (I-)      FALSE POSITIVE (FP)    TRUE NEGATIVE (TN)

Table 3.8: False negatives.

\[ PPV = P[I^{+} \text{ if } T^{+}] = \frac{TP}{TP + FP} \]

Negative predictive values: The probability that the athlete does not injure him or herself
when the model has predicted a situation of non-injury. To calculate this we use the equation:

\[ NPV = \frac{E \cdot PNL}{(E \cdot PNL) + (FN \cdot PL)} \]

where E = Specificity; PNL = Probability of non-injury; FN = False Negatives (the proportion
defined above); PL = Probability of injury.

                        POSITIVE TEST (T+)     NEGATIVE TEST (T-)
INJURY PRESENT (I+)     TRUE POSITIVE (TP)     FALSE NEGATIVE (FN)
INJURY ABSENT (I-)      FALSE POSITIVE (FP)    TRUE NEGATIVE (TN)

Table 3.9: Negative predictive values.

\[ NPV = P[I^{-} \text{ if } T^{-}] = \frac{TN}{TN + FN} \]

It is always necessary to find false negatives and positives beforehand, as well as the
probability of injury or non-injury for each athlete before determining the positive and negative
predictive values.
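
The four measures above can be computed directly from the cells of the 2×2 table. The sketch below is illustrative only: the function names and the counts are hypothetical, and the second function assumes that the "false positives" and "false negatives" in the PPV and NPV equations are proportions (1 − specificity and 1 − sensitivity), as defined earlier in this section.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity, PPV and NPV from the four cell counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return sensitivity, specificity, ppv, npv


def predictive_values(sensitivity, specificity, p_injury):
    """PPV and NPV from sensitivity, specificity and probability of injury."""
    p_no_injury = 1 - p_injury
    false_positive_rate = 1 - specificity
    false_negative_rate = 1 - sensitivity
    ppv = (sensitivity * p_injury) / (
        sensitivity * p_injury + false_positive_rate * p_no_injury
    )
    npv = (specificity * p_no_injury) / (
        specificity * p_no_injury + false_negative_rate * p_injury
    )
    return ppv, npv


# Hypothetical season: 50 injured athletes (40 predicted, 10 missed) and
# 50 uninjured athletes (20 wrongly predicted as injured, 30 correct).
sn, sp, ppv, npv = diagnostic_metrics(tp=40, fn=10, fp=20, tn=30)
ppv_formula, npv_formula = predictive_values(sn, sp, p_injury=50 / 100)
print(sn, sp, ppv, npv)
print(ppv_formula, npv_formula)
```

Both routes give the same predictive values, which is a useful check that the table-based and probability-based equations describe the same quantity.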

In order to perform this type of calculation, the probability that an individual exhibits
the characteristic in question (suffering an injury) is expressed as a function of the predictive
variable or variables; if we call this probability P, the model is expressed as follows:

P = β0 + β1 X

where β0 and β1 are the model parameters and X is the predictive variable. The probability (P) is
equal to a constant β0 plus the product of another constant β1 and the value of the
predictive variable X. The coefficient β0 is the independent or constant term, and it is the
average value of the outcome variable when X = 0. The coefficient β1 is the regression
coefficient, and it is interpreted as the change in the outcome variable’s average per unit
increase of the predictive variable. The change will be an increase if the regression coefficient
value is positive and it will be a decrease if the value is negative.

It is possible that, once the model parameters are calculated, substituting some
values of the predictive variable yields values that are not allowed for a probability (below 0
or above 1). This is why the probability of showing the characteristic in question should be
transformed.

This logit transformation, which consists of the logarithm of the odds p/(1 − p) that the
characteristic will present itself, is modelled by the following formula:

\[ \log\left[\frac{p}{1-p}\right] = \beta_0 + \beta_1 X \]

The term \( \log\left[\frac{p}{1-p}\right] \) is called logit(P).

In the logistic regression model, the coefficient β1 is the logarithm of the odds ratio (OR)
between two individuals that differ by one unit in the predictive variable.
Likewise, by raising e to β1, we obtain the OR value between those two individuals:

\[ \log(OR) = \beta_1 \]

or:

\[ OR = e^{\beta_1} \]

where e is the number that serves as the base of the Napierian logarithm, approximately 2.72.
In the case where β1 = 0, logit(P) = β0 + (0)X = β0; in other words, it does not change with X.
Equally, OR = e^0 = 1, which indicates that the two variables are independent and there is no
relation between them. The estimate of β1 is called the logistic regression coefficient.
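
As a small numerical check of the relations above, the following sketch inverts the logit to obtain a probability and verifies that a one-unit increase in X multiplies the odds by e raised to β1. The coefficient values and function names are hypothetical, chosen only for illustration.

```python
import math


def injury_probability(b0, b1, x):
    """Invert the logit: p = 1 / (1 + exp(-(b0 + b1 * x)))."""
    logit = b0 + b1 * x
    return 1 / (1 + math.exp(-logit))


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1 - p)


b0, b1 = -2.0, 0.8  # hypothetical model parameters
p_low = injury_probability(b0, b1, 1)
p_high = injury_probability(b0, b1, 2)  # X increased by one unit
odds_ratio = odds(p_high) / odds(p_low)

print(p_low, p_high)
print(odds_ratio, math.exp(b1))  # the two values should agree
```

Unlike the linear expression P = β0 + β1 X, the transformed model always returns a value strictly between 0 and 1.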

If we have several predictive variables and we try to study the relation between the
outcome variable and the whole set of predictive variables simultaneously, a multiple logistic
regression model will be used.

\[ \log\left[\frac{p}{1-p}\right] = \beta_0 + \beta_1 X_1 + \dots + \beta_p X_p \]

where P is also the probability of presenting the characteristic in question.

♦♦♦

CHAPTER-4

APPLICATION OF MATHEMATICS IN SPORTS

Mathematics is indeed a fascinating subject. We Indians have a long history of being
regarded as a country with great mathematicians. Children are introduced to mathematics at a
very young age: from Vedic maths to the abacus, they are enrolled in such classes early
to encourage faster learning. Math is a part of everyday life. From calculating the profits of a
company to the proportions and ratios for measuring each ingredient while cooking, it is used in
almost all aspects of life. When people think or hear about math, the first thing that
comes to mind is that mathematics can only be applied in the sciences and engineering, but
mathematics has a very big role in sports as well. Athletes and their coaches
constantly try to find different ways of improving in their sport, and they turn to
mathematics for help.

It is not always realised that mathematics has a crucial role in most sports. From
discussing a player’s statistics, to coaches formulating tactics and drafting certain players, to
judges scoring a particular athlete, mathematics is at work. For that matter, even the
possibility of a team or an athlete winning is a matter of probability. From something as
simple as using a matrix to applying formulas that help determine a player’s or team’s
statistics, mathematics is a part of the system.

Application of mathematics in sports -

Basketball -

At first glance, basketball and math seemingly have little in common. However, a
closer look at the sport reveals that there is a considerable amount of math in basketball.

Geometry in basketball

Whether they realize it or not, basketball players make use of many geometric
concepts while playing a game. The most basic of these ideas is in the dimensions of the
basketball court. The diameter of the hoop (18 in), the diameter of the ball (9.4 in), the width
of the court (50 ft.) and the length from the three point line to the hoop (19 ft.) are all standard
measures that must be adhered to in any basketball court.

The path the basketball will take once it’s shot comes down to the angle at which it is
shot, the force applied and the height of the player’s arms. When shooting from behind the free
throw line, a smaller angle is necessary to get the ball through the hoop. However, when
making a field throw, a larger angle is called for. Understanding arcs will help determine how
best to shoot the ball.
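
The link between shooting angle, force (release speed) and release height can be illustrated with the standard projectile equation, ignoring air resistance and spin. This is a simplified sketch, not a coaching tool: the function names are hypothetical, and the rim height of 3.05 m (10 ft), the 4.6 m distance and the 2.1 m release height are assumed values that do not come from the text.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def required_speed(distance_m, release_height_m, angle_deg, rim_height_m=3.05):
    """Release speed needed for the ball's centre to pass through the rim."""
    theta = math.radians(angle_deg)
    rise = rim_height_m - release_height_m
    denominator = 2 * math.cos(theta) ** 2 * (distance_m * math.tan(theta) - rise)
    if denominator <= 0:
        raise ValueError("angle is too flat to reach the rim from this distance")
    return math.sqrt(G * distance_m ** 2 / denominator)


def height_at(x_m, speed, release_height_m, angle_deg):
    """Height of the ball after travelling x_m metres horizontally."""
    theta = math.radians(angle_deg)
    drop = G * x_m ** 2 / (2 * speed ** 2 * math.cos(theta) ** 2)
    return release_height_m + x_m * math.tan(theta) - drop


# Compare the speed needed at several release angles (assumed shot geometry).
for angle in (40, 50, 60):
    speed = required_speed(4.6, 2.1, angle)
    print(f"angle {angle} deg -> release speed {speed:.2f} m/s")
```

Changing the angle changes the speed the shooter has to produce, which is one way to see why getting the arc right matters.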

Basketball players understand that throwing the ball right at the basket will not help it
go into the hoop. On the other hand, shooting the ball in an arc will increase its chances of
falling through the hoop. Getting the arc right is important to ensure that the ball does not fall
in the wrong place. The best height to dribble can also be determined mathematically.
Understanding geometry is also important for good defense. This will help predict the player’s
moves, and also determine how to face the player. Mathematics can also be used to decide how
to stand while going on defense. The more you bend your knees, the quicker you can move.
Utilizing geometry, math in basketball plays a crucial role in the actual playing of the sport.

Statistics in basketball

fig 4.1

Statistics is essential for analysing a game of basketball. For players, statistics can be
used to determine individual strengths and weaknesses. For spectators, statistics is used to
determine the value of players and analyse the performance of an individual or the entire team.
Percentages are a common way of comparing players’ performances. It is used to get values
like the rebound rate, which is the percentage of missed shots a player rebounds while on the
court. Statistics is also used to rank a player based on the number of shots, steals and assists
made during a game. Averages are used to get values like the points per game average, and
ratios are used to get values like the turnover to assist ratio.

Baseball –

fig 4.2

Baseball is a game that lends itself very well to all kinds of statistics and math
calculations. Here are a few important baseball numbers and how they are calculated.

Hitters Batting Average (BA)

This is perhaps the most commonly calculated statistic. You might hear more about
home runs or strike-outs, but those are really just totals. You will hear and see the hitter’s
Batting Average for every player that comes up to the plate. Basically, the Average is a
percentage of how many times a batter hits the ball safely (reaching base – not the number of
non-injury plays) divided by the number of times he comes to the plate. Sounds easy, right? Just
divide the total safe hits by the number of plate appearances and you have the Batting Average.

\[ \text{Batting average} = \frac{\text{Hits}}{\text{At-bats}} \]
Well, yes, but not quite. That is basically true, but there are a couple of things that
change those two basic numbers. The number of plate appearances isn’t exactly the number
used, and you have to be sure you understand what counts as a safe hit. The number used in
place of plate appearances in the Batting Average calculation is called an At-bat.
Take a walk, for example (no, don’t leave – just use the ‘walk’ as an example): if the batter
walks, it does not count as an At-bat. Also, if a batter gets hit by a pitch, it is not counted as an
At-bat. If the batter hits the ball and gets out, but advances a baserunner, it is called a Sacrifice
and is not counted as an At-bat. Other instances not counted in the At-bat total are when a batter
gets to go to 1st base because of an obstruction or interference call, if the batter is still batting
and a baserunner is called out for some reason to end the inning, or if he gets replaced during
his plate appearance (there are specific rules for 2 strikes, but that is too much detail for us…).
An At-bat is counted whenever the batter hits the ball and reaches safely, or is safe due to an
error on the play, or when the batter is called out for any reason after the ball is put into play,
or on a fielder’s choice (where the batter is safe but another baserunner is called out, like a
force-out at second). Out of all of those possible outcomes, only actual hits (singles, doubles,
triples, home runs) are counted in the Hits total – even if the batter makes it safely on base due
to some other reason. So, now that you know all of that, the calculation is easy…

Batting Average = Hits/At-bats, and is rounded to the third decimal place. The number
is said as if multiplied by 1000, so a hitter that had 30 hits out of 100 At-bats (30/100=.300) is
said to be “batting three-hundred”, which is quite good at the Major League level. The closer
you get to .300 and above, the more likely you are a well known star. Hitters near .200 are not
doing so well. The highest BA ever through an entire season was .406 (Ted Williams of the
Red Sox in 1941) and the highest career average is .366 (Ty Cobb from 1905-1928). Among
active players, Joe Mauer has the highest career BA with .323.
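
The calculation and the way the number is quoted can be sketched in a few lines. The function names are hypothetical, and returning 0.0 for a hitter with no At-bats is a convention assumed here, not a rule stated in the text.

```python
def batting_average(hits, at_bats):
    """Hits divided by At-bats, rounded to the third decimal place."""
    if at_bats == 0:
        return 0.0  # assumed convention for a hitter with no At-bats
    return round(hits / at_bats, 3)


def as_quoted(average):
    """Format an average the way it is usually written, e.g. .300."""
    return f"{average:.3f}".lstrip("0")


average = batting_average(30, 100)  # the example from the text
print(average, as_quoted(average))
```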

On-Base Percentage (OBP)

On-base percentage is similar to the Batting Average, but includes more. This number
gives a better idea of how often a hitter reaches base – which is a useful statistic for deciding
who the lead-off batter should be, since you want them on base as much as possible. The OBP
includes not only hits (H), but walks (or Base on Balls, BB) and the number of times the batter is
hit by a pitch (HBP). The sum of these is divided by the total At-bats (AB) plus BB plus HBP plus
Sacrifice flies (SF).

On-base percentage is calculated using this formula:

\[ \text{On-Base Percentage} = \frac{\text{Hits} + BB + HBP}{AB + BB + HBP + SF} \]

The career leader in OBP is Ted Williams with an OBP of .4817 and the highest single
season OBP was Barry Bonds with .6094 in 2004. Any hitter with OBP of .350 or more is
doing pretty well.

Slugging Percentage (SLG)

The Slugging Percentage of a hitter tells you how many bases the hitter generates per
At-bat. It is simply a total of the bases gained by way of the hits, divided by the number of
At-bats (see above to know what counts as an At-bat). The total bases is pretty straightforward. A
single is 1 base, a double is 2, a triple is 3, and a home run is 4. Multiply each of these by the
number a hitter has of each, add them all together, and divide by the AB.

\[ \text{Slugging Percentage} = \frac{\text{Singles} + (2 \times \text{Doubles}) + (3 \times \text{Triples}) + (4 \times \text{Home Runs})}{AB} \]

This measurement is useful for power hitters since they may not have as high of a
batting average, but the hits they get are usually for extra bases. This number is a good way to
compare their production, even if their BA is lower than desired.

The theoretical maximum is 4.000, if a hitter hits a home run every At-Bat – which
may happen the first time they ever come to the plate as a Major Leaguer, but it doesn’t take
long to drop from there. The highest for a season is .8634 by Barry Bonds in 2001. The highest
over a career (at least more than a single AB, that is) is Babe Ruth with .6897. A really good
power hitter would have a SLG of .500 or more.

On Base Plus Slugging (OPS)

With use of the above numbers, baseball people determined that the OBP and the SLG
combined to give a pretty good idea of the hitter’s overall production in a way that neither of
the values did individually. Eventually, the simplest combination was to simply add the two
values together, and this became known as On Base Plus Slugging, or OPS.

𝑂𝑛 − 𝐵𝑎𝑠𝑒 𝑃𝑙𝑢𝑠 𝑆𝑙𝑢𝑔𝑔𝑖𝑛𝑔 = 𝑂𝐵𝑃 + 𝑆𝐿𝐺
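
The three hitting statistics build on one another, which the following sketch makes explicit. The function names and the season totals are hypothetical; the weights for total bases follow the description in the text (single 1, double 2, triple 3, home run 4).

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sacrifice_flies):
    """(H + BB + HBP) / (AB + BB + HBP + SF)."""
    reached = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sacrifice_flies
    return reached / chances


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """Total bases gained on hits divided by At-bats."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def on_base_plus_slugging(obp, slg):
    """OPS is simply the sum of the two values."""
    return obp + slg


# Hypothetical season: 160 hits made up of 100 singles, 30 doubles,
# 5 triples and 25 home runs in 500 At-bats.
obp = on_base_percentage(hits=160, walks=60, hit_by_pitch=5,
                         at_bats=500, sacrifice_flies=5)
slg = slugging_percentage(singles=100, doubles=30, triples=5,
                          home_runs=25, at_bats=500)
ops = on_base_plus_slugging(obp, slg)
print(f"OBP {obp:.3f}  SLG {slg:.3f}  OPS {ops:.3f}")
```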

Pitchers Earned Run Average (ERA)

The most frequently referenced calculated statistic is the Earned Run Average. This
is the average number of earned runs allowed per 9 innings. Any run that is the result of a
defensive error is not included in the ERA, so the basic calculation is the number of earned
runs divided by the number of innings pitched, with that ratio then multiplied by 9.

\[ \text{Earned Run Average} = \frac{\text{Earned Runs}}{\text{Innings Pitched}} \times 9 \]

This shows the average runs a pitcher would give up if they pitched the entire 9 innings
and there were no errors on defense. Most pitchers throw much less than 9 innings per outing,
and so this is a good method to compare them and their different inning counts. Really good
pitchers will have an ERA below 4.00. While career ERA is important, the ERA is usually
more meaningful for the season as it shows how well the pitcher is currently keeping other
teams from scoring.

The active pitcher with the lowest career ERA is Mariano Rivera with 2.215.

Walks and Hits per Innings Pitched (WHIP)

This is a straight-forward calculation that helps us understand how well a pitcher is
keeping runners off of the bases. All hits (singles, doubles, triples, home-runs) count as 1, so it
is that total plus the number of walks issued, divided by the number of innings the pitcher has
pitched (partial innings are measured in thirds, by the number of outs recorded).

\[ \text{Walks and Hits per Innings Pitched} = \frac{BB + H}{\text{Innings Pitched}} \]
Mariano Rivera is also the active leader in WHIP at 1.0026 and the top 50 active
pitchers are under 1.35. This metric has much smaller differences between pitchers, but if you
think about what this represents, it can be a big deal. Consider a difference of a third of a
runner every inning: that is a runner every 3 innings, or 3 runners per 9-inning game. Even
though the differences in the average are small, those extra runners could make a big
difference in the outcome of a game.
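
Both pitching statistics divide by innings pitched, and partial innings are counted in thirds, so it is convenient to work from the number of outs recorded. The sketch below assumes that representation; the function names and the season totals are hypothetical.

```python
def innings_from_outs(outs_recorded):
    """Convert outs to innings pitched; each out is one third of an inning."""
    return outs_recorded / 3


def earned_run_average(earned_runs, innings_pitched):
    """Earned runs allowed per 9 innings."""
    return 9 * earned_runs / innings_pitched


def whip(walks, hits, innings_pitched):
    """Walks plus hits allowed per inning pitched."""
    return (walks + hits) / innings_pitched


# Hypothetical season: 600 outs recorded (200 innings), 70 earned runs,
# 50 walks and 170 hits allowed.
innings = innings_from_outs(600)
era = earned_run_average(70, innings)
season_whip = whip(50, 170, innings)
print(f"IP {innings:.1f}  ERA {era:.2f}  WHIP {season_whip:.3f}")
```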

Wins (W)

So the number of Wins is a simple total, but it gets really confusing when you try to
understand what counts as a win. Only a single pitcher is credited with a win in any given game,
but many pitchers on the winning team could contribute to the victory – so how do you decide?
To get the W, you have to be the pitcher at the time your team takes a lead that it does not give
up for the remainder of the game. So a starting pitcher may pitch beautifully, but if his reliever
gives up the lead, the starter will not get the W even if his team retakes the lead. If the starter
doesn’t complete 5 innings, he cannot get the W no matter what, and the official scorer
determines which reliever was most effective and he is given the W.

The overall leader in Wins is Cy Young with 511. Pitchers pitch much less often than
they did when Cy Young played, so his Win total may never be surpassed. The current active
Win leaders are Andy Pettitte with 255 and Tim Hudson with 205. Winning 300 games in a career
is a very big deal, and 20 wins in a single season is a fantastic mark.

Fielders Fielding Percentage

We can’t ignore the defence – there aren’t a lot of things to measure but they do have
Fielding Percentage. This is the number of putouts and assists divided by their total number of
chances. A putout is making an out – like catching a fly ball or touching a base on a force out.
An assist would be throwing the ball to another who gets the putout. If the player makes a
mistake doing either of these two things, they can be assigned an error. The errors are part of
the total number of chances.

\[ \text{Fielding Percentage} = \frac{\text{Putouts} + \text{Assists}}{\text{Putouts} + \text{Assists} + \text{Errors}} \]
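
As a brief illustration of the formula, the sketch below computes the fielding percentage from the three totals. The function name and the numbers are hypothetical, and raising an error for a fielder with no chances is a design choice made here, not something the text prescribes.

```python
def fielding_percentage(putouts, assists, errors):
    """(Putouts + Assists) divided by total chances."""
    chances = putouts + assists + errors
    if chances == 0:
        raise ValueError("fielding percentage is undefined with no chances")
    return (putouts + assists) / chances


# Hypothetical season: 300 putouts, 150 assists and 5 errors.
percentage = fielding_percentage(300, 150, 5)
print(f"{percentage:.3f}")
```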

♦♦♦

CONCLUSION
As has been observed, there really exists a very close relationship between sports
and maths: the playing of all sports has been found to apply mathematical principles
such as calculus, arithmetic, geometry, percentages, etc. When researching the math behind
sports, we found that there are a multitude of formulas behind the simple actions in sports such
as basketball and baseball. To be successful in these sports, one must make baskets and hit
the ball in a certain way. The point of this exploration is to delve into the math behind these
sports and see what formulas come into play during a game. We chose this topic because we
love sports.

Calculus is a part of mathematics that has various applications in real life, and we have
observed numerous applications of calculus in different types of sports. It plays an important
role in the field of sports. Baseball is one sport in which calculus is applied.
Athletes, trainers, and coaches often use calculus to gain an advantage over their
counterparts. Calculus can also be used to calculate the projectile motion of a baseball's
trajectory and the speed of the baseball when hit, and to predict whether runners can make it to
the next base on time, given their running speed. Sports injuries affecting the lower extremities in high impact
sports, such as athletics or basketball, can be predicted by means of logistic regression
equations. The first injury score was described by Shambaugh in 1991, using imbalance in
bilateral weight and deviation of the Q-angle of the quadriceps as dependent variables. Salazar
(2000) developed a mathematical equation to predict lesions based on Shambaugh's score and
constructed through logistic regression analysis, while Fernández (2004) introduced thigh
thickness as a transcendence variable in the prediction of injuries, leading to a more precise
equation. From these investigations, we observed that logistic regression analysis can be a valid
method for discriminating among anthropometric parameters related to sports injuries,
providing a simple and reliable method that could be used in the routine practice of sports
medicine.

Sport and maths are very different activities, but some aspects of the mindset required
to be successful in maths or sport can certainly help us to achieve success in the other.
Mathematics plays an essential role in sport at all levels, whether it be through human
intelligence or through the use of technology to monitor working levels. As technology and
techniques continue to evolve, the data available and performance analysis can only improve
further. Mathematics is everywhere, from daily life to sports. When we sit down to watch our
favourite sports star or team we should recognize the behind-the-scenes role that mathematics
is playing in bringing these events to us and making it possible to have fair, competitive and
efficient sports events. This project gives us a brief insight into the world of mathematics and
how it influences the world of sports.

BIBLIOGRAPHY

1) https://digitash.com/engineering/mathematics/how-to-apply-calculus-in-sports/

2) http://jwilson.coe.uga.edu/EMAT6680/Huffman/Mathematics%20in%20Sports/MathematicsSports.html

3) Prediction of Sports Injuries by Mathematical Model – University of Granada, Department
of Physical Education and Sports, Spain

4) https://hillarydoshi.blogspot.com/2021/03/application-of-mathematics-in-sports.html?m=1

5) http://www.makemathagame.com/everyday_math/baseball-math/
