
Lesson 1.

Descriptive Statistics
Elements of Probability and Statistics
1st course of Mathematics Degree

Problem 1. The Apgar test is a quick examination performed on a newborn baby at the first and fifth minute
after birth. The test evaluates the baby’s health based on five parameters: respiratory effort, heart rate, muscle tone,
reflexes and skin color. Here are the respiratory responses of 26 newborns in a hospital:

weak – weak – crying – not breathing – crying – crying – weak – weak – weak – weak – crying – weak – crying – weak –
weak – crying – crying – crying – not breathing – weak – weak – weak – not breathing – crying – crying – crying
a) Construct an appropriate frequency table.
b) Draw all the graphs covered in the course that are suitable for this case study.
c) Calculate the measures of central tendency (if possible).
d) Find the quartiles and the percentiles 20 and 60 (if possible). Interpret them.
e) Calculate the measures of dispersion (if possible).
f) Calculate the measures of distribution (if possible) and interpret the result.
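
For part a), the tally can be done programmatically. The following is a minimal Python sketch, not a prescribed solution. The names `responses`, `ORDER` and `frequency_table` are illustrative, and the ordering not breathing < weak < crying is an assumption (it is not stated in the problem) that makes the cumulative columns meaningful.

```python
from collections import Counter

# The 26 respiratory responses, in the order given in the statement.
raw = (
    "weak, weak, crying, not breathing, crying, crying, weak, weak, weak, weak, "
    "crying, weak, crying, weak, weak, crying, crying, crying, not breathing, weak, "
    "weak, weak, not breathing, crying, crying, crying"
)
responses = [r.strip() for r in raw.split(",")]

# Assumed ordinal ordering (worst to best); not stated in the problem.
ORDER = ["not breathing", "weak", "crying"]


def frequency_table(values, order):
    """Return rows of (category, n_i, f_i, N_i, F_i)."""
    counts = Counter(values)
    total = len(values)
    rows, cumulative = [], 0
    for category in order:
        cumulative += counts[category]
        rows.append((category, counts[category], counts[category] / total,
                     cumulative, cumulative / total))
    return rows


table = frequency_table(responses, ORDER)
print(f"{'response':<15}{'n_i':>5}{'f_i':>8}{'N_i':>5}{'F_i':>8}")
for category, n_i, f_i, N_i, F_i in table:
    print(f"{category:<15}{n_i:>5}{f_i:>8.3f}{N_i:>5}{F_i:>8.3f}")
```

The mode is the category with the largest n_i. If the assumed ordering is accepted, the median and the percentiles can be read from the cumulative column F_i. The mean and the dispersion measures are not defined for a variable of this kind.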

Problem 2. Another of the parameters analysed in the Apgar test is the heart rate. The heart rates of 20 newborns
(measured in number of heart beats per minute) were collected:
86.76 – 121.76 – 143.46 – 104.89 – 88.69 – 153.94 – 111.56 – 123.48 – 76.29 – 82.38 – 86.94 – 75.15 – 144.60 –
88.43 – 155.83 – 181.40 – 139.38 – 101.42 – 156.98 – 74.45
a) Construct an appropriate frequency table.
b) Draw all the graphs covered in the course that are suitable for this case study.
c) Calculate the measures of central tendency (if possible).
d) Find the quartiles and the percentiles 20 and 60 (if possible). Interpret them.
e) Calculate the measures of dispersion (if possible).
f) Calculate the measures of distribution (if possible) and interpret the result.
g) Draw a box plot (if possible).
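
Parts c), d), e) and g) can be checked numerically. The sketch below uses Python's standard `statistics` module. It assumes the sample standard deviation (dividing by n - 1) and the linear-interpolation ("inclusive") quantile rule. The course may use a different convention, in which case the quartiles and percentiles can differ slightly.

```python
import statistics

# Heart rates (beats per minute) of the 20 newborns, as listed in the statement.
heart_rates = [
    86.76, 121.76, 143.46, 104.89, 88.69, 153.94, 111.56, 123.48, 76.29, 82.38,
    86.94, 75.15, 144.60, 88.43, 155.83, 181.40, 139.38, 101.42, 156.98, 74.45,
]

mean = statistics.mean(heart_rates)
median = statistics.median(heart_rates)
sample_sd = statistics.stdev(heart_rates)  # n - 1 divisor; pstdev() divides by n

# Quartiles and quintiles by linear interpolation (assumed convention).
q1, q2, q3 = statistics.quantiles(heart_rates, n=4, method="inclusive")
quintiles = statistics.quantiles(heart_rates, n=5, method="inclusive")
p20, p60 = quintiles[0], quintiles[2]

# Box-plot fences (1.5 * IQR rule) used to flag potential outliers.
iqr = q3 - q1
lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = [x for x in heart_rates if x < lower_fence or x > upper_fence]

print(f"mean = {mean:.2f}, median = {median:.2f}, sd = {sample_sd:.2f}")
print(f"Q1 = {q1:.2f}, Q3 = {q3:.2f}, P20 = {p20:.2f}, P60 = {p60:.2f}")
print(f"fences = ({lower_fence:.2f}, {upper_fence:.2f}), outliers = {outliers}")
```

Because the variable is continuous, the frequency table in a) needs class intervals. The number and width of the classes is a separate choice that this sketch does not make.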

1
Problem 3. Each parameter of the Apgar test is scored from 0 to 2 depending on the observed condition, so that the
total Apgar score ranges from 0 to 10. It is known that the average of the scores obtained in the Apgar test
by newborns during the year 2023 in Madrid is 7.6 with a standard deviation of 2.3, while an average of 8.3 and a
standard deviation of 1.6 were obtained in Toledo the same year. The first babies born in 2024 in Madrid and Toledo
scored 8.5 and 8.9 points, respectively. Which of the two obtained a higher relative score in the test?
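
One common way to compare scores from two different distributions is to standardise them. The sketch below computes z-scores, assuming that the 2023 mean and standard deviation of each city are the appropriate reference for its 2024 newborn. The function name `z_score` is illustrative.

```python
def z_score(x, mean, sd):
    """Standardised score: how many standard deviations x lies from the mean."""
    return (x - mean) / sd


# Reference values from the statement (2023 scores in each city).
z_madrid = z_score(8.5, mean=7.6, sd=2.3)
z_toledo = z_score(8.9, mean=8.3, sd=1.6)

print(f"Madrid: z = {z_madrid:.3f}")
print(f"Toledo: z = {z_toledo:.3f}")
```

The baby with the larger z-score is further above the average of his or her own city, even if the raw score is lower.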

Problem 4. In three high-performance centers A, B and C, an anaerobic endurance test has been performed on
100 professional athletes. A summary of the obtained scores is presented in the table. Nevertheless, the identification
of each center with its frequency distribution is lost.
          Center A   Center B   Center C
Mean        6.66       4.99       2.65
SD          2.09       1.81       1.97
Median      7          5          2

a) Which center corresponds to each histogram? Justify (in statistical terms) the response.
b) What could you say about the skewness of the scores achieved in each center? Give its meaning in terms of the
context of the problem.
c) Which center’s scores turned out to be more homogeneous? Why?
d) In general, which athletes had a higher anaerobic endurance? Which statistic would you use to decide? Justify the
response.
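
For parts b) and c), two quantities can be derived directly from the table: the coefficient of variation (SD divided by the mean), which compares dispersion relative to the level of the scores, and the sign of mean minus median, which is only a rough indicator of the direction of skewness. A minimal sketch with illustrative names:

```python
# Summary statistics copied from the table.
centers = {
    "A": {"mean": 6.66, "sd": 2.09, "median": 7},
    "B": {"mean": 4.99, "sd": 1.81, "median": 5},
    "C": {"mean": 2.65, "sd": 1.97, "median": 2},
}


def coefficient_of_variation(mean, sd):
    """Relative dispersion; meaningful here because all scores are positive."""
    return sd / mean


cv = {}
for name, s in centers.items():
    cv[name] = coefficient_of_variation(s["mean"], s["sd"])
    gap = s["mean"] - s["median"]  # > 0 suggests right skew, < 0 left skew
    print(f"Center {name}: CV = {cv[name]:.3f}, mean - median = {gap:+.2f}")
```

A smaller coefficient of variation indicates more homogeneous scores. The skewness hint should still be checked against the shape of the histograms.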

Problem 5. In order to measure cognitive impairment, four versions of the Mini-Mental State Examination
(MMSE) test were used. According to the preliminary analysis shown in the table, answer the following questions:

Statistics                 MMSE 1   MMSE 2   MMSE 3   MMSE 4
Mean                        17.67    17.65    17.11    17.16
Std. Dev.                    4.58     4.69     5.59     6.89
Variance                    21.01    22.00    31.29    47.42
Skewness                    -0.68    -0.63    -0.59     0.09
Std. Error of Skewness       0.11     0.11     0.11     0.11
Kurtosis                    -0.32    -0.42    -0.08     0.34
Std. Error of Kurtosis       0.22     0.22     0.22     0.22
Range                       21.51    21.13    30.30    39.61
Minimum                      4.41     4.77     0.12     0.09
Maximum                     25.92    25.90    30.42    39.71
Percentiles
  25                        14.28    14.32    13.22    12.42
  40                        17.39    17.01    16.64    16.11
  50                        18.95    18.69    18.00    17.88
  75                        21.35    21.31    21.27    21.54
  83                        22.04    22.24    22.31    22.98

a) Provided that all versions score from 0 to 30 points, would it be necessary to resort to the coefficient of variation
to compare the dispersion between them?
b) A subject obtained 17.87 and 12.79 points in versions 1 and 3 of the MMSE, respectively. In which version did
he/she obtain a higher score relative to the rest of the participants?
c) Which version can be considered the most symmetrical one? Why?
d) Draw a box plot of MMSE 3 from the data provided in the table.
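
For part d), the box is given by the percentiles 25, 50 and 75, and the usual 1.5 × IQR rule gives the fences beyond which observations are drawn as outliers. The sketch below assumes that rule (other conventions exist) and uses only the MMSE 3 values from the table. The function name `tukey_fences` is illustrative.

```python
def tukey_fences(q1, q3, k=1.5):
    """Lower and upper fences of a box plot under the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# MMSE 3 summary values taken from the table.
minimum, q1, median, q3, maximum = 0.12, 13.22, 18.00, 21.27, 30.42

lower_fence, upper_fence = tukey_fences(q1, q3)
has_low_outliers = minimum < lower_fence
has_high_outliers = maximum > upper_fence

print(f"box: Q1 = {q1}, median = {median}, Q3 = {q3}")
print(f"fences: ({lower_fence:.3f}, {upper_fence:.3f})")
print(f"outliers below: {has_low_outliers}, outliers above: {has_high_outliers}")
```

Each whisker ends at the most extreme observation that lies inside the fences. When the minimum or maximum falls outside a fence, the exact end of that whisker cannot be recovered from the summary table alone, so the fence is the best available bound.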

Problem 6. An exploratory analysis was performed to study whether certain characteristics of the mother affect the
newborn’s weight. According to the box plot shown below:
a) Which race had higher weights in general? Justify the response.
b) Focusing on race II, which group (smokers vs. non-smokers) had lower weights? Relate the reasoning to
the skewness of the variable.
c) For the groups analyzed in b), which measure of central tendency would be more appropriate for
representing each group? Why?
d) Identify the group (race and smoking status) that exhibited more scatter and justify the response.
e) Focusing on non-smoker race-II mothers, what weight is exceeded by 75% of newborns?

Problem 7. We want to analyze the effectiveness of a physical exercise routine in patients with a calf muscle tear.
For this purpose, a test rating the mobility of the affected area from 0 to 10 was performed before, during and
after receiving 10 physiotherapy sessions. These were the results:

GROUP         GROUP         GROUP
4.97          2             6.67
5             3             7
5             3.26          8
S  1.86       S  2.43       S  2.21

a) The above tables collect the measures of central tendency (mean, median and mode) of each group as well as the
standard deviation. Identify the group to which each table corresponds and the measure of central tendency
corresponding to each value (x̄, Me and Mo).
b) Characterize the skewness of each score distribution, justifying your answer.

c) A subject achieved scores of 2.5, 6.3 and 7 before, during and after the physiotherapy training, respectively.
At which moment did he/she obtain a higher score relative to his/her group?
d) One of the factors that most affects recovery from this type of injury is diabetes. The numerical description
of the glucose values collected from the participants is presented in the table. Draw a box plot from it.

Glucose (mg/dl)
Mean                 104.20
Median                96.00
Mode                  91
Standard Deviation    29.664
Variance             879.924
Range                328
Minimum               62
Maximum              130
Percentiles
  25                  89.00
  50                  96.00
  75                 108.00
