
Statistics Project 1


(a) Population of interest

The population of interest is the finishing times of mountain bikers who completed the
Irish Downhill Series races at Bree Hill, Co. Wexford, from 2015 to 2019.

(b) Sampling Method

A sample of 100 riders was randomly selected from a population of 647 finishers, using the
multi-stage sampling technique and the random number function on a calculator.
The population data was collected from the results section of
The sampling consisted of 3 stages:
• Years
• Categories
• Riders
Each category was given an ID number from 0 to N. (Note that the random number function on
the calculator can return the number 0, so it was not necessary to number the categories
from 1 to N+1.)
For stage 1, the random number generated determined which year to select.
For stage 2, the random number generated determined which category to select from that
particular year.
For stage 3, the random number generated determined which rider to select from that
category, and the rider's time was noted.
These three stages were repeated 100 times, using sampling with replacement. This procedure
ensured that every category had an equal chance of being selected.
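The three stages above can be sketched in Python. This is only an illustration of the procedure, not the calculator method actually used: the function name multistage_sample, the POPULATION dictionary, the category names and the times in it are all hypothetical placeholders, and Python's random module stands in for the calculator's random number function.

```python
import random

# Hypothetical toy population: year -> category -> finishing times (minutes).
# Category names and times are placeholders, not the real race results.
POPULATION = {
    2015: {"Category A": [1.9, 2.1, 2.4], "Category B": [2.6, 3.0]},
    2016: {"Category A": [1.8, 2.2], "Category B": [2.5, 2.9, 3.3]},
}


def multistage_sample(population, n, seed=None):
    """Draw n riders, with replacement, by picking year, category, then rider."""
    rng = random.Random(seed)
    years = sorted(population)
    sample = []
    for _ in range(n):
        # Stage 1: a random ID from 0 to N selects the year.
        year = years[rng.randrange(len(years))]
        # Stage 2: a random ID selects a category within that year.
        categories = sorted(population[year])
        category = categories[rng.randrange(len(categories))]
        # Stage 3: a random ID selects a rider within that category.
        riders = population[year][category]
        time = riders[rng.randrange(len(riders))]
        sample.append((year, category, time))
    return sample


sample = multistage_sample(POPULATION, n=100, seed=1)
print(len(sample))  # 100
```

Because each stage draws a fresh random ID on every repetition, the same rider can be selected more than once, which matches sampling with replacement.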
The times recorded were in minutes and seconds. For simplicity in calculations, it was
necessary to convert these times to minutes and decimals of minutes.
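As an illustration of this conversion, a time of 1 minute 30 seconds becomes 1 + 30/60 = 1.5 minutes. A minimal Python sketch follows; the function name to_decimal_minutes and the "m:ss" text format are assumptions, since the exact format of the recorded times is not stated.

```python
def to_decimal_minutes(time_str):
    """Convert a time written as 'm:ss' (assumed format) to decimal minutes."""
    minutes, seconds = time_str.split(":")
    # Seconds may carry a fractional part, so they are parsed as a float.
    return int(minutes) + float(seconds) / 60


print(to_decimal_minutes("1:30"))  # 1.5
```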
When the times were converted and the sample was complete, the data was entered into
Minitab and sorted in ascending order. The sorted data was then grouped into 7
non-overlapping groups, and a tally sheet was used to determine the frequency of each group.
These frequencies were then entered into Excel, and a histogram was constructed.
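The grouping and tally step can also be sketched in Python instead of using a manual tally sheet. The group boundaries below are the seven groups shown on the histogram axis (1.2 to 4.0 minutes in steps of 0.4); the function name tally and the example times are hypothetical and are not taken from the sample.

```python
from bisect import bisect_right

# Boundaries of the 7 groups: each group is >= lower edge and < upper edge.
EDGES = [1.2, 1.6, 2.0, 2.4, 2.8, 3.2, 3.6, 4.0]


def tally(times):
    """Count how many times fall into each of the 7 non-overlapping groups."""
    counts = [0] * (len(EDGES) - 1)
    for t in times:
        if not EDGES[0] <= t < EDGES[-1]:
            raise ValueError(f"time {t} is outside the grouped range")
        # bisect_right places a time equal to an edge in the group that
        # starts at that edge, matching the ">= lower, < upper" rule.
        counts[bisect_right(EDGES, t) - 1] += 1
    return counts


# Illustrative times only, not values from the actual sample.
print(tally([1.2, 1.59, 1.6, 2.4, 3.99]))  # [2, 1, 0, 1, 0, 0, 1]
```

Explicit boundaries are used rather than dividing by the group width, because floating-point division can place a time that lies exactly on a boundary in the wrong group.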
(c) Histogram of Data¹

[Figure: Histogram of Mountain Bikers Race Times. Horizontal axis: Riders' Times (minutes),
in seven groups: ≥1.2, <1.6; ≥1.6, <2.0; ≥2.0, <2.4; ≥2.4, <2.8; ≥2.8, <3.2; ≥3.2, <3.6;
≥3.6, <4.0.]

(d) Analysis of Histogram

The histogram shows a right-skewed distribution, as the tail on the right is longer than the
tail on the left. This may be caused by the lower boundary of the data.
In this particular data set, the lower boundary is formed by the fastest times, and the
sampled data represent all finishing times across all categories. It is possible that the
times greater than 2.8 minutes represent slower riders, accidents or deteriorating conditions,
whereas the times less than 2.8 minutes represent competitive times.

Mean = 2.2044

Mode = 1.8

Median = 2.1455

StDev = 0.4820
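
As a rough numerical check on the skew, the mean (2.2044) is greater than the median (2.1455), which is what a right-skewed distribution would be expected to show. The sketch below uses Pearson's second skewness coefficient, 3 × (mean − median) / standard deviation. This coefficient is an addition for illustration and is not part of the original analysis; a positive value points to a right skew.

```python
# Summary statistics reported above (in minutes).
mean = 2.2044
median = 2.1455
stdev = 0.4820

# Pearson's second skewness coefficient (an illustrative check only).
skew = 3 * (mean - median) / stdev

print(round(skew, 3))  # 0.367
```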

Results data obtained from
