
STATISTICS 30001 - CLASSES 15/21

TUTORING SESSION 2

Exercise 1

The following contingency table refers to a sample of 380 customers of a supermarket,
classified according to the most utilised ancillary service and the age bracket
(young/adult/elderly):

Construct an adequate side-by-side bar chart that highlights a potential dependence between
the variables involved. What information can you derive?

SOL

Since the relative weight of each kind of service differs appreciably across the age groups, we can
conclude that there is dependence between the kind of ancillary service used and the age
bracket. For further evidence, see the graph of conditional frequencies below.
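
The comparison of conditional frequencies can be sketched in a few lines of Python. The counts and the service names below are hypothetical, chosen only for illustration; they are not the exercise's table. If the two variables were independent, every age bracket would show roughly the same profile as the whole sample.

```python
# Hypothetical counts (NOT the exercise's data): rows are age brackets,
# columns are made-up ancillary services.
counts = {
    "young":   {"parking": 10, "delivery": 30, "cafe": 20},
    "adult":   {"parking": 40, "delivery": 20, "cafe": 20},
    "elderly": {"parking": 15, "delivery": 5,  "cafe": 40},
}


def conditional_profiles(table):
    """For each row, return the share of each column within that row."""
    profiles = {}
    for row, cells in table.items():
        row_total = sum(cells.values())
        profiles[row] = {col: n / row_total for col, n in cells.items()}
    return profiles


def marginal_profile(table):
    """Share of each column in the whole sample, ignoring the row variable."""
    total = sum(sum(cells.values()) for cells in table.values())
    columns = next(iter(table.values())).keys()
    return {c: sum(table[r][c] for r in table) / total for c in columns}


profiles = conditional_profiles(counts)
marginal = marginal_profile(counts)

for age, shares in profiles.items():
    print(age, {s: round(p, 3) for s, p in shares.items()})
print("all", {s: round(p, 3) for s, p in marginal.items()})
```

In this made-up table the profiles differ markedly from one bracket to another and from the overall profile, which is the pattern a side-by-side bar chart of conditional frequencies makes visible.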
Exercise 2
Doctors are interested in the possible relationship between the dosage of a medicine and the
time required for a patient’s recovery. The following table shows, for a sample of 10 patients,
dosage levels (in grams) and recovery times (in hours). These patients have similar
characteristics except for medicine dosages. Describe the data graphically with a scatter plot.

SOL.
Exercise 3
A fast-food chain has selected a sample of 150 transactions in order to monitor service-quality
of a restaurant and collected information about wait time (in minutes) before having a meal:

1. What is meant by a frequency distribution?
2. Calculate the median for wait time.
3. Calculate the arithmetic mean for wait time.
4. BONUS: Calculate the standard deviation for wait time.

SOL
1. A frequency distribution is a table used to organize data. The left column (called classes or
groups) includes all possible responses on a variable being studied. The right column is a list of
the frequencies, or number of observations, for each class (Newbold, § 1.3).

2. The median is the middle observation of a set of observations that are arranged in increasing
(or decreasing) order. First of all, it is necessary to find the median class, which is the interval
in which the middle item lies, i.e. where the cumulative frequency reaches 50%:

The median class is [3; 5).


Assuming that the observations are uniformly distributed in the class, the median is given by:
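
A standard form of this interpolation is median = L + (n/2 - F) / f * w, where L is the lower bound of the median class, w its width, f its frequency, and F the cumulative frequency of the classes before it. The sketch below applies it to a hypothetical frequency table; the classes and counts are assumptions for illustration, not the wait-time data of the exercise.

```python
def grouped_median(classes):
    """Median of grouped data by linear interpolation in the median class.

    classes: list of (lower, upper, frequency) tuples sorted by lower bound.
    Assumes observations are spread uniformly inside each class.
    """
    n = sum(freq for _, _, freq in classes)
    half = n / 2
    cumulative = 0  # frequency accumulated before the current class
    for lower, upper, freq in classes:
        if freq > 0 and cumulative + freq >= half:
            width = upper - lower
            return lower + (half - cumulative) / freq * width
        cumulative += freq
    raise ValueError("frequency table is empty")


# Hypothetical table: (lower bound, upper bound, number of observations).
example_classes = [(0, 2, 10), (2, 4, 20), (4, 6, 15), (6, 10, 5)]
print(grouped_median(example_classes))  # 3.5
```

Here n = 50, so the middle position is 25; the class [2; 4) is the first whose cumulative frequency (30) reaches it, and the median falls 15/20 of the way through that class.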

3. We need to compute the approximate arithmetic mean

where k is the number of classes. Therefore the approximate arithmetic mean for wait time is
given by:
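
The usual grouped-data approximation replaces every observation with the midpoint of its class: mean ≈ (m_1 f_1 + ... + m_k f_k) / n, with m_i the midpoint and f_i the frequency of class i. A minimal sketch, using a hypothetical table rather than the exercise's data:

```python
def grouped_mean(classes):
    """Approximate mean of grouped data using class midpoints.

    classes: list of (lower, upper, frequency) tuples.
    """
    n = sum(freq for _, _, freq in classes)
    weighted_sum = sum((lower + upper) / 2 * freq for lower, upper, freq in classes)
    return weighted_sum / n


# Hypothetical table: (lower bound, upper bound, number of observations).
example_classes = [(0, 2, 10), (2, 4, 20), (4, 6, 15), (6, 10, 5)]
print(grouped_mean(example_classes))  # 3.7
```

The midpoints are 1, 3, 5 and 8, so the weighted sum is 10 + 60 + 75 + 40 = 185, and 185 / 50 = 3.7.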
4. As for the standard deviation, we need to compute
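
For grouped data the standard deviation is usually approximated from class midpoints as the square root of the sum of f_i (m_i - mean)^2 divided by n - 1 (sample version) or by n (population version). Which divisor the original solution uses is not shown here, so the sketch below exposes it as a parameter; the table is again hypothetical.

```python
import math


def grouped_std(classes, sample=True):
    """Approximate standard deviation of grouped data from class midpoints.

    classes: list of (lower, upper, frequency) tuples.
    sample:  divide by n - 1 if True (assumed default), by n otherwise.
    """
    n = sum(freq for _, _, freq in classes)
    midpoints = [((lower + upper) / 2, freq) for lower, upper, freq in classes]
    mean = sum(m * f for m, f in midpoints) / n
    squared_dev = sum(f * (m - mean) ** 2 for m, f in midpoints)
    divisor = n - 1 if sample else n
    return math.sqrt(squared_dev / divisor)


# Hypothetical table: (lower bound, upper bound, number of observations).
example_classes = [(0, 2, 10), (2, 4, 20), (4, 6, 15), (6, 10, 5)]
print(round(grouped_std(example_classes), 4))
print(round(grouped_std(example_classes, sample=False), 4))
```

With large samples the two versions differ very little, so the choice of divisor matters mainly for small n.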
Exercise 4
A tour operator extracts from its database a sample of people who bought an organized
holiday to a certain destination in 2011. For the clients extracted, gender and money spent (in
euros) on extra excursions (not included in the package) were recorded:

Establish, using the appropriate indexes, whether the money spent on excursions
is more variable for males or for females.

SOL
To establish whether the money spent on excursions is more variable for males or for females, we compute
the corresponding coefficients of variation:

Since CV_M < CV_F, we can conclude that the money spent on excursions by females is more
variable (spread out) than the money spent on excursions by males.
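
The coefficient of variation is the standard deviation divided by the mean, which makes the spread of two groups with different average spending comparable. A minimal sketch with made-up amounts (not the tour operator's data), assuming the sample standard deviation is used:

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (mean assumed positive)."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical spending in euros, for illustration only.
males = [100, 120, 140]
females = [50, 100, 150]

cv_m = coefficient_of_variation(males)
cv_f = coefficient_of_variation(females)
print(round(cv_m, 3), round(cv_f, 3))
print("females more variable" if cv_f > cv_m else "males more variable")
```

Because the index is a ratio, it is unit-free; it is often reported as a percentage by multiplying by 100.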
Exercise 5
A company has selected a sample of 130 customers in order to plan its next advertising
campaign and to target it by age. The following information about the
number of purchased items and the age classes (young/adult/senior) has been collected:

1. Could you say that the number of items bought differs across age classes? Explain your answer by
calculating the conditional frequencies and building an adequate graph.
2. What is the percentage of customers who are adults and have bought at least 3 items? What is the
percentage of young people among the customers who have bought 4 items?

SOL
1. From the table, we can derive the marginal frequencies:

The marginal frequencies are useful to compute the conditional frequencies of number of
purchased items given age, as follows:

From the table, it is clear that the number of items purchased most frequently by the young
people is 3, by the adults is 4, and by the seniors is just 1. Finally, we provide the following
graph:
2. The percentage of customers who are adults and have bought at least 3 items is:
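
The two questions in this item use different denominators: the first is a joint percentage (out of all customers), the second a conditional percentage (out of the customers who bought 4 items). The sketch below shows the distinction on a hypothetical table of 100 customers; the counts are assumptions and do not reproduce the exercise's data.

```python
# Hypothetical counts (NOT the exercise's data):
# age class -> {number of items bought: number of customers}.
counts = {
    "young":  {1: 5,  2: 10, 3: 15, 4: 10},
    "adult":  {1: 5,  2: 5,  3: 10, 4: 20},
    "senior": {1: 10, 2: 5,  3: 3,  4: 2},
}

total = sum(sum(row.values()) for row in counts.values())

# Joint percentage: adults with at least 3 items, out of ALL customers.
adult_3_plus = sum(n for items, n in counts["adult"].items() if items >= 3)
joint_pct = 100 * adult_3_plus / total

# Conditional percentage: young customers, out of those who bought 4 items.
bought_4 = sum(row[4] for row in counts.values())
young_given_4_pct = 100 * counts["young"][4] / bought_4

print(joint_pct)          # 30.0
print(young_given_4_pct)  # 31.25
```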
Exercise 6
A few months after the opening of the pilot store, TIR has processed the data on the customers
who have signed up for the store membership card. The following table shows the frequency
distribution of the variable "number of purchases during the first three months of being a
cardmember":

1. Write the expression of the cumulative distribution function of "No. of purchases" and draw
its graph.
2. BONUS: Compute the five-number summary for the variable "No. of purchases."

SOL
1.
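
For a discrete variable the cumulative distribution function is a step function: F(x) is the share of observations less than or equal to x, it is 0 to the left of the smallest value, and it jumps at each observed value. A minimal sketch, using a hypothetical frequency distribution rather than the TIR data:

```python
def cumulative_steps(freq):
    """Return sorted (value, F(value)) pairs from a {value: frequency} dict."""
    n = sum(freq.values())
    running = 0
    steps = []
    for value in sorted(freq):
        running += freq[value]
        steps.append((value, running / n))
    return steps


def cdf_at(steps, x):
    """Evaluate the step function F at x (share of observations <= x)."""
    result = 0.0
    for value, cumulative in steps:
        if value <= x:
            result = cumulative
        else:
            break
    return result


# Hypothetical distribution: number of purchases -> number of customers.
example_freq = {0: 2, 1: 3, 2: 4, 3: 1}
steps = cumulative_steps(example_freq)
print(steps)                 # [(0, 0.2), (1, 0.5), (2, 0.9), (3, 1.0)]
print(cdf_at(steps, 1.5))    # 0.5
```

The graph of such a function is flat between consecutive values, with a jump at each value equal to that value's relative frequency.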

2. The five-number summary for the variable "No. of purchases" is:
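
The five-number summary consists of the minimum, first quartile, median, third quartile and maximum. Quartile conventions vary between textbooks and software; the sketch below assumes the rule that places the p-th quantile at position (n + 1)p in the ordered data, which is what Python's `statistics.quantiles` computes with `method="exclusive"`. The frequency distribution is hypothetical.

```python
import statistics


def five_number_summary(freq):
    """Five-number summary from a {value: frequency} dict.

    Quartiles follow the (n + 1) * p position rule (an assumed convention).
    """
    # Expand the frequency table into the ordered list of observations.
    data = sorted(value for value, count in freq.items() for _ in range(count))
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    return data[0], q1, median, q3, data[-1]


# Hypothetical distribution: number of purchases -> number of customers.
example_freq = {0: 2, 1: 3, 2: 4, 3: 1}
print(five_number_summary(example_freq))  # (0, 0.75, 1.5, 2.0, 3)
```

With n = 10 the quartile positions are 2.75, 5.5 and 8.25, so each quartile is interpolated between two neighbouring ordered observations.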
