
1 Bonus Exam 1 - Version A

⇤ ⇤ ⇤ ⇤
1. Question 1 (1 pt)
multi
⇥1 point ⇥0 penalty ⇥Single ⇥Shuffle
We consider the following sample of weights of 3 people taken from a large population:
c(56.4, 71.3, 52.3).
Compute the standard deviation of this sample.

(a) 4.7133
(b) 9.9985 (100%)
(c) 14.0010
(d) 4.7133
(e) None of the proposed answers

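Solution:
Knowing that $V(X) = E(X^2) - E^2(X)$ and that we can estimate it by $\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$, where
$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ is the mean of the sample (here $\bar{x} = 60$), we compute:

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{2}\left[(56.4 - 60)^2 + (71.3 - 60)^2 + (52.3 - 60)^2\right] = 99.97 \approx 9.9985^2$$

and take the square root of this (unbiased) variance estimate to get the standard deviation. The solution is therefore 9.9985.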
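
As a quick check of the arithmetic above, here is a minimal Python sketch. The exam writes its data in R-style `c(...)` notation, so the choice of Python and the variable names are assumptions made for illustration only. It computes the variance with the n - 1 denominator by hand and compares the result with the standard library's `statistics.stdev`.

```python
import math
import statistics

weights = [56.4, 71.3, 52.3]  # the sample from the question

n = len(weights)
mean = sum(weights) / n  # 60, up to floating-point rounding

# Sample variance with the n - 1 denominator
variance = sum((w - mean) ** 2 for w in weights) / (n - 1)
sd = math.sqrt(variance)

print(round(sd, 4))                          # 9.9985
# statistics.stdev also uses the n - 1 denominator
print(round(statistics.stdev(weights), 4))   # 9.9985
```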

⇤ ⇤ ⇤ ⇤
2. Question 2 (1 pt)
multi
⇥1 point ⇥0 penalty ⇥Single ⇥Shuffle
A five-number summary is a set of descriptive statistics that provides information on a dataset. It
consists of the minimum value of the data, the first, second, and third quartile, and the maximum
value. Here is such a five-number summary for a given dataset:

• max: 12
• Q3: 10.1
• Q2: 9
• Q1: 5
• min: -3

Which of the following 4 boxplots corresponds to the dataset summarised above?

(a) A
(b) B (100%)
(c) C
(d) D
(e) Both A and C correspond to the dataset
(f) Both B and D correspond to the dataset

Solution: To build the boxplot:


1. Draw a box between the quartiles Q1 = Q(0.25) and Q3 = Q(0.75)
2. Draw a line in the box for the median M = Q(0.5) = 9.
3. Compute the lower bound LB = Q1 - 1.5 · (Q3 - Q1) = 5 - 1.5 · (10.1 - 5) = -2.65.
4. Compute the upper bound UB = Q3 + 1.5 · (Q3 - Q1) = 10.1 + 1.5 · (10.1 - 5) = 17.75.

Figure 1: Boxplots.

5. The ends of the whiskers are the lowest datum still within 1.5 · IQR of the lower quartile (LW), and
the highest datum still within 1.5 · IQR of the upper quartile (UW).
6. Represent the data smaller than LW or larger than UW by a symbol (a dot in this case).
To decide:

• Distance UB - Q3: 7.65
• Distance Q3 - Median: 1.1
• Distance Median - Q1: 4
• Distance Q1 - LB: 7.65
• The median is closer to Q3 than to Q1 (can be checked visually).
• The minimum value (-3) is smaller than the lower bound LB = -2.65, so it is represented by a dot.
The maximum value (12) is below the upper bound UB = 17.75, so the upper whisker ends at the maximum.
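
The bounds and comparisons used to pick the boxplot can be checked numerically. The following is a minimal Python sketch under the 1.5 · IQR rule described in the steps above; the variable names are hypothetical, and only the five-number summary from the question is used.

```python
# Five-number summary from the question
minimum, q1, median, q3, maximum = -3, 5, 9, 10.1, 12

iqr = q3 - q1                     # interquartile range, about 5.1
lower_bound = q1 - 1.5 * iqr      # about -2.65
upper_bound = q3 + 1.5 * iqr      # about 17.75

# A value is drawn as a separate symbol when it lies outside the bounds
min_is_outlier = minimum < lower_bound
max_is_outlier = maximum > upper_bound

# Position of the median line inside the box
median_closer_to_q3 = (q3 - median) < (median - q1)

print(round(lower_bound, 2), round(upper_bound, 2))
print(min_is_outlier, max_is_outlier, median_closer_to_q3)
```

With these numbers, only the minimum falls outside the bounds, and the median line sits nearer the upper edge of the box.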

⇤ ⇤ ⇤ ⇤
3. Question 3 (1 pt)
multi
⇥1 point ⇥0 penalty ⇥Single ⇥Shuffle
Consider the following characteristics observed in a class of students: Gender, encoded as "0" (male)
and "1" (female), and Average of grades (AG) in the first year of one of the GSEM bachelors. Hint: grades
go from 0 to 6 in 0.10 point increments. What are the types of these two variables?

(a) Gender is discrete, AG is discrete


(b) Gender is discrete, AG is continuous
(c) Gender is nominal, AG is continuous
(d) None of the proposed answers
(e) Gender is nominal, AG is discrete (100%)

Solution: Gender is nominal, AG is discrete. There is no order between genders, so this variable is not
ordinal; it is nominal since it is categorical/qualitative. AG is not continuous because it moves in fixed
increments. It would have been continuous only if it could take any value in the set of real numbers,
with increments as small or as large as wanted, rational or irrational.

⇤ ⇤ ⇤ ⇤
4. Question 4 (1 pt)
multi
⇥1 point ⇥0 penalty ⇥Single ⇥Shuffle
Consider the following sample of 6 observations:

x = c(6,3,4,2,4,1)
What is the median of this sample?

(a) 1.75
(b) 4.5
(c) 3
(d) 3.33
(e) 4
(f) None of the proposed answers (100%)

Solution:
Rank the observations: 1 2 3 4 4 6
Compute the rank r0.5: 0.5 · (n - 1) + 1 = 0.5 · 5 + 1 = 3.5
Compute the median by interpolating between the 3rd and 4th ranked values: 3 + 0.5 · (4 - 3) = 3.5
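
The rank-and-interpolate steps above can be sketched in Python as follows. The helper `quantile` is a hypothetical function written for this illustration (it is not taken from the exam); it assumes the rank formula p · (n - 1) + 1 used in the solution, with linear interpolation between neighbouring ranked values.

```python
import statistics

x = [6, 3, 4, 2, 4, 1]  # the sample from the question

def quantile(data, p):
    """Quantile by linear interpolation at the 1-based rank p * (n - 1) + 1."""
    ordered = sorted(data)
    n = len(ordered)
    rank = p * (n - 1) + 1      # 3.5 for p = 0.5 and n = 6
    lower = int(rank)           # integer part of the rank
    frac = rank - lower         # fractional part of the rank
    if lower >= n:              # p = 1 lands exactly on the largest value
        return ordered[-1]
    # ordered is 0-based, so the value at 1-based rank k is ordered[k - 1]
    return ordered[lower - 1] + frac * (ordered[lower] - ordered[lower - 1])

median = quantile(x, 0.5)
print(median)                  # 3.5
print(statistics.median(x))    # 3.5, the standard library agrees
```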

⇤ ⇤ ⇤ ⇤
5. Question 5 (1 pt)
multi
⇥1 point ⇥0 penalty ⇥Single ⇥Shuffle
Consider the histogram, obtained from a sample x having n = 85 observations.
Which one of the following statements is correct based on the following figures?

Figure 2: Histogram.

(a) Quantile-Quantile Plot A could correspond to sample x


(b) Quantile-Quantile Plot B could correspond to sample x (100%)
(c) It is not possible to determine due to the small sample size
(d) It is not possible to determine since the histogram is for nominal variables and the Quantile-
Quantile plot for continuous variables

Solution: A Quantile-Quantile plot is a more sensitive tool for this assessment: it plots the
quantiles of the unknown distribution against the quantiles of a reference distribution. In this case, the
reference distribution is the Normal distribution, which is why we speak here of a Normal Quantile-
Quantile plot.
If the data had been Normally distributed, Plot A could have been obtained. Since the histogram makes
it extremely unlikely that the data are Normally distributed, we conclude that Plot B corresponds to the
data used to produce the histogram.

⇤ ⇤ ⇤ ⇤
6. Question 6 (1 pt)
multi
⇥1 point ⇥0 penalty ⇥Single ⇥Shuffle
A researcher is curious about the IQ of students at the University of Geneva. The entire group of
students is an example of a:

(a) Parameter
(b) Statistics
(c) Population (100%)
(d) Sample
(e) None of the proposed answers

Solution: The researcher is interested in the population of all students at the UNIGE. To estimate
parameters, she/he uses a sample from this entire group of students.

⇤ ⇤ ⇤ ⇤
7. Question 7 (1 pt)
multi
⇥1 point ⇥0 penalty ⇥Single ⇥Shuffle
An airline’s data analyst produces the following contingency table based on the customers.

                     Destination
Luggage    Inside U.E.    Outside U.E.
Yes             55             134
No              23              13

According to this table, what is the estimated probability that a customer travels outside the European
Union, given that the customer has no luggage? If necessary, round the solution to four
decimals.

(a) 13/23
(b) 0.3611 (100%)
(c) 13/(134 + 13)
(d) 0.0578
(e) It is not possible to compute based on the information

Solution:
A is the event "travel outside the U.E."; B is the event "without luggage".

$$P(A \mid B) = \frac{P(A \cap B)}{P(B)} = \frac{13}{23 + 13} \approx 0.3611$$
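
The same conditional probability can be read off the table programmatically. This is a minimal Python sketch; the dictionary layout and key names are assumptions chosen for illustration, and only the four counts from the table are used.

```python
# Counts from the contingency table, keyed by (luggage, destination)
counts = {
    ("yes", "inside"): 55,
    ("yes", "outside"): 134,
    ("no", "inside"): 23,
    ("no", "outside"): 13,
}

total = sum(counts.values())  # 225 customers in total

# A: travels outside the U.E.; B: has no luggage
p_a_and_b = counts[("no", "outside")] / total
p_b = (counts[("no", "inside")] + counts[("no", "outside")]) / total

p_a_given_b = p_a_and_b / p_b
print(round(p_a_given_b, 4))  # 0.3611
```

The total cancels in the ratio, which is why the solution can work directly with the counts in the "No" row.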

Total of marks: 7
