
Statistical Problems and their Solutions


Written by MSc. Esdras BIZWINAYO

Kigali, Rwanda

February 5, 2019
1. Test of Data Manager at Hospital Kabgayi/50 Marks
1.1 Mathematics/20 Marks

1. Study completely the following function:/10 Marks

\[ y \equiv f(x) = \frac{2x^2}{x^2 - 1}. \tag{1.1.1} \]

Solution

(i) The Domain of Definition


Let us first of all find the domain of definition of our function as follows:

\[ \operatorname{Dom} f = \mathbb{R} \setminus \{x : x^2 - 1 = 0\} = \mathbb{R} \setminus \{-1, 1\}. \]

(ii) Even or Odd


The function in Equation (1.1.1) is even since f (−x) = f (x).
(iii) Asymptotes: An asymptote is a line that the graph of a function approaches.
Before finding the asymptotes, it is advisable to factor the function and check for holes. A hole exists at a value of x that makes both the numerator and the denominator equal to zero.
The function in Equation (1.1.1) can be written in the factored form

\[ f(x) = \frac{2x^2}{(x - 1)(x + 1)}. \tag{1.1.2} \]

The numerator and the denominator have no common factor, so the graph has no hole.

Now let us find the asymptotes of the function in Equation (1.1.1).
(a) Vertical asymptotes
A vertical asymptote has the form x = k, where k is a constant. The value of k is found by setting the denominator of the reduced function equal to 0. Therefore, our function has two vertical asymptotes:
\[ x = 1 \quad \text{and} \quad x = -1. \tag{1.1.3} \]
(b) Horizontal asymptote
A horizontal asymptote has the form y = k, where k is a constant. The value of k is the limit of the function as x goes to infinity. Therefore
\[ y = \lim_{x \to \pm\infty} \frac{2x^2}{x^2 - 1} = 2. \]

Thus, the horizontal asymptote is y = 2.


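(c) Oblique asymptote

This is a line of the form y = mx + n with m ≠ 0, where m and n are calculated as
\[ m = \lim_{x \to \infty} \frac{f(x)}{x}, \qquad n = \lim_{x \to \infty} \left[ f(x) - mx \right]. \]

Therefore,
\[ m = \lim_{x \to \infty} \frac{2x^2}{x^3 - x} = 0 \]
and
\[ n = \lim_{x \to \infty} \frac{2x^2}{x^2 - 1} = 2. \]
Since m = 0, the line y = mx + n reduces to the horizontal asymptote y = 2. Thus, there is no oblique asymptote; there are only vertical and horizontal asymptotes.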
(iv) First derivative and its sign; intervals of increase and of decrease
The first derivative of our function is given as follows

\[
\begin{aligned}
f'(x) &= \frac{4x\,(x^2 - 1) - 2x^2\,(2x)}{(x^2 - 1)^2} \\
      &= \frac{4x^3 - 4x - 4x^3}{(x^2 - 1)^2} \\
f'(x) &= \frac{-4x}{(x^2 - 1)^2}.
\end{aligned}
\]

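| x                  | ]−∞, −1[ | −1 | ]−1, 0[ | 0 | ]0, 1[ | 1 | ]1, +∞[ |
| Sign of f'(x)      |    +     | ‖  |    +    | 0 |   −    | ‖ |    −    |
| Variations of f(x) |    ↗     | ‖  |    ↗    | 0 |   ↘    | ‖ |    ↘    |

(‖ marks a value of x at which the function is not defined.)

1. The function is increasing in the intervals ]−∞, −1[ and ]−1, 0[.
2. The function is decreasing in the intervals ]0, 1[ and ]1, +∞[.
3. The function has a local maximum at x = 0, where f(0) = 0.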

(v) Second derivative and its sign; intervals of concavity
The second derivative of our function is given as follows

\[
\begin{aligned}
f''(x) &= \frac{-4\,(x^2 - 1)^2 + 4x \cdot 2\,(x^2 - 1)\,(2x)}{(x^2 - 1)^4} \\
       &= \frac{(x^2 - 1)\left[-4\,(x^2 - 1) + 16x^2\right]}{(x^2 - 1)^4} \\
       &= \frac{-4x^2 + 4 + 16x^2}{(x^2 - 1)^3} \\
f''(x) &= \frac{4\,(3x^2 + 1)}{(x^2 - 1)^3}.
\end{aligned}
\]

f''(x) = 0 if 3x^2 + 1 = 0. It is obvious that there is no x ∈ R such that 3x^2 + 1 = 0, so the graph has no inflection point; the sign of f''(x) is the sign of (x^2 − 1)^3.

| x              | ]−∞, −1[ | −1 | ]−1, 1[ | 1 | ]1, +∞[ |
| Sign of f''(x) |    +     | ‖  |    −    | ‖ |    +    |
| Concavity of f |    up    | ‖  |  down   | ‖ |   up    |

1. The function is concave up in the intervals ]−∞, −1[ and ]1, +∞[.
2. The function is concave down in the interval ]−1, 1[ (in particular f''(0) = −4 < 0).
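As a quick sanity check on the two derivatives, the closed forms can be compared with central finite differences. This is only a numerical sketch: the helper names `f`, `f1`, `f2` and `central_diff`, the step size and the sample points are illustrative choices, not part of the exam solution.

```python
def f(x):
    """The studied function f(x) = 2x^2 / (x^2 - 1)."""
    return 2 * x**2 / (x**2 - 1)

def f1(x):
    """Closed form of the first derivative, -4x / (x^2 - 1)^2."""
    return -4 * x / (x**2 - 1) ** 2

def f2(x):
    """Closed form of the second derivative, 4(3x^2 + 1) / (x^2 - 1)^3."""
    return 4 * (3 * x**2 + 1) / (x**2 - 1) ** 3

def central_diff(g, x, h=1e-5):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

# Compare the closed forms with numerical derivatives away from x = -1 and x = 1.
for x in (-3.0, -0.5, 0.5, 3.0):
    assert abs(central_diff(f, x) - f1(x)) < 1e-6
    assert abs(central_diff(f1, x) - f2(x)) < 1e-6
    print(x, f(x), f1(x), f2(x))
```

The printed signs of `f1` and `f2` at the sample points can be read against the intervals of increase, decrease and concavity stated above.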

(vi) Sketch of the graph

Figure 1.1: Graph of f(x) = 2x^2 / (x^2 − 1)

2. Solve the following differential equation:/10 Marks


\[ y'' + y' - 2y = x^2. \tag{1.1.4} \]

Solution

The solution of Equation (1.1.4) is given by

\[ y = y_h + y_p; \tag{1.1.5} \]

where;

y_h is the homogeneous solution and
y_p is the particular solution.

In order to solve Equation (1.1.4), we first solve its corresponding homogeneous equation below
\[ y'' + y' - 2y = 0. \tag{1.1.6} \]

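Let y = e^{λx}, where x is the independent variable of Equation (1.1.4); then

\[ y' = \lambda e^{\lambda x} \tag{1.1.7} \]
and hence
\[ y'' = \lambda^2 e^{\lambda x}. \tag{1.1.8} \]
By substituting Equations (1.1.7) and (1.1.8) into Equation (1.1.6), we obtain
\[
\begin{aligned}
y'' + y' - 2y = 0 &\iff \lambda^2 e^{\lambda x} + \lambda e^{\lambda x} - 2 e^{\lambda x} = 0 \\
&\iff e^{\lambda x}\left(\lambda^2 + \lambda - 2\right) = 0 \\
&\iff \lambda^2 + \lambda - 2 = 0.
\end{aligned}
\]

By solving λ^2 + λ − 2 = 0, we get

\[
\begin{aligned}
\lambda^2 + \lambda - 2 = 0 &\iff \lambda^2 - \lambda + 2\lambda - 2 = 0 \\
&\iff \lambda(\lambda - 1) + 2(\lambda - 1) = 0 \\
&\iff (\lambda + 2)(\lambda - 1) = 0
\end{aligned}
\]

which implies that

\[ \lambda_1 = -2 \quad \text{and} \quad \lambda_2 = 1. \]
Therefore, the solution of the homogeneous Equation (1.1.6) is

\[ y_h = c_1 e^{-2x} + c_2 e^{x}, \quad \text{where } c_1 \text{ and } c_2 \text{ are constants.} \]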

Now, we find the particular solution. Let y_p = Ax^2 + Bx + C, then

\[ y_p' = 2Ax + B \tag{1.1.9} \]

and hence
\[ y_p'' = 2A. \tag{1.1.10} \]
By substituting Equations (1.1.9) and (1.1.10) into Equation (1.1.4), we obtain the values of A, B and C as follows

\[ 2A + 2Ax + B - 2Ax^2 - 2Bx - 2C = x^2 \implies -2Ax^2 + (2A - 2B)x + 2A + B - 2C = x^2. \tag{1.1.11} \]

Identifying the coefficients on both sides of Equation (1.1.11), we obtain

\[
\begin{aligned}
-2A = 1 &\implies A = -\frac{1}{2}, \\
2A - 2B = 0 &\implies B = -\frac{1}{2}, \\
2A + B - 2C = 0 &\implies C = -\frac{3}{4}.
\end{aligned}
\]
This implies that the particular solution is

\[ y_p = -\frac{1}{2}x^2 - \frac{1}{2}x - \frac{3}{4}. \tag{1.1.12} \]
Thus, the general solution of Equation (1.1.4), written in the variable x, is

\[ y = c_1 e^{-2x} + c_2 e^{x} - \frac{1}{2}x^2 - \frac{1}{2}x - \frac{3}{4}. \]
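The general solution can be checked by substituting it back into y'' + y' − 2y and confirming that the result equals x^2 for arbitrary constants. The sketch below does this numerically; the function names and the sample values of c1, c2 and x are arbitrary illustrative choices, and the solution is written in the variable x of Equation (1.1.4).

```python
import math

def y(x, c1, c2):
    """General solution: c1*e^(-2x) + c2*e^x - x^2/2 - x/2 - 3/4."""
    return c1 * math.exp(-2 * x) + c2 * math.exp(x) - 0.5 * x**2 - 0.5 * x - 0.75

def y_prime(x, c1, c2):
    """First derivative of the general solution."""
    return -2 * c1 * math.exp(-2 * x) + c2 * math.exp(x) - x - 0.5

def y_second(x, c1, c2):
    """Second derivative of the general solution."""
    return 4 * c1 * math.exp(-2 * x) + c2 * math.exp(x) - 1.0

def residual(x, c1, c2):
    """Left-hand side minus right-hand side of y'' + y' - 2y = x^2."""
    return y_second(x, c1, c2) + y_prime(x, c1, c2) - 2 * y(x, c1, c2) - x**2

# The residual should vanish (up to rounding) for any constants and any x.
for c1, c2 in ((0.0, 0.0), (1.0, -2.0), (3.5, 0.25)):
    for x in (-1.0, 0.0, 0.7, 2.0):
        assert abs(residual(x, c1, c2)) < 1e-9
print("residual is zero at all sampled points")
```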

1.2 Statistics/30 Marks

1. Differentiate between a qualitative variable and a quantitative variable, giving an example of each./10 Marks

Solution

A qualitative variable is a variable that describes a quality of something; it is sometimes referred to as categorical. For instance, the colors in the light spectrum form a qualitative variable. In contrast, a quantitative variable is a variable that expresses the quantity or number of something; it is something that can be measured. For example, temperature, speed, area, population, voltage and time are quantitative variables.
2. Given the following frequency distribution of students and their lengths in cm:

i Class Frequency
1 [120, 124[ 2
2 [124, 128[ 10
3 [128, 132[ 4
4 [132, 136[ 8
5 [136, 140[ 4
6 [140, 144[ 6
7 [144, 148[ 5
8 [148, 152[ 8
9 [152, 156[ 1
10 [156, 160[ 2

Table 1.1: Grouped data for 50 students’ length

Calculate;/10 Marks
(a) Average M
(b) Mode M0
(c) Median Me
(d) Variation Coefficient CV and interpret it
(e) Standard deviation S
(f) Draw Histogram Hg

Solution

We need the following table, based on Table (1.1), in order to calculate the above quantities. Here CF is the cumulative frequency, x_i is the class midpoint and x̄ = 6900/50 = 138 is the mean.

| i     | Class      | Frequency (f_i) | CF | x_i | f_i x_i | x_i − x̄ | f_i (x_i − x̄)^2 |
| 1     | [120, 124[ | 2               | 2  | 122 | 244     | −16     | 512             |
| 2     | [124, 128[ | 10              | 12 | 126 | 1260    | −12     | 1440            |
| 3     | [128, 132[ | 4               | 16 | 130 | 520     | −8      | 256             |
| 4     | [132, 136[ | 8               | 24 | 134 | 1072    | −4      | 128             |
| 5     | [136, 140[ | 4               | 28 | 138 | 552     | 0       | 0               |
| 6     | [140, 144[ | 6               | 34 | 142 | 852     | 4       | 96              |
| 7     | [144, 148[ | 5               | 39 | 146 | 730     | 8       | 320             |
| 8     | [148, 152[ | 8               | 47 | 150 | 1200    | 12      | 1152            |
| 9     | [152, 156[ | 1               | 48 | 154 | 154     | 16      | 256             |
| 10    | [156, 160[ | 2               | 50 | 158 | 316     | 20      | 800             |
| Total |            | 50              |    |     | 6900    |         | 4960            |

Table 1.2: Frequency distribution of students and their length

(a) The average M

The average length M, as can be observed in Table (1.2), is M = x̄ = 6900/50 = 138 cm.
(b) The mode M0
In order to find the mode, we first find the modal class. This is the class with the highest frequency. According to Table (1.2), the modal class is [124, 128[ with 10 students. Now, let us recall the formula for computing the mode. The mode is given by
\[ M_0 = L_{mo} + \left( \frac{D_1}{D_1 + D_2} \right) i; \tag{1.2.1} \]
where;
L_mo is the lower limit of the modal class,
D_1 is the difference between the frequency of the modal class and the frequency of the class before the modal class,
D_2 is the difference between the frequency of the modal class and the frequency of the class after the modal class,
i is the class width.
Since the classes are half-open intervals, the lower limit of the modal class is L_mo = 124. Therefore, by applying Equation (1.2.1), we get
\[
\begin{aligned}
M_0 &= 124 + \left( \frac{10 - 2}{(10 - 2) + (10 - 4)} \right) 4 \\
    &= 124 + \left( \frac{8}{14} \right) 4 \\
    &= 124 + \frac{16}{7} \\
M_0 &\approx 126.3\ \text{cm}.
\end{aligned}
\]

(c) The median Me

In order to find the median, we first find the median class. This is the first class whose cumulative frequency is at least n/2, where n is the total frequency; in our problem n = 50, and hence n/2 = 25. According to Table (1.2), the median class is [136, 140[ with 4 students. Now, let us recall the formula for computing the median. The median is given by
\[ M_e = L_m + \left( \frac{\frac{n}{2} - F}{f_m} \right) i; \tag{1.2.2} \]
where;

L_m is the lower boundary of the median class,
n is the total frequency,
F is the cumulative frequency before the median class,
f_m is the frequency of the median class,
i is the class width.

Since the classes are half-open intervals, L_m = 136, and the cumulative frequency before the median class is F = 24. Therefore, by applying Equation (1.2.2), we get

\[
\begin{aligned}
M_e &= 136 + \left( \frac{25 - 24}{4} \right) 4 \\
    &= 136 + 1 \\
M_e &= 137\ \text{cm}.
\end{aligned}
\]

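(d) The coefficient of variation CV and its interpretation

By formula, the coefficient of variation CV is given by
\[ CV = \frac{S}{\bar{x}} \times 100\%; \tag{1.2.3} \]

where;

S is the standard deviation and,
x̄ is the average.

For grouped data, each squared deviation of a class midpoint x_i from the mean is weighted by the class frequency f_i. With the midpoints and frequencies of Table (1.2) and x̄ = 6900/50 = 138 cm, the standard deviation S is calculated as follows

\[ S = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{10} f_i (x_i - \bar{x})^2} = \sqrt{\frac{4960}{49}} \approx 10.06\ \text{cm}. \]

Thus, the coefficient of variation CV is given as below

\[ CV = \frac{10.06 \times 100}{138} \approx 7.3\%. \]

Interpretation: the standard deviation is about 7.3% of the mean, so the lengths are fairly concentrated around the average.

(e) The standard deviation

The standard deviation S, as computed in the previous point, is S ≈ 10.06 cm.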
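The grouped-data formulas used in parts (a) to (e) can be sketched in a few lines of Python. The sketch assumes half-open classes of width 4 whose lower limits are 120, 124, ..., 156, uses class midpoints as representative values, and weights squared deviations by class frequency with divisor n − 1; the variable names are illustrative.

```python
import math

# Classes of Table 1.1 as (lower limit, frequency); every class has width 4.
classes = [(120, 2), (124, 10), (128, 4), (132, 8), (136, 4),
           (140, 6), (144, 5), (148, 8), (152, 1), (156, 2)]
width = 4

lowers = [lo for lo, _ in classes]
freqs = [f for _, f in classes]
mids = [lo + width / 2 for lo in lowers]      # class midpoints x_i
n = sum(freqs)                                # total frequency

# Mean: frequency-weighted average of the midpoints.
mean = sum(f * x for f, x in zip(freqs, mids)) / n

# Mode: interpolate inside the class with the highest frequency.
k = freqs.index(max(freqs))
d1 = freqs[k] - (freqs[k - 1] if k > 0 else 0)
d2 = freqs[k] - (freqs[k + 1] if k + 1 < len(freqs) else 0)
mode = lowers[k] + d1 / (d1 + d2) * width

# Median: first class whose cumulative frequency reaches n/2.
cumulative = 0
median = None
for lo, f in classes:
    if cumulative + f >= n / 2:
        median = lo + (n / 2 - cumulative) / f * width
        break
    cumulative += f

# Sample standard deviation (divisor n - 1) and coefficient of variation.
variance = sum(f * (x - mean) ** 2 for f, x in zip(freqs, mids)) / (n - 1)
sd = math.sqrt(variance)
cv = 100 * sd / mean

print(f"mean={mean:.2f} mode={mode:.2f} median={median:.2f} sd={sd:.2f} cv={cv:.2f}%")
```

Changing the `classes` list is enough to reuse the sketch for another grouped table with equal class widths.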

(f) The Histogram Hg

Figure 1.2: Histogram for 50 students and their length

3. A sample of 100 people shows that their average age is 16. Test the hypothesis that the average age is less than 19 at α = 5%, given that σ = 2.1. /10 Marks.

Solution

This is a one-sided test on the mean, where we are interested in testing

• H0 : µ = µ0

• H1 : µ < µ0

Figure 1.3: One-sided sample test graphical representation

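Under H0 : µ = µ0, the probability of a type I error (rejecting H0 when it is true) is α, that is,

\[ \Pr\left( -z_\alpha < \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} \right) = 1 - \alpha; \tag{1.2.4} \]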

where;

α is the significance level,


zα is the critical value that can be found on Normal distribution probability tables,
X̄ is the sample mean,
µ0 is the assumed population mean,
σ is the population standard deviation,
n is the sample size.

Now, for our problem, we have:


Step 1:

H0 : µ = 19
H1 : µ < 19,

Step 2: α = 0.05
Step 3: The test statistic is the sample mean X̄. We reject H0 if
\[ \bar{X} < \mu_0 - z_\alpha \frac{\sigma}{\sqrt{n}} = 19 - 1.645 \times \frac{2.1}{10} = 18.65. \]

Or: We compute

\[ Z_{statistic} = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{16 - 19}{2.1 / 10} = \frac{-3}{0.21} = -14.3. \]

Step 4: Since X̄ = 16 < 18.65, we reject H0. Or, since Z_statistic = −14.3 < −z_α = −1.645, we reject H0.
Thus, at the 5% significance level, we conclude that the average age is less than 19.
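The four steps of this one-sided z-test can be reproduced with Python's standard library. This is a minimal sketch assuming a known population standard deviation and a normally distributed sample mean; `NormalDist` comes from the `statistics` module (Python 3.8+), and the variable names are illustrative.

```python
import math
from statistics import NormalDist

n = 100          # sample size
x_bar = 16.0     # sample mean
mu_0 = 19.0      # mean under H0
sigma = 2.1      # known population standard deviation
alpha = 0.05     # significance level

standard_error = sigma / math.sqrt(n)
z_stat = (x_bar - mu_0) / standard_error

# Left-tail critical value: reject H0 when z_stat < z_crit.
z_crit = NormalDist().inv_cdf(alpha)

# Equivalent rejection threshold expressed on the scale of the sample mean.
x_bar_threshold = mu_0 + z_crit * standard_error

# p-value of the left-tailed test.
p_value = NormalDist().cdf(z_stat)

reject_h0 = z_stat < z_crit
print(f"z = {z_stat:.2f}, critical value = {z_crit:.3f}, "
      f"threshold = {x_bar_threshold:.2f}, reject H0: {reject_h0}")
```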

END
2. Test on position of Agriculture Statistics/50 Marks
I. Dr. Lu is an expert in Statistics. He helps institutions to make sound decisions at the right time. He carries out various activities including sampling, data collection and analysis.
You are required to demonstrate that you can replace Dr. Lu by answering the following questions:
1. What is the difference between data and information?/ 5 marks

Solution

The major difference between data and information is that data are raw material that is still to be processed (characters, text, words, numbers, pictures, sound or video), while information is processed data that is useful and usually formatted in a manner that humans can understand. For instance, the ticket sales of a band on tour are data, whereas a sales report by region and venue, which tells us which venue is most profitable, is information. Other examples of data include student data on admission forms, data on citizens, survey data and students' examination data.
2. In data analysis, Dr. Lu calculates average and standard deviation. Define each of those terms
and show why they have to be calculated./ 5 marks

Solution

• In statistics, an average is defined as a number that measures the central tendency of a given set of numbers. There are a number of different averages, including but not limited to the mean, the median and the mode. The average has to be calculated because it summarizes the data being studied in a single typical value. The mean, sometimes called the arithmetic mean, is obtained by adding all numbers of the data set and then dividing the sum by the total count of numbers. The median is the middle number in a sequence of numbers arranged in increasing order. The mode is the number that occurs most often within a set of numbers. The range, which is the difference between the highest and lowest values within a set of numbers, is often reported alongside these averages, but it measures spread rather than central tendency.

• In statistics, the standard deviation (SD, also represented by the Greek letter sigma, σ) is a measure that is used to quantify the amount of variation or dispersion of a set of data values. It has to be calculated because it shows how far the data lie from the mean. For instance, a low standard deviation indicates that the data points tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the data points are spread out over a wider range of values.

3. Explain how the work carried out by Dr. Lu is very important for agricultural development. /5
marks
4. Please explain how you can analyze agricultural data collected from the field and how your report
can be utilized in formulating the right policies. /10 marks
II Consider the research question: How many people are food insecure in Rwanda?


a). Formulate the objective of the study. /5 marks


b). Develop a methodology that generates results that accurately represent the food security at
National, Provincial and District levels and which facilitates comparison between rural and urban
households. /10 marks
c). Describe (show) major tables and charts that would be used to present results. /5 marks.

!!!!!!!!!!!!!!!!Good Luck!!!!!!!!!!!!!!!!!!!!!!!
3. Test on position of Agriculture Statistics Specialist/50 Marks
1. The following are some particulars of weight distribution of boys and girls in class (10 marks)

Item Boys Girls


Number 100 50
Mean Weight 60kg 45kg
Variance 9 4

Table 3.1: Particulars of weight distribution of boys and girls in class

a. Which one of the distributions is more variable.


b. Find the standard deviation of the combined data.

Solution

a. Let us compute their coefficients of variation using the data in Table (3.1).

\[ CV_{Boys} = \frac{\sqrt{9} \times 100}{60} = \frac{300}{60} = 5\% \quad \text{and} \quad CV_{Girls} = \frac{\sqrt{4} \times 100}{45} = \frac{200}{45} \approx 4.4\%. \]
Since 5% is greater than 4.4%, we conclude that the weight distribution of boys is more variable.
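For part b, the standard deviation of the combined data can be obtained from the group sizes, means and variances alone. The sketch below uses the usual pooling identity, in which each group contributes its own variance plus the squared distance of its mean from the combined mean; it assumes the variances in Table (3.1) were computed with divisor n (if they were computed with divisor n − 1, the result changes slightly). Variable names are illustrative.

```python
import math

# (size, mean weight in kg, variance) for each group, from Table 3.1.
groups = {"Boys": (100, 60.0, 9.0), "Girls": (50, 45.0, 4.0)}

total_n = sum(n for n, _, _ in groups.values())

# Combined mean: size-weighted average of the group means.
combined_mean = sum(n * m for n, m, _ in groups.values()) / total_n

# Combined variance: within-group variance plus between-group displacement.
combined_var = sum(
    n * (v + (m - combined_mean) ** 2) for n, m, v in groups.values()
) / total_n

combined_sd = math.sqrt(combined_var)
print(f"combined mean = {combined_mean:.2f} kg, combined SD = {combined_sd:.2f} kg")
```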
2. The mean and standard deviation of a population of 100 observations were calculated as 40 and
5.1 respectively by a student who took by mistake 50 instead of 40 for one observation. Calculate the
correct mean and standard deviation. (10 marks)

Solution

• The correct mean can be found by writing the wrong and the correct mean in terms of the 99 unaffected observations:
\[ 40 = \frac{1}{100} \sum_{i=1}^{99} x_i + \frac{50}{100}, \qquad \text{Correct Mean} = \frac{1}{100} \sum_{i=1}^{99} x_i + \frac{40}{100} \iff 40 - \text{Correct Mean} = \frac{50}{100} - \frac{40}{100}. \]
This implies that
\[ -\text{Correct Mean} = \frac{10}{100} - \frac{4000}{100} = -39.9. \]
Thus, the correct mean is 39.9.
• The correct standard deviation can be found in the same way from the sum of squares. For a population of 100 observations, σ^2 = (1/100) Σ x_i^2 − x̄^2, so the wrong sum of squares is
\[ \sum_{i=1}^{100} x_i^2 = 100 \left[ (5.1)^2 + 40^2 \right] = 162601. \]
Replacing 50^2 by 40^2 gives the correct sum of squares 162601 − 2500 + 1600 = 161701, and hence
\[ \sigma_{correct}^2 = \frac{161701}{100} - (39.9)^2 = 1617.01 - 1592.01 = 25. \]
Thus, the correct standard deviation is 5.
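The correction can be carried out mechanically from the reported summary values. The sketch below rebuilds the sum and the sum of squares implied by the reported mean and standard deviation, swaps the miscopied observation, and recomputes both statistics; it assumes the standard deviation was computed with divisor n (population formula), and the variable names are illustrative.

```python
import math

n = 100
reported_mean = 40.0
reported_sd = 5.1
wrong_value, right_value = 50.0, 40.0

# Totals implied by the reported (wrong) statistics.
sum_x = n * reported_mean
sum_x2 = n * (reported_sd**2 + reported_mean**2)

# Replace the miscopied observation in both totals.
sum_x += right_value - wrong_value
sum_x2 += right_value**2 - wrong_value**2

correct_mean = sum_x / n
correct_sd = math.sqrt(sum_x2 / n - correct_mean**2)
print(f"correct mean = {correct_mean:.2f}, correct SD = {correct_sd:.2f}")
```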


3. The data on the weight of male calves are (10 marks)

71 93 101 84 88 117 86 86 93 86 106

Table 3.2: Data on the weight of male calves

a. Formulate the null hypothesis (H0 ) that the mean weight of male calves is 83kg, versus a two-sided
alternative (Ha ). Take α = 0.05.
b. Assume a normal population. Compute the test statistic.
c. What final conclusion could you make based up on results of the investigation?

Solution

a. The two-sided test in this question is as follows:

H0 : µ = 83; this is the null hypothesis that the mean weight of male calves is equal to 83 kg,
H1 : µ ≠ 83; this is the alternative hypothesis that the mean weight of male calves is not equal to 83 kg.

b. Since we do not know the population standard deviation, we use the sample standard deviation and the sample mean. Here, we use a t-test with the following test statistic:
\[ T = \frac{\bar{X} - \mu_0}{s / \sqrt{n}}; \tag{3.0.1} \]

where;

X̄ : is the sample mean,


µ0 : is mean weight of male calves that we are testing,
s : is the sample standard deviation and,
n : is the sample size.

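From Table (3.2), we compute X̄ and s as follows:

\[
\begin{aligned}
\bar{X} &= \frac{1}{n} \sum_{i=1}^{n} X_i \\
        &= \frac{1}{11} (71 + 93 + 101 + 84 + 88 + 117 + 86 + 86 + 93 + 86 + 106) \\
        &= \frac{1011}{11} \approx 91.9\ \text{kg},
\end{aligned}
\]
\[
\begin{aligned}
s^2 &= \frac{1}{n - 1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2 = \frac{1532.91}{10} = 153.291, \\
s   &= \sqrt{153.291} \approx 12.38\ \text{kg}.
\end{aligned}
\]

Now, by applying Equation (3.0.1), we get

\[ T = \frac{91.9 - 83}{12.38 / \sqrt{11}} \approx 2.38. \tag{3.0.2} \]

Or we can compute the acceptance interval for X̄ under H0:

\[
\begin{aligned}
\left[ \mu_0 - t_{\alpha/2} \frac{s}{\sqrt{n}},\ \mu_0 + t_{\alpha/2} \frac{s}{\sqrt{n}} \right]
&= \left[ 83 - 2.228 \times \frac{12.38}{\sqrt{11}},\ 83 + 2.228 \times \frac{12.38}{\sqrt{11}} \right] \\
&= [83 - 8.32,\ 83 + 8.32] \\
&= [74.68,\ 91.32].
\end{aligned}
\]

c. We reject H0 if
\[ T < -t_{\alpha/2} \quad \text{or} \quad T > t_{\alpha/2}, \quad \text{with } n - 1 \text{ degrees of freedom.} \tag{3.0.3} \]
Now, from the Student's t distribution table,

\[ t_{\alpha/2,\, n-1} = t_{0.025,\, 10} = 2.228. \tag{3.0.4} \]
Since T = 2.38 > t_{0.025, 10} = 2.228 (equivalently, X̄ = 91.9 lies outside the interval [74.68, 91.32]), we reject the null hypothesis that the mean weight of male calves is equal to 83 kg. That is, at the 5% significance level there is sufficient evidence that the mean weight of male calves differs from 83 kg.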
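The test statistic is easy to recompute from the raw weights with the standard library, which guards against slips in the square root. In this sketch the critical value 2.228 is taken from a t table for 10 degrees of freedom rather than computed, since the standard library has no Student's t quantile function; the variable names are illustrative.

```python
import math
import statistics

weights = [71, 93, 101, 84, 88, 117, 86, 86, 93, 86, 106]  # Table 3.2, in kg
mu_0 = 83            # mean under H0
t_critical = 2.228   # t table value for alpha/2 = 0.025 and 10 degrees of freedom

n = len(weights)
x_bar = statistics.mean(weights)
s = statistics.stdev(weights)          # sample standard deviation, divisor n - 1
standard_error = s / math.sqrt(n)

t_stat = (x_bar - mu_0) / standard_error
reject_h0 = abs(t_stat) > t_critical   # two-sided decision rule

print(f"mean = {x_bar:.2f}, s = {s:.2f}, T = {t_stat:.2f}, reject H0: {reject_h0}")
```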
4. The government has shown interest in, and concern about, food price fluctuations over the past agricultural season. As such, you are called upon to build a partnership and persuade the government to conduct an adequate investigation. Write a concept note to the ministry in charge, stating the problem, the statistical design for the investigation, and possible government intervention(s) in case the research results confirm the issue. (20 marks)
4. Exam on Statistician of Ngoma District/50 Marks
4.1 SECTION ONE (20 marks)

1. Explain the following statistical terms:


(i) A questionnaire (1 mark)
(ii) Coding (1 mark)
(iii) Editing (1 mark)
(iv) Imputation (1 mark)
(v) A parameter (1 mark)
(vi) A statistic (1 mark)

Solution

(i) A questionnaire is a research instrument consisting of a series of questions (or other types of
prompts) for the purpose of gathering information from respondents.
(ii) Coding is an analytic process in which data, in either quantitative form (such as questionnaire results) or qualitative form (such as interview transcripts), are categorized to facilitate analysis.
(iii) Editing is the activity aimed at detecting and correcting errors (logical inconsistencies) in data.
(iv) Imputation is the process of replacing missing data with substituted values.
(v) A parameter is a number that summarizes data for an entire population.
(vi) A statistic is a number that summarizes data for a sample.
2. Prove the following:
(i) $E[Var(X)] = \frac{n-1}{n} \sigma^2$, so that $Var(X)$ is a biased estimator of $\sigma^2$ (4 marks)
(ii) $Var(X) = E(X^2) - [E(X)]^2$ (3 marks)
(iii) Given that X and Y are two independent random variables, prove that cov(X, Y) = 0 (3 marks)
(iv) For a discrete uniform distribution
\[ f(X) = \begin{cases} \frac{1}{n}, & X = 1, 2, \ldots, n \\ 0, & \text{otherwise} \end{cases} \]

solve numerically E(X) and Var(X) for a die with the following distribution (4 marks)


| X    | 1   | 2   | 3   | 4   | 5   | 6   |
| f(X) | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 | 1/6 |

Table 4.1: Die Probability Distribution

Solution

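(i) Biased estimator

Let $Var(X) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2$ denote the sample variance with divisor n. Then

\[
\begin{aligned}
E[Var(X)] &= E\left[ \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \right] \\
&= \frac{1}{n} E\left[ \sum_{i=1}^{n} \left( x_i^2 - 2 x_i \bar{x} + \bar{x}^2 \right) \right] \\
&= \frac{1}{n} E\left[ \sum_{i=1}^{n} x_i^2 - n \bar{x}^2 \right] \\
&= \frac{1}{n} \left[ \sum_{i=1}^{n} E\left( x_i^2 \right) - n E\left( \bar{x}^2 \right) \right] \\
&= \frac{1}{n} \left[ \sum_{i=1}^{n} \left( \sigma^2 + \mu^2 \right) - n \left( \frac{\sigma^2}{n} + \mu^2 \right) \right] \\
&= \frac{1}{n} \left( n\sigma^2 + n\mu^2 - \sigma^2 - n\mu^2 \right) \\
&= \sigma^2 \left( \frac{n-1}{n} \right).
\end{aligned}
\]

Since $E[Var(X)] = \sigma^2 \frac{n-1}{n} \neq \sigma^2$, we conclude that $Var(X)$ is a biased estimator of $\sigma^2$. It would be an unbiased estimator if $E[Var(X)] = \sigma^2$.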
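(ii) $Var(X) = E(X^2) - [E(X)]^2$
\[
\begin{aligned}
Var(X) &= \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2 \\
&= \frac{1}{n} \sum_{i=1}^{n} \left( x_i^2 - 2 x_i \bar{x} + \bar{x}^2 \right) \\
&= \frac{1}{n} \left( \sum_{i=1}^{n} x_i^2 - n \bar{x}^2 \right) \\
&= \frac{1}{n} \sum_{i=1}^{n} x_i^2 - \bar{x}^2.
\end{aligned}
\]
Since each of the n values is equally likely, $\frac{1}{n} \sum_{i=1}^{n} x_i^2 = E(X^2)$ and $\bar{x} = E(X)$, so
\[ Var(X) = E(X^2) - [E(X)]^2. \]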

(iii) cov(X, Y) = 0 for independent X and Y.

We know that $cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$, where $\mu_X = E(X)$ and $\mu_Y = E(Y)$. Then,

\[
\begin{aligned}
cov(X, Y) &= E\left( XY - X\mu_Y - \mu_X Y + \mu_X \mu_Y \right) \\
&= E(XY) - \mu_Y E(X) - \mu_X E(Y) + \mu_X \mu_Y \\
&= E(XY) - E(X)E(Y) - E(X)E(Y) + E(X)E(Y) \\
&= E(XY) - E(X)E(Y) \\
&= 0, \quad \text{since } E(XY) = E(X)E(Y) \text{ for independent } X \text{ and } Y.
\end{aligned}
\]

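(iv) Discrete random variable. The expectation E(X) and variance Var(X) for the distribution in Table (4.1) are given as follows:
\[
\begin{aligned}
E(X) &= \sum_{i=1}^{6} i \times \frac{1}{6} \\
&= 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + 3 \times \frac{1}{6} + 4 \times \frac{1}{6} + 5 \times \frac{1}{6} + 6 \times \frac{1}{6} \\
&= \frac{21}{6} \\
E(X) &= 3.5
\end{aligned}
\]

and
\[
\begin{aligned}
Var(X) &= \sum_{i=1}^{n} \left( X_i - E(X) \right)^2 f(X_i) \\
&= \sum_{i=1}^{6} (i - 3.5)^2 \times \frac{1}{6} \\
&= \frac{6.25 + 2.25 + 0.25 + 0.25 + 2.25 + 6.25}{6} \\
&= \frac{17.5}{6} \\
Var(X) &\approx 2.92.
\end{aligned}
\]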
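The same expectation and variance can be computed exactly with rational arithmetic, which also checks the identity Var(X) = E(X^2) − [E(X)]^2 from part (ii) on this distribution. This is a small sketch using Python's `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

faces = range(1, 7)
prob = Fraction(1, 6)   # each face of a fair die is equally likely

# Expectation and variance straight from their definitions.
expectation = sum(x * prob for x in faces)
variance = sum((x - expectation) ** 2 * prob for x in faces)

# Same variance via E(X^2) - [E(X)]^2.
second_moment = sum(x**2 * prob for x in faces)
variance_shortcut = second_moment - expectation**2

print(expectation, variance, float(variance))
```

Working with `Fraction` avoids rounding, so the two variance computations agree exactly rather than approximately.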

4.2 SECTION TWO (15 marks)

1. In district X, there are 560 cases of child undernutrition. The nearest hospital wants to conduct a survey to investigate the root causes. As the district Statistician, you are required to determine the sample size required and to allocate the obtained sample size proportionally to the size of the sectors. Assume a confidence level of 95%, a level of variation p = 70% and a margin of error e = 5%. (3 marks)

Sectors Sector A Sector B Sector C Sector D Sector E


Number of cases 110 80 210 90 70

Table 4.2: Number of children under nutrition cases



Solution

We use the following formula for computing the sample size, with z = 1.96 for a 95% confidence level, p = 0.7, q = 1 − p = 0.3 and a margin of error e = 5%:
\[ n = \frac{(1.96)^2\, p\, q}{e^2} = \frac{3.8416 \times 0.7 \times 0.3}{0.05^2} = \frac{0.806736}{0.0025} \approx 323. \tag{4.2.1} \]
Allocating the sample size of Equation (4.2.1) proportionally to the number of cases, Table (4.2) becomes:

| Sectors         | Sector A         | Sector B        | Sector C          | Sector D        | Sector E        |
| Number of cases | 110              | 80              | 210               | 90              | 70              |
| Sample size     | 110×323/560 ≈ 63 | 80×323/560 ≈ 46 | 210×323/560 ≈ 121 | 90×323/560 ≈ 52 | 70×323/560 ≈ 40 |

Table 4.3: Number of children under nutrition cases and sample size
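The sample size formula and the proportional allocation can be scripted so that the table is easy to regenerate for other inputs. The sketch below rounds the total sample size up and rounds each sector's share to the nearest integer; these rounding rules are assumptions of the sketch, and because of rounding the sector sizes may not add up exactly to the total.

```python
import math

z = 1.96      # normal quantile for a 95% confidence level
p = 0.70      # assumed level of variation
e = 0.05      # margin of error

# Required sample size, rounded up to a whole respondent.
sample_size = math.ceil(z**2 * p * (1 - p) / e**2)

cases = {"Sector A": 110, "Sector B": 80, "Sector C": 210,
         "Sector D": 90, "Sector E": 70}
total_cases = sum(cases.values())

# Proportional allocation: each sector gets its share of the sample.
allocation = {sector: round(count * sample_size / total_cases)
              for sector, count in cases.items()}

print(sample_size, allocation, sum(allocation.values()))
```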

2. A simple random sample of five men is chosen from a large population of men, and their heights are
measured. The five heights (in inches) are 65.51, 72.30, 68.31, 67.05, and 70.68. (3 marks)
(i) Find the sample mean
(ii) Find the sample variance
(iii) Get the standard deviation

Solution

(i) Sample mean


The sample mean, denoted by X̄, is given by:
\[
\begin{aligned}
\bar{X} &= \frac{1}{n} \sum_{i=1}^{n} x_i \\
&= \frac{1}{5} (65.51 + 72.30 + 68.31 + 67.05 + 70.68) \\
&= \frac{1}{5} (343.85) \\
\bar{X} &= 68.77.
\end{aligned}
\]

(ii) Sample variance


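The sample variance, denoted by S^2, is given by:
\[
\begin{aligned}
S^2 &= \frac{1}{n - 1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2 \\
&= \frac{1}{4} \sum_{i=1}^{5} (X_i - 68.77)^2 \\
&= \frac{1}{4} \left[ (65.51 - 68.77)^2 + (72.30 - 68.77)^2 + (68.31 - 68.77)^2 + (67.05 - 68.77)^2 + (70.68 - 68.77)^2 \right] \\
&= \frac{1}{4} (10.6276 + 12.4609 + 0.2116 + 2.9584 + 3.6481) \\
&= \frac{1}{4} (29.9066) \\
S^2 &\approx 7.48.
\end{aligned}
\]

(iii) Sample standard deviation

The sample standard deviation, denoted by S, is given by:
\[ S = \sqrt{S^2} = \sqrt{7.48} \approx 2.73. \]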
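These three quantities map directly onto functions of Python's `statistics` module, where `variance` and `stdev` use the divisor n − 1 as in the formulas above. A minimal sketch:

```python
import statistics

heights = [65.51, 72.30, 68.31, 67.05, 70.68]   # heights in inches

sample_mean = statistics.mean(heights)
sample_variance = statistics.variance(heights)   # divisor n - 1
sample_sd = statistics.stdev(heights)            # square root of the sample variance

print(f"mean = {sample_mean:.2f}, variance = {sample_variance:.2f}, sd = {sample_sd:.2f}")
```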

3. The district Mayor obtained the mean and standard deviation of 100 observations as 40 and 5.1
respectively. It was later discovered by the Governor that the Mayor had wrongly copied an observation
as 50 instead of 40. As the Statistician of the district calculate the new mean and standard deviation.
3 marks

Solution

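The correct mean can be found by writing the wrong and the correct mean in terms of the 99 unaffected observations:
\[ 40 = \frac{1}{100} \sum_{i=1}^{99} x_i + \frac{50}{100}, \qquad \text{Correct Mean} = \frac{1}{100} \sum_{i=1}^{99} x_i + \frac{40}{100} \iff 40 - \text{Correct Mean} = \frac{50}{100} - \frac{40}{100}. \]

This implies that

\[ -\text{Correct Mean} = \frac{10}{100} - \frac{4000}{100} = -39.9. \]
Thus, the correct mean is 39.9.
For the standard deviation, σ^2 = (1/100) Σ x_i^2 − x̄^2 gives the wrong sum of squares Σ x_i^2 = 100 [(5.1)^2 + 40^2] = 162601. Replacing 50^2 by 40^2 gives 161701, so the correct variance is 161701/100 − (39.9)^2 = 25 and the correct standard deviation is 5.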
4. The primary pupils in the district X were estimated to be 87000 in 2017. Among these pupils, 12%
are in private schools while the rest are in public schools. Of those in private, the female students
occupy 55% while in the public; the percentage of female is 44%. If one wants to make an investigation
on pupils’ performance in that District, what is the sample size needed for both sexes and in both type
of schools? Consider the following to determine the sample size with proportional allocation:
- Level of confidence, CI = 95%
- Sampling error, e = 5%

- Level of variability, q = 40%


- Response rate, Rr = 70% 6 marks

Solution

Let us make the following table, which shows the population by sex and by type of school:

School/Sex Private Public


Female 5742 33686
Male 4698 42874
Total 10440 76560

Table 4.4: Number of population in both sexes and in both type of school
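One possible way to continue from this table is sketched below. It assumes the same sample size formula n = z^2 p q / e^2 as in question 1 with p = 1 − q, inflates the result by dividing by the response rate, and then allocates proportionally to the four strata; the inflation step and the rounding rules are assumptions of this sketch rather than something stated in the exam, and no finite population correction is applied.

```python
import math

z = 1.96               # normal quantile for a 95% confidence level
q = 0.40               # level of variability given in the question
p = 1 - q
e = 0.05               # sampling error
response_rate = 0.70

# Base sample size, then inflated to compensate for expected non-response.
base_size = math.ceil(z**2 * p * q / e**2)
adjusted_size = math.ceil(base_size / response_rate)

# Population by (school type, sex), from Table 4.4.
strata = {("Private", "Female"): 5742, ("Private", "Male"): 4698,
          ("Public", "Female"): 33686, ("Public", "Male"): 42874}
population = sum(strata.values())

# Proportional allocation of the adjusted sample to the strata.
allocation = {key: round(size * adjusted_size / population)
              for key, size in strata.items()}

print(base_size, adjusted_size)
for key, size in allocation.items():
    print(key, size)
```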

4.3 SECTION THREE (15 marks)

1. The table below gives the output for 5 years of an experimental farm that used each of 3 fertilizers. Assume that the outputs with each fertilizer are normally distributed with equal variance.
(i) Find the mean output for each fertilizer and the grand mean for all the years and for
all the 3 fertilizers (2 marks)
(ii) Find the value of SSTR (Sum of Squares of Treatments),SSE (Sum of Squares
of Error) and SST (Sum of Squares Total) (3 marks)
(iii) Get the degree of freedom for SSTR, SSE and SST (3 marks)
(iv) Determine MSTR, MSE, and F-ratio (3 marks)
(v) From the results above, construct the ANOVA table (3 marks)
(vi) Test H0 vs H1 at the 5% level of significance, given that F-table = 3.8 (1 mark)

The 5-years outputs with 3 different fertilizers table are:

F1 F2 F3
17 21 17
18 19 20
17 18 19
16 22 20
17 20 19

Table 4.5: 5-years outputs with 3 different fertilizers



Solution

(i) Means

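\[
\begin{aligned}
\bar{Y}_{F_1} &= \frac{17 + 18 + 17 + 16 + 17}{5} = 17 \\
\bar{Y}_{F_2} &= \frac{21 + 19 + 18 + 22 + 20}{5} = 20 \\
\bar{Y}_{F_3} &= \frac{17 + 20 + 19 + 20 + 19}{5} = 19 \\
\bar{Y}_{F_1, F_2, F_3} &= \frac{17 + 20 + 19}{3} \approx 18.7
\end{aligned}
\]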
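(ii) Find the value of SSTR (sum of squares of treatments), SSE (sum of squares of error) and SST (sum of squares total)

With n = 5 observations per fertilizer, Ȳ_i the mean of fertilizer i, Ȳ the grand mean and Y_ij the j-th observation for fertilizer i:
\[
\begin{aligned}
SSTR &= n \sum_{i=1}^{3} \left( \bar{Y}_i - \bar{Y} \right)^2 = 5 \times (2.89 + 1.69 + 0.09) = 23.35 \\
SSE &= \sum_{i=1}^{3} \sum_{j=1}^{5} \left( Y_{ij} - \bar{Y}_i \right)^2 = 2 + 10 + 6 = 18 \\
SST &= \sum_{i=1}^{3} \sum_{j=1}^{5} \left( Y_{ij} - \bar{Y} \right)^2 = SSTR + SSE = 41.35
\end{aligned}
\]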

(iii) Degree of freedom for SSTR, SSE and SST


Degree of freedom for SSTR is k − 1 = 3 − 1 = 2.
Degree of freedom for SSE is n − k = 15 − 3 = 12.
Degree of freedom for SST is n − 1 = 15 − 1 = 14.
(iv) Determination of MSTR, MSE, and F-ratio

\[
\begin{aligned}
MSTR &= \frac{SSTR}{k - 1} = \frac{23.35}{2} = 11.675 \\
MSE &= \frac{SSE}{n - k} = \frac{18}{12} = 1.5 \\
F\text{-ratio} &= \frac{MSTR}{MSE} = \frac{11.675}{1.5} = 7.78.
\end{aligned}
\]

(v) ANOVA Table

Source of variation SS df MS F
Treatment 23.35 2 11.675 7.78
Error 18 12 1.5
Total 41.35 14

Table 4.6: ANOVA Table for data given in Table (4.5)

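(vi) Test of H0 vs H1 at the 5% level of significance, given F-table = 3.8

Here H0 states that the three fertilizers have equal mean outputs, and H1 states that at least one mean differs. We reject H0, since F_cal = 7.78 > F_table = 3.8. Thus, we conclude that there are differences between the mean yields of the 3 fertilizers.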
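The whole one-way ANOVA computation can be sketched from the raw outputs. The code below keeps the grand mean unrounded (56/3), so its SSTR comes out near 23.33 instead of the value obtained from the rounded grand mean 18.7; the F-ratio is the same to two decimals. The critical value 3.8 is the one supplied in the question, and the variable names are illustrative.

```python
import statistics

outputs = {
    "F1": [17, 18, 17, 16, 17],
    "F2": [21, 19, 18, 22, 20],
    "F3": [17, 20, 19, 20, 19],
}
f_table = 3.8   # critical value supplied in the question

k = len(outputs)                                  # number of treatments
all_values = [y for ys in outputs.values() for y in ys]
n_total = len(all_values)                         # total number of observations
grand_mean = statistics.mean(all_values)
group_means = {name: statistics.mean(ys) for name, ys in outputs.items()}

# Between-treatment and within-treatment sums of squares.
sstr = sum(len(ys) * (group_means[name] - grand_mean) ** 2
           for name, ys in outputs.items())
sse = sum((y - group_means[name]) ** 2
          for name, ys in outputs.items() for y in ys)
sst = sum((y - grand_mean) ** 2 for y in all_values)

mstr = sstr / (k - 1)
mse = sse / (n_total - k)
f_ratio = mstr / mse
reject_h0 = f_ratio > f_table

print(f"SSTR={sstr:.2f} SSE={sse:.2f} SST={sst:.2f} F={f_ratio:.2f} reject={reject_h0}")
```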
5. Exam for Data Management Officer in Ngoma District/50 Marks
5.1 SECTION ONE (20 Marks)

1. Explain the following terms: (5 marks)


(i) Statistics (1 Mark)
(ii) A sample (1 Mark)
(iii) A population (1 Mark)
(iv) Data Management System (1 Mark)
(v) Metadata (1 Mark)

Solution

(i) Statistics consists of a body of methods for collecting and analyzing data.
(ii) A sample from statistical population is the set of measurements that are actually collected in the
course of an investigation.
(iii) A (statistical) population is the set of measurements (or record of some qualitative trait) corre-
sponding to the entire collection of units for which inferences are to be made.
(iv) A database management system (DBMS) is system software for creating and managing
databases. It provides users and programmers with a systematic way to create, retrieve, update and
manage data.
(v) Metadata is defined as the data providing information about one or more aspects of the data; it
is used to summarize basic information about data which can make tracking and working with specific
data easier.
2. Write in full the following terms commonly used in reporting official statistics: (5 Marks)
NER, GER, ASFR, MDG's, SDG's, CBR, CDR, TFR, CPI, DHS, EICV, EDPRS, GBV, PPI, RPHC, GDP, MTN, MUSA, NGO's, NISR, RSSB, WASAC, VUP, NEC.

Solution


We have the following full names:

NER = Net Enrolment Rate = (Number of pupils in primary school aged 6-12) / (Number of country total population aged 6-12 years)
GER = Gross Enrolment Rate = (Number of pupils in primary school, regardless of age) / (Number of country total population aged 6-12 years)
ASFR = Age-Specific Fertility Rate = (Births occurring to women in the age group x to x + n) / (Number of females in the age group x to x + n) × 1000
MDG's = Millennium Development Goals
SDG's = Sustainable Development Goals
CBR = Crude Birth Rate = (Total births for a given area and year) / (Total mid-year population of the area) × 1000
CDR = Crude Death Rate = (Total deaths for a given area and year) / (Total mid-year population of the area) × 1000
TFR = Total Fertility Rate
CPI = Consumer Price Index
DHS = Demographic and Health Survey
EICV = Enquête Intégrale sur les Conditions de Vie (Integrated Household Living Conditions Survey)
EDPRS = Economic Development and Poverty Reduction Strategy
GBV = Gender Based Violence
PPI = Producer Price Index
RPHC = Rwanda Population and Housing Census
GDP = Gross Domestic Product
MTN = Mobile Telecommunication Network
MUSA = Mutuelle de Santé
NGO's = Non-Governmental Organization(s)
NISR = National Institute of Statistics of Rwanda
RSSB = Rwanda Social Security Board
WASAC = Water and Sanitation Corporation Limited
VUP = Vision Umurenge Program
NEC = National Electoral Commission

3. The 2002 Rwanda population census indicated a total population of 8,128,553 while Fourth (2012)
Rwanda Population and Housing Census (RPHC4) established that the population of Rwanda was
10,515,973 residents, of which 5,064,868 were males. Among the total population, 26% never attended
schools, 58% completed primary education, 14% finished their secondary education while 2% have the
university degrees. Work on the following: (5 MARKS)
(i) Determine the Rwanda population growth rate (1 Mark)
(ii) Estimate the population of Rwanda for the new vision 2050 (1 Mark)
(iii) Compute the mean year of schooling (1 Mark)
(iv) Calculate the population density for 2002 and 2012 (1 Mark)
(v) Find the male-female ratio (1 Mark)

Solution

(i) The population growth rate r is determined by solving the following equation:

\[ P_t = P_0 e^{rt}, \tag{5.1.1} \]

where;

P_t is the total population at time t,
P_0 is the total population at the initial time,
e is the base of the natural logarithm,
r is the population growth rate,
t is the time interval.
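Solving Equation (5.1.1) for r gives r = ln(P_t / P_0) / t, which can be evaluated with the two census totals. The sketch below covers parts (i), (ii) and (v) only; it assumes exactly 10 years between the two censuses and a constant exponential growth rate up to 2050, and it expresses the male-female ratio as males per 100 females. Parts (iii) and (iv) need inputs that the question does not state (years of schooling per level and the country's area), so they are left out.

```python
import math

pop_2002 = 8_128_553
pop_2012 = 10_515_973
males_2012 = 5_064_868

# (i) Annual exponential growth rate between the two censuses.
years_between = 2012 - 2002
growth_rate = math.log(pop_2012 / pop_2002) / years_between

# (ii) Projection to 2050 under the same constant growth rate.
pop_2050 = pop_2012 * math.exp(growth_rate * (2050 - 2012))

# (v) Male-female ratio, expressed as males per 100 females.
females_2012 = pop_2012 - males_2012
sex_ratio = 100 * males_2012 / females_2012

print(f"growth rate = {100 * growth_rate:.2f}% per year")
print(f"projected 2050 population = {pop_2050:,.0f}")
print(f"sex ratio = {sex_ratio:.1f} males per 100 females")
```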

4. A survey conducted in district Y in 2016 revealed a total population of 389,000 (196,000 males and 192,000 females). The total number of children born alive during the previous year was 16,400. The survey also recorded a total of 5,835 deaths (3,200 males and 2,635 females) during the same year, and identified the following deaths among the population: (5 MARKS)

• children < 1 month = 370

• Children 1 month - 11 months = 1100

• Children 1 year - 4 years = 1865

• Death of mothers during pregnancy = 130



Based on the above data provided, calculate the following measures of fertility and mortality:
(i) Crude birth rate (CBR) (1 Mark)
(ii) Crude death rate (CDR) (1 Mark)
(iii) Infant mortality rate (IMR) (1 Mark)
(iv) Child Mortality (CMR) (1 Mark)
(v) Maternal mortality rate (1 Mark)

Solution

(i) Crude birth rate

It is defined as the number of live births per 1,000 persons in a population in a year. That is:
\[ \text{Crude birth rate} = \frac{16{,}400}{389{,}000} \times 1000 \approx 42 \text{ per } 1000. \]
(ii) Crude death rate
This is the number of deaths per 1,000 persons in a population in a given year. That is:
\[ \text{Crude death rate} = \frac{5{,}835}{389{,}000} \times 1000 = 15 \text{ per } 1000. \]

(iii) Infant mortality rate

This is defined as the probability (expressed as a rate per 1,000 live births) of a child born alive in a specified period dying before reaching the age of one. That is,
\[ \text{Infant mortality rate} = \frac{370 + 1{,}100}{16{,}400} \times 1000 = \frac{1{,}470}{16{,}400} \times 1000 \approx 90 \text{ per } 1000. \]
(iv) Child mortality rate
This is the number of deaths of children aged one year and above but below 5 years of age per 1,000 live births. That is,
\[ \text{Child mortality rate} = \frac{1{,}865}{16{,}400} \times 1000 \approx 114 \text{ per } 1000. \]
(v) Maternal mortality rate
This is the ratio of the number of maternal deaths during a given time period per 100,000 live births during the same time period. That is,
\[ \text{Maternal mortality rate} = \frac{130}{16{,}400} \times 100{,}000 \approx 793 \text{ per } 100{,}000. \]
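The five rates follow the same pattern (events divided by a base, times a scaling constant), so they are convenient to compute together. A minimal sketch with the survey figures; the helper name `rate` is illustrative.

```python
population = 389_000
live_births = 16_400
total_deaths = 5_835
deaths_under_1_month = 370
deaths_1_to_11_months = 1_100
deaths_1_to_4_years = 1_865
maternal_deaths = 130

def rate(events, base, per=1000):
    """Number of events per `per` units of the base population."""
    return per * events / base

cbr = rate(live_births, population)                 # crude birth rate
cdr = rate(total_deaths, population)                # crude death rate
imr = rate(deaths_under_1_month + deaths_1_to_11_months, live_births)
cmr = rate(deaths_1_to_4_years, live_births)        # child mortality rate
mmr = rate(maternal_deaths, live_births, per=100_000)

print(f"CBR={cbr:.1f} CDR={cdr:.1f} IMR={imr:.1f} CMR={cmr:.1f} MMR={mmr:.1f}")
```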

5.2 SECTION TWO (15 Marks)

1. The table below indicates the distribution of the population across UBUDEHE categories in 2010. With the following data, construct a pie chart. 6 Marks

| Category I | Category II | Category III | Category IV | Category V | Category VI |
| 3,504      | 2,112       | 1,776        | 1,200       | 672        | 376         |

Table 5.1: UBUDEHE Categories
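A pie chart is drawn from each category's share of the total, converted to a sector angle out of 360 degrees. The sketch below computes those shares and angles from the table; drawing the chart itself (by hand or with a plotting tool) is left out, and the variable names are illustrative.

```python
counts = {
    "Category I": 3504, "Category II": 2112, "Category III": 1776,
    "Category IV": 1200, "Category V": 672, "Category VI": 376,
}
total = sum(counts.values())

# Percentage share and pie-sector angle (in degrees) for each category.
shares = {name: 100 * value / total for name, value in counts.items()}
angles = {name: 360 * value / total for name, value in counts.items()}

for name in counts:
    print(f"{name}: {shares[name]:.1f}% -> {angles[name]:.1f} degrees")
```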

2. The following data show the height in millimeters for 106 maize plants after 2 weeks. (9 MARKS)
129 148 139 141 150 148 138 141 140 146 153 141 148 138
145 141 141 142 141 141 143 140 138 138 145 141 142 131
142 141 140 143 144 135 134 139 148 137 146 121 148 136
141 140 147 146 144 142 136 137 140 143 148 140 136 146
143 143 145 142 138 148 143 144 139 141 143 137 144 133
146 143 158 149 136 148 134 138 145 144 139 138 143 141
145 141 139 140 140 142 133 139 149 139 142 145 132 146
140 140 140 132 145 145 142 149

With the procedure for constructing a grouped frequency distribution, determine the following:
(i) The range (1 Mark)
(ii) Class width (1 Mark)
(iii) Lower limit of the first class of distribution (1 Mark)
(iv) Upper limit of the first class of distribution (1 Mark)
(v) Upper limit of the high value of distribution (1 Mark)
(vi) Show the table of the completed distribution (4 Marks)

5.3 SECTION THREE (15 Marks)

1. Given the following population distribution and assuming the population growth rate in the regions is
3% throughout; as the Data Management Officer you are required to provide the population projections
by filling in the following tables: (5 MARKS)

Regions Pop in 2017 Pop in 2027 Pop in 2030 Pop in 2035 Pop in 2050
A 6,252
B 12,556
C 6,745
D 9,876
E 10,569

Table 5.2: Population projections



Solution

Here, we use the following formula:

\[ P_t = P_0 e^{rt}, \tag{5.3.1} \]
where

$P_t$ is the population projected at time $t$,
$P_0$ is the initial population,
$r$ is the growth rate,
$t$ is time.
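
As a sketch of how formula (5.3.1) could be applied to fill in Table 5.2, the script below assumes that the 3% growth rate is annual, that 2017 is the base year (t = 0), and that t is measured in years. The names `project` and `projections` are illustrative, not from the original.

```python
import math

GROWTH_RATE = 0.03          # 3% per year, as stated in the question
BASE_YEAR = 2017            # assumption: the 2017 column is t = 0
TARGET_YEARS = [2027, 2030, 2035, 2050]
BASE_POPULATION = {"A": 6252, "B": 12556, "C": 6745, "D": 9876, "E": 10569}


def project(p0, r, t):
    """Exponential projection P_t = P_0 * exp(r * t)."""
    return p0 * math.exp(r * t)


# Projected population of every region for every target year, rounded to whole persons.
projections = {
    region: {year: round(project(p0, GROWTH_RATE, year - BASE_YEAR)) for year in TARGET_YEARS}
    for region, p0 in BASE_POPULATION.items()
}

for region, by_year in projections.items():
    print(region, by_year)
```

For example, region A in 2027 corresponds to t = 10, so the projection is 6,252 × e^{0.3}, which is about 8,439.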

2. Find the dependency ratio based on the data given in the following table: (2 MARKS)

Age Group Populations


0-14 4,000
15- 24 11,000
25-44 6,000
45- 64 2,000
65 and over 12,000

Table 5.3: Age Categories and populations

Solution

The dependency ratio is the number of dependent people (those aged 0-14 and those aged 65 and over)
divided by the number of people aged 15-64, expressed as a percentage. That is,
\[
\text{Dependency Ratio} = \frac{\text{Number of people aged 0-14 and those aged 65 and over}}{\text{Number of people aged 15-64}} \times 100
= \frac{4{,}000 + 12{,}000}{11{,}000 + 6{,}000 + 2{,}000} \times 100
= \frac{16{,}000}{19{,}000} \times 100
\]
Dependency Ratio ≈ 84%
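
The same calculation can be sketched in a few lines, using the age groups of Table 5.3. The variable names are illustrative only.

```python
# Populations by age group, copied from Table 5.3.
population_by_age = {
    "0-14": 4_000,
    "15-24": 11_000,
    "25-44": 6_000,
    "45-64": 2_000,
    "65 and over": 12_000,
}

# Age groups counted as dependents in the definition above.
DEPENDENT_GROUPS = {"0-14", "65 and over"}

dependents = sum(n for group, n in population_by_age.items() if group in DEPENDENT_GROUPS)
working_age = sum(n for group, n in population_by_age.items() if group not in DEPENDENT_GROUPS)

dependency_ratio = dependents / working_age * 100
print(f"Dependency ratio: {dependency_ratio:.1f}%")
```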

3. A Data Management Officer wants to determine the relationship between the ages of four individuals
and their respective weights, given in the table below: (5 MARKS)

Age (X) Weight (Y )


80 years 75 Kg
50 years 65 Kg
60 years 65 Kg
85 years 65 Kg

Table 5.4: Age and Weight for 4 individuals



Compute the following statistics:


(i) $Var(X)$ 1 mark
(ii) $Var(Y)$ 1 mark
(iii) $Cov(X, Y)$ 1 mark
(iv) Coefficient of correlation $r$ 2 marks
(v) Determination coefficient $r^2$ 1 mark
(vi) The equation of the regression line 1 mark
(vii) Graph the equation on the X, Y axes 1 mark

Solution

In order to answer the questions about the data in Table (5.4), we need the following table:

Age (X)     Weight (Y)   Xi − X̄    (Xi − X̄)²   Yi − Ȳ   (Yi − Ȳ)²   (Xi − X̄)(Yi − Ȳ)
80          75           11.25     126.5625    7.5      56.25       84.375
50          65           -18.75    351.5625    -2.5     6.25        46.875
60          65           -8.75     76.5625     -2.5     6.25        21.875
85          65           16.25     264.0625    -2.5     6.25        -40.625
X̄ = 68.75   Ȳ = 67.5     0         818.75      0        75          112.5

Table 5.5: Relationship between X and Y process

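Based on Table (5.5), with $n = 4$, we have

(i) $Var(X) = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2 = \frac{818.75}{3} = 272.92$
(ii) $Var(Y) = \frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2 = \frac{75}{3} = 25$
(iii) $Cov(X, Y) = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right) = \frac{112.5}{3} = 37.5$
(iv) $r = \frac{Cov(X, Y)}{\sigma_x \sigma_y} = \frac{112.5}{\sqrt{818.75}\,\sqrt{75}} \approx 0.454 \approx 0.45$ (the factors $\frac{1}{n-1}$ cancel)

(v) $r^2 \approx 0.454^2 \approx 0.21$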


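(vi) The equation of the regression line is given as follows:

\[
\begin{aligned}
\tilde{Y} &= b + a\tilde{X} \\
&= \left(\bar{Y} - \frac{Cov(X, Y)}{Var(X)}\,\bar{X}\right) + \frac{Cov(X, Y)}{Var(X)}\,\tilde{X} \\
&= \left(67.5 - \frac{37.5}{272.92} \times 68.75\right) + \frac{37.5}{272.92}\,\tilde{X} \\
&= \left(67.5 - 0.1374 \times 68.75\right) + 0.1374\,\tilde{X} \\
\tilde{Y} &= 58.05 + 0.14\,\tilde{X}.
\end{aligned}
\]
The slope is positive because the covariance is positive.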
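
The statistics in parts (i) to (vi) can be checked numerically. The following is a minimal sketch using only the four data points of Table 5.4 and the sample (n − 1) formulas used above. All variable names are illustrative.

```python
from math import sqrt

# Data from Table 5.4.
ages = [80, 50, 60, 85]      # X, in years
weights = [75, 65, 65, 65]   # Y, in kg

n = len(ages)
mean_x = sum(ages) / n
mean_y = sum(weights) / n

# Sums of squares and cross-products of the deviations from the means.
sxx = sum((x - mean_x) ** 2 for x in ages)
syy = sum((y - mean_y) ** 2 for y in weights)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(ages, weights))

# Sample variance and covariance (divisor n - 1).
var_x = sxx / (n - 1)
var_y = syy / (n - 1)
cov_xy = sxy / (n - 1)

# Correlation coefficient and coefficient of determination.
r = cov_xy / sqrt(var_x * var_y)
r_squared = r ** 2

# Least-squares regression line: Y = intercept + slope * X.
slope = cov_xy / var_x
intercept = mean_y - slope * mean_x

print(f"Var(X) = {var_x:.2f}, Var(Y) = {var_y:.2f}, Cov(X, Y) = {cov_xy:.2f}")
print(f"r = {r:.3f}, r^2 = {r_squared:.3f}")
print(f"Y = {intercept:.2f} + {slope:.4f} X")
```

Because the covariance is positive, the fitted slope is positive: weight tends to increase slightly with age in this small sample.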
6. Exam of Local Development Data
Management Specialist LODA/50 Marks
6.1 SECTION ONE (20 Marks)

1. Explain the following terms: (5 marks)


(i) Statistics (1 Mark)
(ii) A sample (1 Mark)
(iii) A population (1 Mark)
(iv) Data Management System (1 Mark)
(v) Metadata (1 Mark)
2. Write in full the following terms mostly used in reporting the official statistics: (5 MARKS)
SDG’s, EICV, EDPRS, RPHC, VUP
3. Main activities of Local Development Data Management Specialist
4. What is the focus of LODA?
5. Logical framework of Local development and Local economic development.

6.2 SECTION TWO (20 Marks)

1. Given the populations at t = 0 (in 1978), t = 24 (in 2002), and t = 34 (in 2012):


(i) Find carrying capacity K.
(ii) Based on results in (i), what can you say about the following statement ””.
2. Given the birth rate α and death rate µ. Let r = α − µ. Determine the deterministic model that
combines birth and death rate.
3. Determine the population projections using model found in 2. above and complete the following
table

Year Population
2012
2020
2030
2040
2050
2060


6.3 SECTION THREE (10 Marks)

1. Given a confidence level of 95%, p = 70% and the population of the given region, find
the sample size.
2. Find the life expectancy. Suppose that we have 21 age intervals with start and end points for
each interval:

\[ x_0 = 0,\ x_1 = 1,\ x_2 = 5,\ x_3 = 10,\ \ldots,\ x_{21} = 100. \]

Let $M$ be a $21 \times 1$ vector of mortality probabilities, where $M_i$ denotes the probability of dying
between ages $x_{i-1}$ and $x_i$, for $i = 1, \ldots, 21$.
We find the life expectancy at birth (LE) by using the following equations:
\[ LE = \sum_{i=1}^{21} L_i, \tag{6.3.1} \]
where
\[ L_i = (x_i - x_{i-1})\,p_i + a_i d_i, \tag{6.3.2} \]
with
\[ p_i = \prod_{j=1}^{i} (1 - M_j), \quad p_0 = 1, \qquad a_i = \frac{x_i - x_{i-1}}{2}, \qquad d_i = p_{i-1} M_i. \]

$p_i$ is the proportion of the total population that survives interval $i$ and lives on to interval $i + 1$.

$a_i$ is the average number of years lived in an interval by an individual who passes away in that same
interval.
$d_i$ is the proportion of the total population that dies in the interval $(x_{i-1}, x_i)$.
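
Equations (6.3.1) and (6.3.2) translate directly into a loop over the age intervals. The sketch below assumes the age grid 0, 1, 5, 10, ..., 100 described above. The function name `life_expectancy` and the mortality vector used in the example are hypothetical, since the source gives no mortality data.

```python
def life_expectancy(ages, mortality):
    """Life expectancy at birth from interval end points and death probabilities.

    ages      -- [x_0, x_1, ..., x_k], the interval end points
    mortality -- [M_1, ..., M_k], probability of dying in (x_{i-1}, x_i)
    """
    if len(ages) != len(mortality) + 1:
        raise ValueError("need exactly one mortality probability per interval")

    survivors = 1.0  # p_{i-1}; starts at p_0 = 1
    total = 0.0
    for i, m in enumerate(mortality, start=1):
        width = ages[i] - ages[i - 1]
        d = survivors * m          # d_i: share of the population dying in interval i
        p = survivors * (1 - m)    # p_i: share surviving to the end of interval i
        a = width / 2              # a_i: average years lived by those who die
        total += width * p + a * d # L_i
        survivors = p
    return total


# Age grid from the text: 0, 1, 5, 10, ..., 100 (22 end points, 21 intervals).
AGES = [0, 1] + list(range(5, 101, 5))

# Hypothetical example: nobody dies before age 95, everybody dies in 95-100.
example_mortality = [0.0] * 20 + [1.0]
print(life_expectancy(AGES, example_mortality))
```

In this hypothetical example everyone lives 95 full years and then, on average, half of the last five-year interval, so the result is 97.5 years.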
7. Exam for Data Management Officer at
NGOMA District/50 Marks
Q1. The following sample data set lists the number of minutes 50 internet subscribers spent on the
internet during their most recent session:

50 40 41 17 11 7 22 44 28 21 19 23 37 51 54 42 88 41 78 56 72 56 17 7 69 30
80 56 29 33 46 31 39 20 18 29 34 59 73 77 36 39 30 62 54 67 39 31 53 44

(a) Construct a frequency distribution with seven classes for the data set, including the relative
frequency and cumulative frequency of each class (10 marks)
(b) Compute from data set above:
- The mean (5 marks)
- The sample variance (10 marks)
- The standard deviation (5 marks)
Q2. You will be in charge of data management at sector level. Name the sources of data that
you know which are used by decision makers and for planning. (10 marks)
Q3. Explain the difference between routine data and non-routine data and describe the sources of
each type (10 marks)

Solution

Routine data sources provide data that are collected on a continuous basis, for example information that
clinics collect on the patients utilizing their services. Although those data are collected continuously,
processing them and reporting on them usually occur only periodically; for instance, they may be
aggregated monthly and reported quarterly.

• Data from routine sources is useful because it can provide information on a timely basis. For
instance, it can be used effectively to detect and correct problems in service delivery.
• However, it can be difficult to obtain accurate estimates of catchment areas or target populations
through this method, and the quality of the data may be poor because of inaccurate record keeping
or incomplete reporting.

Non-routine data sources provide data that are collected on a periodic basis, usually annually or less
frequently.

• Using non-routine data avoids the problem of incorrectly estimating the target population when
calculating coverage indicators. Another advantage is that both those using and those not using
health facilities are included in the data.
• Non-routine data have two main limitations: collecting them is often expensive, and this collection
is done on an irregular basis. In order to make informed program decisions, program managers
usually need to receive data at more frequent intervals than non-routine data can accommodate.

8. Exam for Statistician at MINIYOUTH /50
Marks
Q1. Differentiate between: 10 marks
(a) Descriptive and Inferential statistics
(b) Sample and population
(c) Statistics and parameters
(d) Null hypothesis and research hypothesis
(e) Type I error and Type II error.
Q2. Linear regression 15 marks
Q3. Sample size 15 marks
Q4 Chi square; test whether people who drink tend to smoke. 10 marks

                Smoking
Drinking     Yes     No     Total
Yes          57      28     85
No           45      20     65
Total        102     48     150

Table 8.1: Observed Events Frequencies

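Now, we use the following formula to find the Expected Events Frequencies:
\[ f_E = N\,(P_C P_R) = \frac{f_C \times f_R}{N}, \tag{8.0.1} \]
where $f_C$ and $f_R$ are the column and row totals of a cell and $N$ is the grand total.
By using the above equation (8.0.1), we obtain the following table: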

                Smoking
Drinking     Yes     No     Total
Yes          57.8    27.2   85
No           44.2    20.8   65
Total        102     48     150

Table 8.2: Expected Events Frequencies

The Chi-square statistic is calculated as below:

\[ \chi^2 = \sum \frac{(f_O - f_E)^2}{f_E} \tag{8.0.2} \]
Using the data from Tables 8.1 and 8.2 in equation (8.0.2), we obtain
\[
\chi^2 = \frac{(57 - 57.8)^2}{57.8} + \frac{(28 - 27.2)^2}{27.2} + \frac{(45 - 44.2)^2}{44.2} + \frac{(20 - 20.8)^2}{20.8}
= 0.0111 + 0.0235 + 0.0145 + 0.0308 \approx 0.080.
\]

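Degrees of freedom = (number of rows − 1) × (number of columns − 1) = (2 − 1) × (2 − 1) = 1.
At the 5% significance level the critical Chi-square value for 1 degree of freedom is 3.84. The computed
statistic is far below this value, so we fail to reject the null hypothesis that drinking and smoking are
independent.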
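
A short script can recompute the expected frequencies and the test statistic from the observed counts in Table 8.1. This is a sketch in plain Python: each term of the sum is $(f_O - f_E)^2 / f_E$, and the critical value 3.84 (5% level, 1 degree of freedom) is hard-coded rather than looked up. Variable names are illustrative.

```python
# Observed counts from Table 8.1: rows = drinking (Yes, No), columns = smoking (Yes, No).
observed = [
    [57, 28],
    [45, 20],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected frequency of each cell: (row total x column total) / grand total.
expected = [[rt * ct / n for ct in col_totals] for rt in row_totals]

# Chi-square statistic: sum over cells of (observed - expected)^2 / expected.
chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

dof = (len(observed) - 1) * (len(observed[0]) - 1)
CRITICAL_5_PERCENT_DF1 = 3.84  # tabulated value for alpha = 0.05 and 1 degree of freedom

print(f"chi-square = {chi2:.4f}, degrees of freedom = {dof}")
print("reject H0" if chi2 > CRITICAL_5_PERCENT_DF1 else "fail to reject H0")
```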


9. Exam for Statistician at RSB /50 Marks
Q1. The Rwanda Standards Board (RSB) is a public national body established by the Government of
Rwanda, whose mandate is to develop and publish national standards, carry out research in the areas of
standardization, and disseminate information on standards, technical regulations related to standards
and conformity assessment, and metrology for the setting up of measurement standards, among others. The
management of RSB has recommended a survey as the most appropriate way to assess the standards uptake
rate in Rwanda. As a statistician, describe the process of conducting that survey. 25 marks
Q2. Since 2008, the Rwanda Standards Board has carried out consumer protection activities through
verification of fuel dispensing pumps throughout the country, followed by the verification of weighing
instruments in market places and the removal of inappropriate measuring instruments that are found in
use on the market. The following data are the weights (in kg) of Irish potatoes given to consumers as
1 kg: 10 marks

0.84 0.77 0.67 0.94 0.90 0.93 0.81 0.67 0.89 0.77 0.88 0.74 0.93
0.76 0.78 0.80 0.88 0.66 0.77 0.89 0.81 0.78 0.77 0.72 0.94 0.72

Table 9.1: Certified products in local markets

Calculate and give the meaning of the mean, mode, median and standard deviation in this distribution.
Q3. A consumer agency surveyed all 250 local markets to establish the number of certified products that
are traded in them. The following table shows the frequency distribution of the data collected by the agency.

Number of S-mark certified products 5 11 23 26 41


Number of markets 12 97 73 41 27

Table 9.2: Certified products in local markets

(a) Construct a probability distribution table for the numbers of S-mark certified products traded
in these markets. 5 marks
(b) Find the probability that a market selected at random trades more than 11 S-mark certified
products. 5 marks
(c) As a statistician, what do you think should be in the substantive report of this survey? 5
marks

10. Exam for Lecturer in Mathematics at Gishari
Q1. Is the following relation a function from A to B, given a set of pairs and sets of numbers A and B?

Solution

By definition, a function is a rule which relates the values of one variable quantity to the values of
another variable quantity, and does so in such a way that the value of the second variable quantity is
uniquely determined by the value of the first variable quantity. Note that each value taken by the second
variable quantity corresponds to at least one value of the first variable quantity. The set of values of
the first variable quantity is called the domain of definition of the function, and the set of values
taken by the second variable quantity is called the range of the function.
Q2. Venn diagram
Q3. A farmer is taking her eggs to the market in a cart, but she hits a pothole, which knocks over all
the containers of eggs. Though she is unhurt, every egg is broken. So she goes to her insurance agent,
who asks her how many eggs she had. She says she doesn't know, but she remembers some things from
various ways she tried packing the eggs.
When she put the eggs in groups of two, three, four, five, and six there was one egg left over, but when
she put them in groups of seven they ended up in complete groups with no eggs left over.
What can the farmer figure out from this information about the smallest number of eggs she could have
had?

Solution

Let N be the number of eggs she could have had. The question says that when the eggs are taken out
in groups of 2, 3, 4, 5 or 6, there is always one egg left over. This implies that the number of
eggs is one more than a common multiple of 2, 3, 4, 5 and 6. The least common multiple is

\[ \operatorname{LCM}(2, 3, 4, 5, 6) = 4 \times 3 \times 5 = 60. \]

Since one egg is left over, the number of eggs is of the form

\[ N = 60m + 1, \quad \text{where } m \text{ is a natural number},\ m \neq 0. \]

Now, we know that N is divisible by 7. So, we split 60m + 1 into a multiple of 7 plus a remainder:

\[ N = 7(8m) + 4m + 1, \]
where 4m + 1 is the remainder part.
We have to find the values of m for which 4m + 1 is divisible by 7. Therefore, m = 5, 12, 19, . . .
Substituting the above values of m, we get

\[ N = 301,\ 721,\ 1141,\ \ldots \]

Thus, the smallest number of eggs she could have had is 301.
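
The result can be confirmed by a brute-force search over candidate egg counts. This is only a sketch: the function name and the search limit of 10,000 are arbitrary illustrative choices.

```python
def smallest_egg_count(limit=10_000):
    """Smallest n <= limit leaving remainder 1 modulo 2..6 and divisible by 7."""
    for n in range(1, limit + 1):
        leaves_one = all(n % k == 1 for k in (2, 3, 4, 5, 6))
        if leaves_one and n % 7 == 0:
            return n
    return None  # nothing found within the search limit


smallest = smallest_egg_count()
print(smallest)
```

The search agrees with the algebraic argument: the first candidate of the form 60m + 1 that is a multiple of 7 is 301 = 7 × 43.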


Q4. A furniture manufacturer has 6 units of wood and 28 hours of free time in which he will produce
decorative screens. Two models have sold well in the past, so he will restrict himself to those two. He
estimates that model A requires 2 units of wood and 7 hours of time, while model B requires 1 unit of
wood and 8 hours of time. The prices of the models are $120 and $80 respectively. Model this problem
as a linear programming problem if the furniture maker wants to maximize his sales revenue and solve
it graphically.

Solution

Decision variables are:

$x_1$: the number of model A to be produced
$x_2$: the number of model B to be produced
Objective function
\[ \text{Max } Z = 120x_1 + 80x_2 \]
Constraints
Subject to:

\[
\begin{aligned}
2x_1 + x_2 &\le 6 \\
7x_1 + 8x_2 &\le 28 \\
x_1, x_2 &\ge 0
\end{aligned}
\]

To solve this problem, we first of all draw the two lines below, using their intercepts with the axes:

$2x_1 + x_2 = 6$ passes through $(x_1, x_2) = (0, 6)$ and $(3, 0)$
$7x_1 + 8x_2 = 28$ passes through $(x_1, x_2) = (4, 0)$ and $(0, 3.5)$

Figure 10.1: Graph of lines 2x1 + x2 = 6 and 7x1 + 8x2 = 28

From Figure 10.1, we can see that the feasible region has 4 corner points. We evaluate Z at each of them
to find the point which gives the maximum Z.

• For the point (0, 0), Z = 0
• For the point (3, 0), Z = $360
• For the point (0, 3.5), Z = $280
• For the point of intersection of the two lines:
$x_2 = 6 - 2x_1 \Rightarrow 7x_1 + 48 - 16x_1 = 28 \Rightarrow x_1 = \frac{20}{9}$.
Then, $x_2 = 6 - 2x_1 = 6 - \frac{40}{9} = \frac{14}{9}$.
Therefore, $Z = 120\left(\frac{20}{9}\right) + 80\left(\frac{14}{9}\right) = \frac{2400 + 1120}{9} = \frac{3520}{9} \approx 391.1$ dollars.
Thus, the maximum revenue the furniture maker can make is

\[ Z \approx \$391.1 \tag{10.0.1} \]
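
The corner-point evaluation can be sketched with exact fractions, so that the intersection point (20/9, 14/9) is not affected by rounding. The helper names `feasible` and `revenue` are illustrative, and the candidate points are the four corners identified in the solution.

```python
from fractions import Fraction as F


def feasible(x1, x2):
    """Check the wood, time and non-negativity constraints."""
    return x1 >= 0 and x2 >= 0 and 2 * x1 + x2 <= 6 and 7 * x1 + 8 * x2 <= 28


def revenue(x1, x2):
    """Objective function Z = 120*x1 + 80*x2."""
    return 120 * x1 + 80 * x2


# Corner points of the feasible region, as found in the solution.
candidates = [
    (F(0), F(0)),
    (F(3), F(0)),
    (F(0), F(7, 2)),
    (F(20, 9), F(14, 9)),  # intersection of the two constraint lines
]

corners = [point for point in candidates if feasible(*point)]
best_point = max(corners, key=lambda point: revenue(*point))
best_value = revenue(*best_point)

print(best_point, best_value, float(best_value))
```

The maximum is attained at the intersection point, with Z = 3520/9. Note that this is the solution of the continuous problem: producing 20/9 and 14/9 screens is only meaningful if fractional quantities are allowed.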

Q5. Find coefficient of correlation using the given expanded formula.


11. Exam for Assistant Lecturer in Mathematics
at IPRC Tumba
Question One /25 marks
a) Given $\vec{A} = \vec{i} + 2\vec{j} + 3\vec{k}$ and $\vec{B} = -2\vec{i} + \vec{j} + \sqrt{\frac{35}{2}}\,\vec{k}$

i) Find the angle between vectors $\vec{A}$ and $\vec{B}$ /5 marks
ii) Find the area generated by the above vectors /5 marks
iii) Find the unit vector perpendicular to both vectors $\vec{A}$ and $\vec{B}$ /5 marks
b) Consider the quadratic equation $(\tan^2 t)\,x^2 + (\tan t)\,x + 1 = 0$, where $t \neq \frac{n\pi}{2}$

i Find the complex numbers $x_1$ and $x_2$ that are solutions of the given equation. /5 marks
ii For each natural number $n$, show that we have
$x_1^n + x_2^n = 2\cos\frac{2n\pi}{3}\,\cot^n t$ (Hint: Apply the polar form of $x_1$ and $x_2$ in principal argument) /5 marks
Question Two /25 marks
a) Let $u_n = (-1)^n$, $v_n = \frac{(-1)^n}{n}$; find
i. $\lim_{n\to\infty} u_n$ /2.5 marks
ii. $\lim_{n\to\infty} v_n$ /2.5 marks
b) Calculate:
i. $\sum_{k=1}^{n} \frac{1}{5^k}$ /5 marks
ii. $\sum_{k=1}^{\infty} \frac{1}{5^k}$ /5 marks
c) Show that the following differential equation is exact and solve it. 10 marks

(1 − sin x tan y) dx + cos x sec2 y dy = 0




Question Three /25 marks


a) Let R be the following equivalence relation on the set A = {1, 2, 3, 4, 5, 6}:
R = {(1, 1) , (1, 5) , (2, 2) , (2, 3) , (2, 6) , (3, 2) , (3, 3) , (3, 6) , (4, 4) , (5, 1) , (5, 5) , (6, 2) , (6, 3) , (6, 6)}
Find the partition of A induced by R, that is, find the equivalence classes of R. /5 marks
b) Find the inverse Laplace transform of $F(s) = \frac{2}{s^2(s^2 + 1)}$ /5 marks

c) Using the properties of determinants, show that

\[
\begin{vmatrix}
a & b & c \\
a^2 & b^2 & c^2 \\
b + c & c + a & a + b
\end{vmatrix}
= (a - b)(b - c)(c - a)(a + b + c)
\]

d) Evaluate
i. $\int \sin^2 x \cos^4 x \, dx$
ii. $\int \frac{dx}{x^2\sqrt{16 - x^2}}$ /5 marks
Question Four /25 marks
a) i. Find the Fourier series of the following periodic function /10 marks
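ii. Deduce that $\frac{\pi^2}{8} = \frac{1}{1^2} + \frac{1}{3^2} + \frac{1}{5^2} + \frac{1}{7^2} + \ldots$ /5 marks
b) Solve the following PDE using the separation of variables method: /5 marks

\[ 2x\frac{\partial u}{\partial x} - 3y\frac{\partial u}{\partial y} = 0 \]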
12. Exam for Assistant Lecturer in Mathematics
at IPRC Kigali
Q1. Evaluate the sum of n terms of the series 32 + 52 + 72 + . . . /3 marks
Q2. Find the fourth roots of the complex number z = −2 + 3i /2 marks
Q3. Find the derivative of $f(x) = (\cos x)^{\sin x}$ /2 marks
Q4. Find the form of the string at the moment $t = \frac{\pi}{2a}$ if its vibration is given by: /3 marks

\[ \frac{d^2 u}{dt^2} = a^2 \frac{d^2 u}{dx^2} \quad \text{and} \quad u(0) = \sin x,\ u_t(0) = 1 \]

Q5. Let G(x, y) be given by G(x, y) = xy for discrete random variables x and y with the joint
probability distribution given by the joint probabilities P [X = a, Y = b]

x \ y    0      1      2
0        0      1/4    0
1        1/4    0      1/4
2        0      1/4    0

Table 12.1: Joint probability distribution

Compute E[G(x, y)] /4 marks


Q6. A. Given the parabola $y^2 = 4ax$, find the coordinates (R, S) of the centre of curvature at the
point $P(at^2, 2at)$.
B. If $\theta = KHLV^{1/2}$, where K is a constant, and there are possible errors of ±1 percent in measuring
H, L and V, find the maximum possible error in the calculated value of θ /3 marks
Q7. A. Find the sum of n terms of /5 marks
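\[ \frac{1}{1 \times 2 \times 3} + \frac{3}{2 \times 3 \times 4} + \frac{5}{3 \times 4 \times 5} + \frac{7}{4 \times 5 \times 6} + \ldots \]

B. i. Find the Laplace transform of the function ii. Find the inverse Laplace transform of
\[ F(s) = \frac{7s^2 + 27}{s^3 + 9s} \]
/3 marks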
Q8. Calculate the root mean square value (rms) of: /4 marks
\[ i = 20 + 100 \sin 100\pi t, \quad 0 < t < \frac{1}{50} \]

Q9. A. Prove that (2x + y)dy + (x + 2y)dx = 0 given that (x − y)3 = Ax + Ay /3 marks
B. Determine all the asymptotes of the following curve /3 marks

\[ xy^2 - x^2 y + x + y = 2 \]

Q10. Find the Green's function for the ODE /3 marks

\[ \frac{d^2 u}{dx^2} + u = f(x), \quad u(0) = 0, \quad u(L) = 0, \quad \text{where } L \neq n\pi \text{ and } n \text{ is an integer.} \]
13. Exam for Statistician at SENAT
Question 1
(a) Define briefly, in your own words, the following general concepts often used by statisticians in any
study.
(i) Population /1mark
(ii) Descriptive statistics /1mark
(iii) Inferential statistics /1mark
(iv) Census /1mark
(v) Sample survey /1mark
(b) The Rwanda Senate organization conducted a telephone survey with a randomly selected national
sample of 1005 adults, 18 years and older. The survey asked the respondents, ”How would you describe
your own physical health at this time?” Response categories were Excellent, Good, Only Fair, and
No Option.
(i) What was the sample size for this survey? /1 mark
(ii) Are the data qualitative or quantitative? /1 mark
(iii) Would it make more sense to use averages or percentages as a summary of the data for
these questions? /1 mark
(iv) Of the respondents, 29% said their health was excellent. How many individuals provided
this response? /2 mark
(c) Consider the following frequency distribution

Class Frequency
10 − 19 10
20 − 29 14
30 − 39 17
40 − 49 7
50 − 59 2

Table 13.1: Grouped data

Construct a cumulative frequency distribution and a cumulative relative frequency distribution. /5 marks


Question 2
(a) In automobile mileage and gasoline-consumption testing in Rwanda, 13 automobiles were road
tested for 300 km in both city and highway driving conditions. The following data were recorded for
kilometers-per-gallon performance.


city 16.2 16.7 15.9 14.4 13.2 15.3 16.8 16.0 16.1 15.3 15.2 15.3 16.2
Highway 19.4 20.6 18.3 18.6 19.2 17.4 17.2 18.6 19.0 21.1 19.4 18.5 18.7

Table 13.2: Automobile mileage and gasoline-consumption

Use the mean, median and mode to make a statement about the difference in performance for city
and highway driving. /10 marks
(b) Consider a sample with data values 27, 25, 20, 15, 30, 34, 28, and 25. Provide the five-number
summary for the data.
Question 3
A sales manager collected the following data on annual sales and years of experience.

Sales persons Years of experience Annual Sales($1000)


1 1 80
2 3 97
3 4 92
4 4 102
5 6 103
6 8 111
7 10 119
8 10 123
9 11 117
10 13 136

Table 13.3: Annual sales and years of experience

(i) Develop a scatter diagram for these data with years of experience as independent variable.
/2 marks
(ii) Develop an estimated regression equation that can be used to predict annual sales given the
years of experience. /4 marks
(iii) Use the estimated regression equation to predict annual sales for a person with 9 years of
experience. /2 marks

Question 4
Five observations taken for two variables are as follows

x 6 11 15 21 27
y 6 9 6 17 12

Table 13.4: Given two variables data

(i) Develop scatter diagram for these data. /2 marks


(ii) What does the scatter diagram indicate about the relationship between x and y? /2 marks
(iii) Compute and interpret the sample covariance. /3 marks
(iv) Compute and interpret the sample correlation coefficient. /3 marks

GOOD LUCK
14. Exam for Data Entry Clerk at REB
1. What is the language that is used to define the structure of the relation, deleting relations and
relating schemas?
2. What is the language that produces the ability to query information from the database and to
insert tuples into, delete tuples from, and modify tuples in the database?
3. What are the "CIA" Triad and "Defense-in-depth"?
4. What is U2F? Why, how and where is it needed?
5. With supporting arguments, explain and enumerate the functions of 3FA.
6. In Microsoft Excel, how do you find and remove duplicates in the data?
7. What do you understand by macros in Microsoft Excel? When can macros be used in MS
Excel?
8. What is a Pivot Table? How and where is it helpful?
9. In MS Excel, what are "Conditional formatting", "Data Validation" and "Consolidate", and what are
their respective features?
10. How can slicers in MS Excel (MS Office 2010) help the user?
11. What are Calculated fields in Pivot Table?
12. How many data formats are available in Excel? Name five(5) of them.
13. Specify the order of operations used for evaluating formulas in Excel.
14. Which are the two macro languages in MS-Excel?
15. How can you prevent someone from copying the cells from your worksheet?
16. Explain five useful functions to manipulate data in Excel?
17. What are three report formats that are available in Excel?
18. How would you provide a Dynamic range in ”Data source” of Pivot Table?
19. Difference between COUNT, COUNTA, COUNTIF, LOOKUP, and COUNTBLANK in Excel?
20. How can you create shortcuts to Excel functions?
21. What is indexing and what are the different kinds of indexing?
22. What are some best practices when creating complex models in Excel?
23. Describe the three levels of data abstraction?
24. How does Tuple-oriented relational calculus differ from domain-oriented relational calculus?
25. What is a data mart?

Good Luck!

15. Pretended
QUESTION 1
Below are the data on the weight (in Kilograms, kgs) of 12 children received at Health Facility for
medical consultation.

15 9 11 10 12 18 10 21 8 9 14 10

Table 15.1: The weight of 12 children

1.1 Draw a frequency distribution table including the frequencies, cumulative frequencies, relative
frequencies and cumulative relative frequencies. Calculate:
1.2 The percentage of children who have a weight less than or equal to 11 Kg
1.3 The percentage of children who have a weight less than or equal to 14 Kg
1.4 The percentage of children who have a weight less than or equal to 18 Kg
1.5 The percentage of children who have a weight between 9 and 11 Kg
1.6 The mean weight
1.7 The median weight
1.8 The mode
1.9 The variance
1.10 The standard deviation
1.11 The range
1.12 The coefficient of variation
1.13 The standard error of the mean
1.14 Calculate a 95% confidence interval for the mean (use the value provided in the t-distribution table
with α = 0.05)
1.15 What does this interval tell you?


QUESTION 2
The hematologic data in the table below are given for nine patients with aplastic anaemia.

Patient number % Reticulocytes Lymphocytes (mm²)


1 3.6 1700
2 2 3078
3 0.3 1820
4 0.3 2706
5 0.2 2086
6 3 2299
7 0 676
8 1 2088
9 2.2 2013

Table 15.2: The Reticulocytes and Lymphocytes of 9 patients

2.1 Compute the slope (β)


2.2 Compute the intercept (α)
2.3 Write down the regression line equation relating the percentage of reticulocytes (x) to the number
of lymphocytes (y)
2.4 What is the predicted value of lymphocytes for x = 3.7% reticulocytes?
2.5 Using the F test, test for the statistical significance of the regression line after stating H0 and H1 .
2.6 Compute the coefficient of determination (R2 )
2.7 What does $R^2$ mean?
QUESTION 3
The mean serum creatinine level measured in 12 patients 24 hours after they received a newly proposed
antibiotic was 1.2 mg/dL. If the mean and standard deviation of serum creatinine in the general
population are 1.0 and 0.4 mg/dL, respectively, then, using a significance level of 0.05, test whether the
mean serum creatinine level in this group is different from that of the general population. State H0 and
H1.
QUESTION 4
The State Health Department wants to examine data on weight and gender in high school students.
The table below shows the numbers of normal-weight and overweight students among the males and females
in a sample of 1747 students. Test the association between gender and weight given that the Chi-square
value corresponding to a p-value of 0.05 is 3.84 with df = 1.

Normal weight Overweight Total


Male 570 380 950
Female 681 116 797
Total 1251 496 1747

Table 15.3: Weight by gender of students

4.1 State the null hypothesis (H0 ) and its alternative (H1 ).
4.2 Test the association between gender and weight given that the Chi-square value corresponding
to a p-value of 0.05 is 3.84 with df = 1.
