
UNIVERSITY FOR INFORMATION SCIENCE AND TECHNOLOGY

Ohrid University, Macedonia


Probability and Statistics
- First mid-term exam - 05.04.2022 -

1. (20) Blood proteins in children from Papua New Guinea. C-reactive protein (CRP) is a substance that can be
measured in the blood. Values increase substantially within 6 hours of an infection and reach a peak within 24 to 48
hours after. In adults, chronically high values have been linked to an increased risk of cardiovascular disease. In a
study of apparently healthy children aged 6 to 60 months in Papua New Guinea, CRP was measured in 90 children.
The units are milligrams per liter (mg/l). Here are the data from a random sample of 40 of these children:

After grouping the data into 6 class intervals with equal width:
a) Make a summary table for the frequency, relative frequency and “less than” grouped cumulative frequency
distribution for the given data.
b) Make a frequency histogram.
c) Find the approximate sample mean, approximate mode, approximate median and approximate sample variance.
Explain how the relationship between the mean and the median reflects the shape of the distribution.
d) Give the five-number summary and explain briefly how it reflects the shape of the distribution.

2. (20) In my town, it's rainy one third of the days. Given that it is rainy, there will be heavy traffic with probability 1/2,
and given that it is not rainy, there will be heavy traffic with probability 1/4. If it's rainy and there is heavy traffic, I
arrive late for work with probability 1/2. On the other hand, the probability of being late is reduced to 1/8 if it is not
rainy and there is no heavy traffic. In other situations (rainy and no traffic, not rainy and traffic) the probability of
being late is 0.25. You pick a random day.
a) What is the probability that it's not raining and there is heavy traffic and I am not late?
b) What is the probability that I am late?
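
Both parts can be checked by multiplying along the branches of a probability tree (rain, then traffic, then lateness) and, for part b), summing over the four rain/traffic combinations. The sketch below is one way to do that arithmetic exactly. It assumes the fractions in the statement read 1/2, 1/4, 1/2 and 1/8 in that order; the variable and function names are illustrative only.

```python
from fractions import Fraction as F

p_rain = F(1, 3)
# P(heavy traffic | rain status); key True means "rainy"
p_traffic = {True: F(1, 2), False: F(1, 4)}
# P(late | rain status, traffic status)
p_late = {
    (True, True): F(1, 2),
    (True, False): F(1, 4),
    (False, True): F(1, 4),
    (False, False): F(1, 8),
}

def joint(rain, traffic):
    """P(rain status and traffic status) from the multiplication rule."""
    pr = p_rain if rain else 1 - p_rain
    pt = p_traffic[rain] if traffic else 1 - p_traffic[rain]
    return pr * pt

# a) not rainy, heavy traffic, not late
part_a = joint(False, True) * (1 - p_late[(False, True)])

# b) law of total probability over the four rain/traffic cases
part_b = sum(joint(r, t) * p_late[(r, t)]
             for r in (True, False) for t in (True, False))

print(part_a, part_b)  # 1/8 11/48
```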

3. (20) Of the 16 071 degrees in mathematics given by U.S. colleges and universities in a recent year, 73% were
bachelor’s degrees, 21% were master’s degrees, and the rest were doctorates. Moreover, women earned 48% of the
bachelor’s degrees, 42% of the master’s degrees, and 29% of the doctorates. You choose a mathematics degree at
random and find that it was awarded to a woman. What is the probability that it is a bachelor’s degree?
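
This is a direct application of Bayes's theorem, with the law of total probability supplying the denominator. A minimal sketch of the calculation follows, using exact fractions; the dictionary names are illustrative, and the doctorate share is taken as the remainder after bachelor's and master's degrees.

```python
from fractions import Fraction as F

# Share of each degree type; doctorates are "the rest"
prior = {"bachelor": F(73, 100), "master": F(21, 100)}
prior["doctorate"] = 1 - sum(prior.values())

# P(woman | degree type), as given in the question
p_woman_given = {
    "bachelor": F(48, 100),
    "master": F(42, 100),
    "doctorate": F(29, 100),
}

# Law of total probability: P(woman)
p_woman = sum(prior[d] * p_woman_given[d] for d in prior)

# Bayes's theorem: P(bachelor | woman)
posterior = prior["bachelor"] * p_woman_given["bachelor"] / p_woman

print(p_woman, posterior, round(float(posterior), 4))
```
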
4. (15) Suppose that on a given weekend the number of accidents at a certain intersection has the Poisson distribution
with mean 0.8. What is the probability that there will be at least three accidents at the intersection during the
weekend?
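
One way to evaluate this is through the complement rule, P(X ≥ 3) = 1 − [P(X = 0) + P(X = 1) + P(X = 2)], with the Poisson probability mass function. The sketch below uses only the standard library; `poisson_pmf` is a helper name chosen for illustration.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 0.8
# Complement rule: P(X >= 3) = 1 - [P(0) + P(1) + P(2)]
p_at_least_3 = 1 - sum(poisson_pmf(k, lam) for k in range(3))
print(round(p_at_least_3, 4))  # 0.0474
```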

5. (25) Let X be a random variable with probability density function

$$f_X(x) = \begin{cases} c\,(1 - x^2), & -1 < x < 1 \\ 0, & \text{otherwise} \end{cases}$$

where c is a constant. Find the following:


a) The value of c.
b) The cumulative distribution function of X.
c) Pr(X > 0).
d) Pr(X < 0.5 | X > -0.5).
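
The four parts follow from integrating the density. The sketch below is only a check of the arithmetic, and it rests on an assumption: that the density reads c(1 − x²) on −1 < x < 1 and 0 elsewhere. It integrates the polynomial exactly with fractions; the function names are illustrative.

```python
from fractions import Fraction as F

def antiderivative(x):
    """Antiderivative of (1 - x^2), i.e. x - x^3/3, evaluated exactly."""
    x = F(x)
    return x - x**3 / 3

# a) Normalisation: c * integral of (1 - x^2) over (-1, 1) must equal 1
c = 1 / (antiderivative(1) - antiderivative(-1))

def cdf(x):
    """b) F(x) = 0 for x <= -1, 1 for x >= 1, else c * integral from -1 to x."""
    x = F(x)
    if x <= -1:
        return F(0)
    if x >= 1:
        return F(1)
    return c * (antiderivative(x) - antiderivative(-1))

# c) Pr(X > 0)
p_c = 1 - cdf(0)

# d) Pr(X < 0.5 | X > -0.5) = Pr(-0.5 < X < 0.5) / Pr(X > -0.5)
half = F(1, 2)
p_d = (cdf(half) - cdf(-half)) / (1 - cdf(-half))

print(c, p_c, p_d)  # 3/4 1/2 22/27
```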

6. (5) A survey of students in an introductory statistics class asked the following questions: (a) how much did you spend
on food last week? (b) height; (c) do you like broccoli? (yes, no). Classify each of these variables as categorical or
quantitative and give reasons for your answers.

7. (10) At the end of a statistics course, the 27 students in the class were asked to rate the instructor on a number scale
of 1 to 9 (1 being "very poor," and 9 being "best instructor I've ever had"). Assume that the average rating in each of
the three classes is 5 (which should be visually reasonably clear from the histograms), and recall the interpretation of
the SD as a "typical" or "average" distance between the data points and their mean. Here are the histograms of the
data:

Judging from the table and the histograms, which class would have the largest standard deviation?

8. (5) A recent survey asked 90 students, "How many hours do you spend on the computer in a typical day?" Of the 90
respondents, 3 said 1 hour, 5 said 2 hours, 15 said 3 hours, 25 said 4 hours, 20 said 5 hours, 15 said 6 hours, 5 said 7
hours, 1 said 8 hours, and 1 said 9 hours. What is the mode of the number of hours spent on the computer?
a) 1 b) 4 c) 9 d) 5
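
The mode is the value with the largest frequency, so it can be read straight off the frequency table. A small sketch, with the table entered as a dictionary (the name `counts` is illustrative):

```python
# hours per day -> number of respondents, as listed in the question
counts = {1: 3, 2: 5, 3: 15, 4: 25, 5: 20, 6: 15, 7: 5, 8: 1, 9: 1}

total = sum(counts.values())  # should account for all 90 respondents

# The mode is the value with the highest frequency
mode_hours = max(counts, key=counts.get)
print(total, mode_hours)  # 90 4
```
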
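• Total Probability: $P(A) = \sum_{i} P(A \mid B_i)\,P(B_i)$, where the events $B_1, B_2, \ldots, B_k$ form a partition of the sample space.

• Bayes’s Theorem: $P(B_j \mid A) = \dfrac{P(A \mid B_j)\,P(B_j)}{\sum_{i} P(A \mid B_i)\,P(B_i)}$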

• Approximate Mode $= b + \dfrac{f_2 - f_1}{(f_2 - f_1) + (f_2 - f_3)} \cdot w$

b is the lower boundary of the modal class,
f1, f2, f3 are the frequencies of the class that precedes the modal class, of the modal class, and of the class that
follows the modal class in the distribution, respectively, and w is the class width.
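
As an illustration of how the approximate mode formula is applied, the sketch below evaluates it for one made-up modal class. The numbers are hypothetical and are not taken from any exam data; `approximate_mode` is an illustrative name.

```python
def approximate_mode(b, f1, f2, f3, w):
    """Grouped-data mode: b + (f2 - f1) / ((f2 - f1) + (f2 - f3)) * w."""
    return b + (f2 - f1) / ((f2 - f1) + (f2 - f3)) * w

# Hypothetical modal class [10, 15) with frequency 12,
# preceded by a class of frequency 5 and followed by one of frequency 8.
mode_estimate = approximate_mode(b=10, f1=5, f2=12, f3=8, w=5)
print(round(mode_estimate, 4))
```
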
• Approximate Median $= b + \dfrac{\frac{n}{2} - \sum f_1}{f_{me}} \cdot w$

b is the lower boundary of the median class,

$\sum f_1$ is the cumulative frequency of the classes that precede the median class,

$f_{me}$ is the frequency of the median class, and w is the class width.
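
The approximate median formula can be applied in the same way once the median class has been located (the first class whose cumulative frequency reaches n/2). The sketch below uses made-up numbers; `cum_before` stands for the cumulative frequency of the classes preceding the median class, and all names are illustrative.

```python
def approximate_median(b, n, cum_before, f_me, w):
    """Grouped-data median: b + (n/2 - cum_before) / f_me * w."""
    return b + (n / 2 - cum_before) / f_me * w

# Hypothetical sample of n = 40 whose median class [10, 15) has frequency 12,
# with 15 observations falling in the classes below it.
median_estimate = approximate_median(b=10, n=40, cum_before=15, f_me=12, w=5)
print(round(median_estimate, 4))
```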

• Let $(x_i, f_i)$, $i = 1, 2, \ldots, k$, be a frequency distribution of the sample $X_1, X_2, \ldots, X_n$ for the variable X with
a set of observation values $x_1, x_2, \ldots, x_n$. Then the arithmetic mean of the sample, or sample mean, is
obtained by the formula: $\bar{x} = \dfrac{1}{n} \sum_{i=1}^{k} f_i \, x_i$.

• The sample variance, denoted by $s^2$, of a set of n observed values having a mean $\bar{x}$ is the sum of the
squared deviations divided by n − 1:

$$s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \quad \text{or} \quad s^2 = \frac{1}{n-1} \left( \sum_{i=1}^{n} x_i^2 - n\bar{x}^2 \right).$$

NOTE: To determine the approximate mean and variance from a grouped frequency distribution, first
obtain the class marks (midpoints) of the class intervals and then use the previous formulas for the sample mean and
variance, with each class mark weighted by its class frequency.
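
The procedure in the note can be sketched as follows: compute the class marks, then apply the frequency-weighted versions of the mean and variance formulas. The class intervals and frequencies below are made up for illustration, and `grouped_mean_variance` is an illustrative name.

```python
def grouped_mean_variance(class_limits, freqs):
    """Approximate sample mean and variance from a grouped distribution."""
    # Class mark = midpoint of each class interval
    marks = [(lo + hi) / 2 for lo, hi in class_limits]
    n = sum(freqs)
    mean = sum(f * x for f, x in zip(freqs, marks)) / n
    # Shortcut form: (sum of f * x^2 - n * mean^2) / (n - 1)
    sum_sq = sum(f * x**2 for f, x in zip(freqs, marks))
    variance = (sum_sq - n * mean**2) / (n - 1)
    return mean, variance

# Hypothetical grouped data: three classes of width 5
classes = [(0, 5), (5, 10), (10, 15)]
freqs = [2, 5, 3]
mean, variance = grouped_mean_variance(classes, freqs)
print(mean, round(variance, 4))
```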

• The position of $Q_1$ is $0.25\,(n + 1)$

• The position of $Q_3$ is $0.75\,(n + 1)$
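
The positions above refer to the ordered sample. The sketch below turns them into a five-number summary under two assumptions that the formula sheet does not state: the median is taken at position 0.5(n + 1), and a fractional position is resolved by linear interpolation between the two neighbouring ordered values. The data and function names are hypothetical.

```python
import math

def value_at_position(sorted_data, pos):
    """Value at a 1-based, possibly fractional position (linear interpolation)."""
    n = len(sorted_data)
    pos = min(max(pos, 1), n)  # clamp to the valid range 1..n
    k = math.floor(pos)
    frac = pos - k
    if k == n:
        return sorted_data[-1]
    return sorted_data[k - 1] + frac * (sorted_data[k] - sorted_data[k - 1])

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) using the (n + 1) position rule."""
    xs = sorted(data)
    n = len(xs)
    q1 = value_at_position(xs, 0.25 * (n + 1))
    med = value_at_position(xs, 0.50 * (n + 1))
    q3 = value_at_position(xs, 0.75 * (n + 1))
    return xs[0], q1, med, q3, xs[-1]

print(five_number_summary([7, 1, 5, 3]))  # (1, 1.5, 4.0, 6.5, 7)
```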

Mean: $\mu = E[X] = \sum_{x \in R_X} x \, p(x)$

Variance: $\sigma^2 = \sum_{x \in R_X} (x - \mu)^2 \, p(x) = E[X^2] - (E[X])^2$

where: $E[X^2] = \sum_{x \in R_X} x^2 \, p(x)$

Standard deviation: $\sigma = \sqrt{\sigma^2}$
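
As a quick check that the definition of the variance and the shortcut form agree, the sketch below evaluates both for a fair six-sided die. The die is an illustrative choice and does not come from the exam.

```python
from fractions import Fraction as F
import math

# Probability mass function of a fair die: p(x) = 1/6 for x = 1..6
pmf = {x: F(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())                    # E[X]
ex2 = sum(x**2 * p for x, p in pmf.items())                  # E[X^2]
var_def = sum((x - mean) ** 2 * p for x, p in pmf.items())   # definition
var_short = ex2 - mean**2                                    # shortcut form
sd = math.sqrt(var_short)

print(mean, var_def, var_short)  # 7/2 35/12 35/12
```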
