STAT assignment - Worksheet for Introduction to Statistics

Introduction to Statistics (University of the People)


Addis Ababa University
College of Natural and Computational Sciences
Department of Statistics
Probability & Statistics for Engineers (Stat 2171) Worksheet I
GROUP MEMBERS
NAME ID
1. MIKIYAS MOHAMMED UGR/3095/12
2. MULUKEN WALLE UGR/9931/12
3. YASIN MOHAMMED UGR/4544/12
4. GELETA MATHEWAS UGR/4712/12
5. DEGSEW ABEBAW UGR/1341/12
6. TAMIRAT LEMESSA UGR/5507/12
SUBMITTED TO: Ms. SELOME BEKELE
SUBMISSION DATE: MAY 4, 2021

1. Distinguish the following statistical terms with examples:

a) Descriptive and inferential statistics

➢ Descriptive statistics: the branch of statistics concerned with the collection, organization, summarization, and presentation of data.

➢ Inferential statistics: the branch of statistics concerned with generalizing from samples to populations, performing estimation and hypothesis testing, determining relationships among variables, and making predictions.

Example: you might stand in a mall and ask a sample of 100 people if they like shopping at Sears. You
could make a bar chart of yes or no answers (that would be descriptive statistics) or you could use your
research (and inferential statistics) to reason that around 75-80% of the population (all shoppers in all
malls) like shopping at Sears.
b) Sample and population
➢ Sample: a part of the population, selected in a statistical manner, that is used to study the population.
➢ Population: the totality of things, objects, people, etc. about which information is being collected.

Example: Say you are looking for a job in the IT sector, so you search online for IT jobs. The first search
result would be for jobs all around the world. But you want to work in Ethiopia, so you search for IT jobs
in Ethiopia. This would be your population. It would be impossible to go through and apply for all
positions in the listing. So you consider the top 30 jobs you are qualified for and satisfied with and apply
for those. This is your sample.
c) Parameter and statistic
➢ Parameter: a statistical value that describes a characteristic of the population, i.e. a result obtained from the whole population.
Example: a parameter can describe the mean amount of loans awarded to the students of ABC University. Assuming that the university has 3,000 students, the researcher can start by calculating the financial aid of a few samples of about 10 students each. With three samples of 10 students each, the researcher may obtain sample means of $2,000, $1,200, and $800. The researcher can use these sample means to make an inference about the population parameter.

➢ Statistic: a statistical value that describes a characteristic of the sample, i.e. a result obtained from the sample. Example: the mortality rate of a given country, when it is estimated from a sample of its population.
d) Census and sample survey
➢ Census: a complete enumeration of the population.
➢ Sample survey: a method of collecting data from or about a subset (sample) of the population members so that inferences about the entire population can be drawn.
Example: doing a survey of travel time by asking everyone at school is a census of the school. But asking
only 50 randomly chosen people is a sample.
e) Quantitative and qualitative variables
➢ Quantitative variables: variables that can be measured numerically and can be ordered or ranked.
➢ Qualitative variables: variables that can be placed into distinct categories, classified according to an attribute that cannot be measured or expressed in numbers.

Examples of quantitative characteristics are age, BMI, and time from birth to death. Examples of
qualitative characteristics are gender, race, genotype and vital status. Qualitative variables are also called
categorical variables.
f) Continuous and discrete variables
➢ Discrete: a variable that takes only whole-number values. It is usually obtained by counting. There is a gap between consecutive values, so it varies only by finite jumps.
Example: the number of students in a class, the number of chairs in a class, the number of houses along a street.
➢ Continuous: a variable that can take any real value between two given real values. Between any two values there are infinitely many possible values, so a continuous variable is measured rather than counted. For example, age cannot be counted exactly: it could be 37 years, 9 months, 6 days, 5 hours, 4 seconds, 5 milliseconds, 6 nanoseconds, 77 picoseconds, and so on.

g) Nominal and ordinal scales of measurement

Both apply to qualitative data, but:
➢ Nominal: this level of measurement classifies data into non-overlapping and exhaustive categories on which no order or ranking can be imposed. No arithmetic or relational operations can be applied.

Examples: classifying students according to their field of study; categorizing survey subjects as male or female; religion (Christianity, Islam, Judaism, etc.)
➢ Ordinal: this level of measurement classifies data into categories that can be ranked; however, precise differences between the ranks do not exist. Arithmetic operations are not applicable, but relational operations are.

Examples: ratings (Excellent, Very good, Good, Fair, Poor); letter grades (A, B, C, D, F)

h) Interval and ratio scales of measurement

Both apply to quantitative data, but:
➢ Interval: this level of measurement ranks data, and precise differences between units of measure do exist; however, there is no meaningful zero. All arithmetic operations except division and multiplication are applicable. Relational operations are also possible.
Examples of interval variables: temperature (Fahrenheit), temperature (Celsius), pH, SAT score (200-800), credit score (300-850).
➢ Ratio: this level of measurement possesses all the characteristics of interval measurement, and there exists a true zero. All arithmetic and relational operations are possible.
Examples of ratio variables: enzyme activity, dose amount, reaction rate, flow rate, concentration, pulse, weight, length, temperature in Kelvin (0.0 Kelvin really does mean “no heat”), and survival time.

2. Classify the following sentences as belonging to the area of descriptive statistics or inferential statistics.
a) As a result of recent cutbacks by oil-producing nations, we can expect the price of gasoline to double in the next year.
➢ Inferential

b) At least 5% of all killings reported last year in city X were due to tourists.
➢ Descriptive

c) Of all patients who received this particular type of drug at clinic Y, 75% later developed significant side effects.
➢ Descriptive

d) Adane concludes that his chance of passing the first year this academic year is at least 80% based on the statistics that 75% of the freshmen passed last year.
➢ Inferential

3. Classify each of the following first as qualitative or quantitative and second as a nominal, ordinal, interval or ratio measure.
a) Monthly income of persons.
➢ Quantitative and ratio

b) Socio-economic status of a family when classified as low, middle and upper class.
➢ Qualitative and ordinal

c) Temperature inside 10 refrigerators.
➢ Quantitative and interval

d) Classification of marital status as single, married, divorced and widowed.
➢ Qualitative and nominal

e) Times for swimmers to complete a 50-meter race.
➢ Quantitative and ratio

f) Months of the year Meskerm, Tikimit, …
➢ Qualitative and ordinal

g) Region numbers of Ethiopia (1, 2, 3 etc.)
➢ Qualitative and nominal

h) The number of students in a college.
➢ Quantitative and ratio

i) The net wages of a group of workers.
➢ Quantitative and ratio

j) The expansion of a rod of metal when heated.
➢ Quantitative and ratio

k) The height of the men in the same town.
➢ Quantitative and ratio

l) For 16 persons arrested for driving while intoxicated you record whether they live in urban, suburban, or rural areas.
➢ Qualitative and nominal

7. Suppose data collected for heights (in cm) of 390 cows were tabulated in a frequency distribution and the following results were obtained.

Class mark   112  117  122  127  132  137  142  147  152
Frequency      6   25   48   72  116   60   38   22    3

a) Construct a complete frequency distribution with class limits, class boundaries, frequencies, and the less-than type cumulative frequencies (LCF).

Class limits   Class boundaries   Class mark   Frequency   LCF
110 – 114      109.5 – 114.5      112            6           6
115 – 119      114.5 – 119.5      117           25          31
120 – 124      119.5 – 124.5      122           48          79
125 – 129      124.5 – 129.5      127           72         151
130 – 134      129.5 – 134.5      132          116         267
135 – 139      134.5 – 139.5      137           60         327
140 – 144      139.5 – 144.5      142           38         365
145 – 149      144.5 – 149.5      147           22         387
150 – 154      149.5 – 154.5      152            3         390

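b) Determine a height above which 50% of the cows are found.

➢ The height above which 50% of the cows lie is the median, so we use the median formula for grouped data:

\[ Md = L_m + \left(\frac{n/2 - cf_b}{f}\right) w \]

where L_m is the lower boundary of the median class, cf_b the cumulative frequency before it, f its frequency, and w the class width.

n/2 = 390/2 = 195 (the halfway point), so the median class is the one containing the 195th item: 129.5 – 134.5.

\[ Md = 129.5 + \left(\frac{390/2 - 151}{116}\right) 5 \approx 129.5 + 1.9 = 131.4 \]

The mode is found from the modal class 129.5 – 134.5 with d_1 = 116 − 72 = 44 and d_2 = 116 − 60 = 56:

\[ M = L_m + \left(\frac{d_1}{d_1 + d_2}\right) w = 129.5 + \left(\frac{44}{44 + 56}\right) 5 = 129.5 + 2.2 = 131.7 \]

131.7 is the mode.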
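The grouped-data median and mode formulas can be checked numerically. The following is a minimal Python sketch, assuming equal class widths and a single modal class; the function names `grouped_median` and `grouped_mode` are illustrative and do not come from the worksheet.

```python
def grouped_median(lower_bounds, freqs, width):
    """Median of grouped data: Lm + ((n/2 - cfb) / f) * w."""
    n = sum(freqs)
    cum = 0  # cumulative frequency before the current class (cfb)
    for lower, f in zip(lower_bounds, freqs):
        if cum + f >= n / 2:  # this class contains the (n/2)-th item
            return lower + (n / 2 - cum) / f * width
        cum += f


def grouped_mode(lower_bounds, freqs, width):
    """Mode of grouped data: Lm + (d1 / (d1 + d2)) * w."""
    i = freqs.index(max(freqs))  # index of the modal class
    before = freqs[i - 1] if i > 0 else 0
    after = freqs[i + 1] if i + 1 < len(freqs) else 0
    d1, d2 = freqs[i] - before, freqs[i] - after
    return lower_bounds[i] + d1 / (d1 + d2) * width


# Lower class boundaries 109.5, 114.5, ..., 149.5 and the cow-height frequencies
lower = [109.5 + 5 * k for k in range(9)]
freq = [6, 25, 48, 72, 116, 60, 38, 22, 3]

median = grouped_median(lower, freq, 5)
mode = grouped_mode(lower, freq, 5)
print(round(median, 2), round(mode, 2))
```

Both values fall in the class 129.5 – 134.5, which has the largest frequency and contains the 195th item.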
e) Draw a histogram, frequency polygon and less-than type ogive for the above data.

[Figure: Histogram. Frequency (0 to 140) against class boundaries 109.5–114.5, ..., 149.5–154.5]

[Figure: Frequency polygon. Frequency (0 to 140) against class marks 112, 117, ..., 152]

[Figure: Less-than type ogive. Less-than cumulative frequency (0 to 450)]

11. Suppose the average salary of male employees is 520 Birr and that of females is 420 Birr.
The mean salary of all employees is 500 Birr. Find the ratio of the number of male and female
employees.

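Let N_m and N_f be the numbers of male and female employees, and S_m and S_f the total salaries of the male and female employees.

\[ \bar{X}_m = \frac{S_m}{N_m} \;\Rightarrow\; 520 = \frac{S_m}{N_m} \;\Rightarrow\; S_m = 520 N_m \]
\[ \bar{X}_f = \frac{S_f}{N_f} \;\Rightarrow\; 420 = \frac{S_f}{N_f} \;\Rightarrow\; S_f = 420 N_f \]
\[ \bar{X}_s = \frac{S_f + S_m}{N_f + N_m} = \frac{420 N_f + 520 N_m}{N_f + N_m} = 500 \]
\[ 500 (N_f + N_m) = 420 N_f + 520 N_m \]
\[ 500 N_f + 500 N_m = 420 N_f + 520 N_m \]
\[ 80 N_f = 20 N_m \]
\[ \frac{N_m}{N_f} = \frac{80}{20} = \frac{4}{1} \]

So the ratio of male to female employees is 4 : 1.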
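The same ratio follows directly from the combined-mean relation. Here is a small Python sketch; the helper `group_ratio` is a hypothetical name introduced for illustration, and it assumes the combined mean lies strictly between the two group means.

```python
from fractions import Fraction


def group_ratio(mean_a, mean_b, mean_all):
    """Return N_a / N_b given the two group means and the combined mean.

    From mean_all * (N_a + N_b) = mean_a * N_a + mean_b * N_b it follows that
    N_a / N_b = (mean_all - mean_b) / (mean_a - mean_all).
    """
    return Fraction(mean_all - mean_b, mean_a - mean_all)


ratio = group_ratio(520, 420, 500)  # males : females
print(ratio)  # 4
```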

13. The price of a commodity increased by 5% from 1996 to 1997, by 8% from 1998 to 1999 and
by 77% from 2000 to 2001. What was the average yearly price increase?

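\[ G.M = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} \]
\[ G.M = \sqrt[3]{1.05 \times 1.08 \times 1.77} \approx 1.26 \]
Average yearly increase = G.M − 1 = 1.26 − 1 = 0.26 = 26%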
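The geometric-mean calculation can be reproduced in a few lines of Python. This is a sketch only; `average_growth_rate` is an illustrative name, and the three increases are treated as successive growth factors 1.05, 1.08 and 1.77.

```python
import math


def average_growth_rate(rates):
    """Geometric-mean growth rate for a list of fractional increases."""
    factors = [1 + r for r in rates]  # 5% -> 1.05, and so on
    return math.prod(factors) ** (1 / len(factors)) - 1


rate = average_growth_rate([0.05, 0.08, 0.77])
print(f"{rate:.0%}")  # 26%
```

The arithmetic mean of the three percentages would be 30%, which overstates the growth: the geometric mean is the right average for multiplicative changes.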
14. Suppose you have been given the following distribution.

Class limits   Frequency   cf
0-9            4           4
10-19          16          20
20-29          F3          20 + F3
30-39          F4          20 + F3 + F4
40-49          F5          20 + F3 + F4 + F5
50-59          6           26 + F3 + F4 + F5
60-69          4           30 + F3 + F4 + F5 = 230
Total          230

If the median and mode of the distribution are given to be 33.5 and 34.0 respectively, then:
a) Determine the missing frequencies.

\[ Md = L_m + \left(\frac{n/2 - cf_b}{f}\right) w \]
Since the median is 33.5, the median class is 30-39, so L_m = 29.5; the class width is w = 10 and n/2 = 115.
\[ 33.5 = 29.5 + \left(\frac{115 - (20 + F_3)}{F_4}\right) 10 \]
From this equation we get
5F3 + 2F4 = 475 ........ eq 1

\[ M = L_m + \left(\frac{d_1}{d_1 + d_2}\right) w \]
Since the mode is 34.0, the modal class is also 30-39, so
L_m = 29.5, d_1 = F4 − F3, d_2 = F4 − F5
\[ 34 = 29.5 + \left(\frac{F_4 - F_3}{(F_4 - F_3) + (F_4 - F_5)}\right) 10 \]
From this equation we get
11F3 − 2F4 − 9F5 = 0 ........ eq 2
Using the total frequency we get
F3 + F4 + F5 = 200 ........ eq 3
Solving the three equations simultaneously gives
F3 = 55
F4 = 100
F5 = 45
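The three simultaneous equations can be solved mechanically. Below is a minimal sketch using exact fractions and Gauss-Jordan elimination; `solve3` is a hypothetical helper written for this example, not a library function.

```python
from fractions import Fraction


def solve3(a, b):
    """Solve a 3x3 linear system a.x = b by Gauss-Jordan elimination."""
    # Augmented matrix with exact rational entries
    m = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(a, b)]
    for col in range(3):
        # Find a row with a non-zero pivot and move it into place
        pivot = next(r for r in range(col, 3) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]  # scale pivot to 1
        for r in range(3):
            if r != col:  # clear the column in every other row
                m[r] = [x - m[r][col] * y for x, y in zip(m[r], m[col])]
    return [row[3] for row in m]


# eq 1: 5F3 + 2F4 = 475; eq 2: 11F3 - 2F4 - 9F5 = 0; eq 3: F3 + F4 + F5 = 200
f3, f4, f5 = solve3([[5, 2, 0], [11, -2, -9], [1, 1, 1]], [475, 0, 200])
print(f3, f4, f5)  # 55 100 45
```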

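b) Compute the arithmetic mean.

Using the class marks x_i = 4.5, 14.5, ..., 64.5:
\[ AM = \frac{\sum f_i x_i}{n} = \frac{4(4.5) + 16(14.5) + 55(24.5) + 100(34.5) + 45(44.5) + 6(54.5) + 4(64.5)}{230} = \frac{7635}{230} \approx 33.2 \]

c) Compute the value below which 25% of the observations lie.

\[ P_k = L_m + \left(\frac{kn/100 - cf_b}{f}\right) w \]
kn/100 = 230/4 = 57.5, so P_25 lies in the class 20-29, which contains the 57.5th item:
\[ P_{25} = 19.5 + \left(\frac{57.5 - 20}{55}\right) 10 = 26.32 \]
25% of the observations lie below 26.32.

d) Compute the value above which 25% of the observations lie.

This is the value below which 75% of the observations lie, i.e. P_75.
kn/100 = 230 × 3/4 = 172.5, so P_75 lies in the class 30-39, which contains the 172.5th item:
\[ P_{75} = 29.5 + \left(\frac{172.5 - 75}{100}\right) 10 = 39.25 \]
75% of the observations lie below 39.25, so 25% lie above it.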
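The mean and percentiles of this distribution can be cross-checked with a short Python sketch. It assumes the frequencies F3 = 55, F4 = 100, F5 = 45 found in part a; `grouped_percentile` is an illustrative name.

```python
def grouped_percentile(k, lower_bounds, freqs, width):
    """k-th percentile of grouped data: Lm + ((k*n/100 - cfb) / f) * w."""
    target = k * sum(freqs) / 100
    cum = 0  # cumulative frequency before the current class (cfb)
    for lower, f in zip(lower_bounds, freqs):
        if cum + f >= target:
            return lower + (target - cum) / f * width
        cum += f


lower = [-0.5 + 10 * i for i in range(7)]  # boundaries -0.5, 9.5, ..., 59.5
freq = [4, 16, 55, 100, 45, 6, 4]          # uses F3, F4, F5 from part a
marks = [b + 5 for b in lower]             # class marks 4.5, 14.5, ..., 64.5

mean = sum(f * x for f, x in zip(freq, marks)) / sum(freq)
p25 = grouped_percentile(25, lower, freq, 10)
p50 = grouped_percentile(50, lower, freq, 10)
p75 = grouped_percentile(75, lower, freq, 10)
print(round(mean, 2), round(p25, 2), round(p50, 2), round(p75, 2))
```

The 50th percentile reproduces the given median of 33.5, which is a useful consistency check on the recovered frequencies.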
15) In a certain test the pass mark is 30. The distribution of marks of passing candidates classified by sex is given below.

Mark         30 – 34   35 – 39   40 – 44   45 – 49   50 – 54   55 – 59   Total
Boys            5        10        15        30         5         5        70
Girls          15        30        30        20         4         1        90
Class mark     32        37        42        47        52        57

The overall (combined) mean mark for boys including the 20 failed was 39. While the overall
(combined) mean mark for girls including the 10 failed was 37.
a) Find the mean marks obtained by the 20 boys who failed in the test.

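The mean for the boys who passed is

\[ \bar{X}_{bp} = \frac{\sum f_{ib}\, cm_i}{70} = \frac{5(32) + 10(37) + 15(42) + 30(47) + 5(52) + 5(57)}{70} = 44.5 \]

The overall mean for all 90 boys is 39:
\[ \bar{X}_t = \frac{20\,\bar{X}_{bf} + 70\,\bar{X}_{bp}}{90} = 39 \]
\[ \bar{X}_{bf} = \frac{39(90) - 44.5(70)}{20} = \frac{395}{20} = 19.75 \]

b) Find the mean marks obtained by the 10 girls who failed in the test.
The mean for the girls who passed, taking the frequency of the 35 – 39 class as 20 so that the frequencies add up to the stated total of 90, is

\[ \bar{X}_{gp} = \frac{\sum f_{ig}\, cm_i}{90} = \frac{15(32) + 20(37) + 30(42) + 20(47) + 4(52) + 1(57)}{90} = \frac{3685}{90} \approx 40.94 \]

The overall mean for all 100 girls is 37:
\[ \bar{X}_t = \frac{10\,\bar{X}_{gf} + 90\,\bar{X}_{gp}}{100} = 37 \]
\[ \bar{X}_{gf} = \frac{37(100) - 3685}{10} = \frac{15}{10} = 1.5 \]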
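The combined-mean step can be sketched in Python for the boys' data. The helper name `missing_group_mean` is hypothetical; the same function applies to the girls once their frequencies are fixed.

```python
def missing_group_mean(overall_mean, n_known, mean_known, n_missing):
    """Mean of the unknown group, from the combined-mean relation.

    overall_mean * (n_known + n_missing)
        = n_known * mean_known + n_missing * mean_missing
    """
    total = overall_mean * (n_known + n_missing)
    return (total - n_known * mean_known) / n_missing


marks = [32, 37, 42, 47, 52, 57]   # class marks
boys = [5, 10, 15, 30, 5, 5]       # passing boys per class

mean_boys_pass = sum(f * x for f, x in zip(boys, marks)) / sum(boys)
mean_boys_fail = missing_group_mean(39, sum(boys), mean_boys_pass, 20)
print(mean_boys_pass, mean_boys_fail)  # 44.5 19.75
```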
17. Two sections were given an examination on a certain course. For section 1, the average mark (score) was 72 with a standard deviation of 6, and for section 2, the average mark (score) was 85 with a standard deviation of 7. If student A from section 1 scored 84 and student B from section 2 scored 90, then who performed better relative to the group?

      Section 1   Section 2
μ     72          85
σ     6           7
X     84          90

Let us find the standard score (Z) of each student and compare:

\[ Z = \frac{X - \mu}{\sigma} \]
\[ Z_1 = \frac{84 - 72}{6} = 2 \]
\[ Z_2 = \frac{90 - 85}{7} = 0.714 \]

Since Z_1 > Z_2, student A performed better than student B relative to the group.
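The standard-score comparison is a one-line computation. A minimal Python sketch follows; `z_score` is an illustrative helper name.

```python
def z_score(x, mean, sd):
    """Standard score: how many standard deviations x lies above the mean."""
    return (x - mean) / sd


z_a = z_score(84, 72, 6)  # student A, section 1
z_b = z_score(90, 85, 7)  # student B, section 2
print(z_a, round(z_b, 3))  # 2.0 0.714
```

Student B has the higher raw score, but student A is further above the section average in units of that section's standard deviation.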

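20. Four married couples have bought 8 seats in a row for a show. In how many different ways can they be seated
a. if each couple sits together?

Treat each couple (C1, C2, C3, C4) as one unit: the 4 units can be arranged in 4! ways, and the two members of each couple in 2! ways.

= 4! × 2! × 2! × 2! × 2! = (4 × 3 × 2) × 2 × 2 × 2 × 2 = 384

b. if all women sit together?

Treat the 4 women as one unit alongside the 4 men (M1, M2, M3, M4): the 5 units can be arranged in 5! ways, and the 4 women among themselves in 4! ways.

= 5! × 4! = (5 × 4 × 3 × 2) × (4 × 3 × 2) = 2880

c. if all women sit together to the right of all the men?

The 4 men take the left four seats in 4! ways and the 4 women take the right four seats in 4! ways.

= 4! × 4! = (4 × 3 × 2) × (4 × 3 × 2) = 576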
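These counts can be verified by brute force, since there are only 8! = 40,320 seatings. The sketch below labels the people M1..M4 and W1..W4, with Mi and Wi assumed to form couple i; the labels and function names are illustrative.

```python
from itertools import permutations

people = ["M1", "W1", "M2", "W2", "M3", "W3", "M4", "W4"]


def couples_together(row):
    """True if every couple occupies two adjacent seats."""
    pos = {p: i for i, p in enumerate(row)}
    return all(abs(pos[f"M{k}"] - pos[f"W{k}"]) == 1 for k in range(1, 5))


def women_together(row):
    """True if the four women occupy four consecutive seats."""
    idx = [i for i, p in enumerate(row) if p.startswith("W")]
    return idx[-1] - idx[0] == 3


def women_right_of_men(row):
    """True if the four left-most seats are all taken by men."""
    return all(p.startswith("M") for p in row[:4])


counts = [0, 0, 0]
for row in permutations(people):
    counts[0] += couples_together(row)
    counts[1] += women_together(row)
    counts[2] += women_right_of_men(row)
print(counts)  # [384, 2880, 576]
```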

23. One urn contains three red balls, two white balls, and one blue ball. A second urn contains one red ball, two white balls and three blue balls.
a) One ball is selected at random from each urn.
i) Describe a sample space for this experiment. ii) Find the probability that both balls will be of the same color. iii) Find the probability that at least one of the balls is red.
b) The balls in the two urns are mixed together in a single urn, and then a sample of three is drawn. Find the probability that all the three colors are represented, when: i) sampling with replacement ii) sampling without replacement

        Urn 1   Urn 2
Red     3       1
White   2       2
Blue    1       3

a) One ball is selected at random from each urn.

i) A sample space for this experiment, writing the color from urn 1 first, is

S = {RR, RW, RB, WR, WW, WB, BR, BW, BB}

ii) The probability that both balls will be of the same color is

P(RR) + P(WW) + P(BB) = (3/6)(1/6) + (2/6)(2/6) + (1/6)(3/6) = 10/36 = 5/18

iii) The probability that at least one of the balls is red is

1 − P(no red) = 1 − (3/6)(5/6) = 21/36 = 7/12

b) The balls in the two urns are mixed together in a single urn (4 red, 4 white, 4 blue), and then a sample of three is drawn.

i) When sampling with replacement the probability is

3! × (4/12)(4/12)(4/12) = 6/27 = 2/9

ii) When sampling without replacement the probability is

(4 × 4 × 4) / C(12, 3) = 64/220 = 16/55
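The urn probabilities can be cross-checked by enumerating equally likely outcomes. This is a sketch under the assumption that every ball is equally likely to be drawn; each urn is written as a string with one letter per ball.

```python
from fractions import Fraction
from itertools import combinations, product

urn1 = "RRRWWB"  # three red, two white, one blue
urn2 = "RWWBBB"  # one red, two white, three blue

# Part a: one ball from each urn gives 36 equally likely (ball, ball) pairs
pairs = list(product(urn1, urn2))
p_same = Fraction(sum(a == b for a, b in pairs), len(pairs))
p_red = Fraction(sum("R" in p for p in pairs), len(pairs))

# Part b: all 12 balls mixed in a single urn, three drawn
mixed = urn1 + urn2
colors = set("RWB")
with_repl = list(product(mixed, repeat=3))       # ordered draws, replaced
p_with = Fraction(sum(set(d) == colors for d in with_repl), len(with_repl))
without = list(combinations(range(12), 3))       # unordered, no replacement
p_without = Fraction(
    sum({mixed[i] for i in d} == colors for d in without), len(without)
)

print(p_same, p_red, p_with, p_without)  # 5/18 7/12 2/9 16/55
```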

24. Consider four objects, say a, b, c and d. Suppose that the order in which these objects are listed represents the outcome of an experiment. Let the events A and B be defined as follows:
A = {a is in the first position}; B = {b is in the second position}.
a. List all elements of the sample space.

S = {abcd, abdc, acbd, acdb, adbc, adcb, bacd, badc, bcad, bcda, bdca, bdac, cabd, cadb, cdab, cdba, cbda, cbad, dabc, dacb, dbac, dbca, dcab, dcba}
b. List all elements of the events A ∩ B and A ∪ B.

A ∩ B = {abcd, abdc}

A ∪ B = {abcd, abdc, acbd, acdb, adbc, adcb, cbda, cbad, dbac, dbca}
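The sample space and both events can be generated rather than listed by hand. A short Python sketch using the standard library follows; the variable names `S`, `A` and `B` mirror the notation of the question.

```python
from itertools import permutations

# Every ordering of the four objects is one outcome
S = ["".join(p) for p in permutations("abcd")]
A = {s for s in S if s[0] == "a"}  # a is in the first position
B = {s for s in S if s[1] == "b"}  # b is in the second position

print(len(S))         # 24
print(sorted(A & B))  # ['abcd', 'abdc']
print(sorted(A | B))
```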

25. A lot consists of 10 good articles, 4 with minor defects and 2 with major defects.
i. One article is chosen at random. Find the probability that
a. It has no defects.
Let g be the number of good articles (no defects), m the number with minor defects, M the number with major defects, and Ta the total number of articles: g = 10, m = 4, M = 2, Ta = 16.

This is the probability of a good article:

P(g) = g/Ta = 10/16 = 5/8

b. It has no major defects.

P(M^c) = 1 − P(M)
= 1 − 2/16
= 14/16 = 7/8
c. It is either good or has major defects.
P(g or M) = P(g) + P(M) = 10/16 + 2/16 = 12/16 = 3/4

ii. Two articles are chosen (without replacement). Find the probability that
a. Both are good

b. Both have major defects

c. At least one is good

d. At most one is good

e. Exactly one is good

f. Neither has major defects
To find this, use 1 − P(both have major defects) − P(exactly one has a major defect).
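The two-article probabilities in part ii can be computed by counting unordered pairs. The sketch below assumes every pair of the 16 articles is equally likely; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

good, minor, major = 10, 4, 2
not_good = minor + major
pairs = comb(good + minor + major, 2)  # 120 unordered pairs of articles

both_good = Fraction(comb(good, 2), pairs)
both_major = Fraction(comb(major, 2), pairs)
at_least_one_good = 1 - Fraction(comb(not_good, 2), pairs)
at_most_one_good = 1 - both_good
exactly_one_good = Fraction(good * not_good, pairs)

# Part f, following the approach stated in the worksheet
exactly_one_major = Fraction(major * (good + minor), pairs)
neither_major = 1 - both_major - exactly_one_major

print(both_good, both_major, at_least_one_good,
      at_most_one_good, exactly_one_good, neither_major)
# 3/8 1/120 7/8 5/8 1/2 91/120
```

Part f can also be counted directly as C(14, 2)/C(16, 2), which gives the same 91/120.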

28. Four horses (A, B, C, and D) have raced many times. It is estimated that A wins 30 percent
of the time, B 40 percent of the time, C 20 percent of the time, and D 10 percent of the time.
The game allows only one of the horses to be a winner in any race. If these horses will
compete in a race,
a) What is the probability that A will win?

P(A) = 0.3

b) What is the probability that A or B will win?

P(A ∪ B) = P(A) + P(B) = 0.3 + 0.4 = 0.7

c) What is the probability that A or B but not both will win?

P((A ∪ B) ∩ (A ∩ B)^c) = P(A ∪ B) − P(A ∩ B) = 0.7 − 0 = 0.7

d) What is the probability that neither A nor B will win?

P((A ∪ B)^c) = 1 − 0.7 = 0.3
e) What is the probability that A or B or C will win?

P(A ∪ B ∪ C) = 0.3 + 0.4 + 0.2 = 0.9

f) Are A and B independent?

No. Only one horse can win, so A and B are mutually exclusive: P(A ∩ B) = 0, whereas independence would require P(A ∩ B) = P(A) × P(B) = 0.3 × 0.4 = 0.12.

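29. One bag contains 4 white balls and 3 black balls, and the second bag contains 3 white balls and 5 black balls. One ball is drawn at random from the second bag and placed in the first bag. What is the probability that a ball now drawn from the first bag is white?
Ans:
Let W2 and B2 be the events that the ball moved from the second bag is white or black, and W1 the event that the ball then drawn from the first bag is white.
P(W2) = 3/8
P(B2) = 5/8
After the transfer the first bag holds 8 balls, so P(W1/W2) = 5/8 and P(W1/B2) = 4/8.
Using the total probability theorem:
P(W1) = P(W2)*P(W1/W2) + P(B2)*P(W1/B2)
= 3/8*5/8 + 5/8*4/8
= 15/64 + 20/64
= 35/64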
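The total-probability calculation can be written out with exact fractions. This is a small Python sketch; the variable names are illustrative and follow the W1/W2/B2 notation.

```python
from fractions import Fraction

# Color of the ball moved from the second bag (3 white, 5 black)
p_w2 = Fraction(3, 8)
p_b2 = Fraction(5, 8)

# First bag starts with 4 white and 3 black, and holds 8 balls after the move
p_w1_given_w2 = Fraction(4 + 1, 8)
p_w1_given_b2 = Fraction(4, 8)

p_w1 = p_w2 * p_w1_given_w2 + p_b2 * p_w1_given_b2
print(p_w1)  # 35/64
```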

30. A factory has two machines M1 and M2 making 60% and 40% respectively of the total production. Machine M1 produces 3% defective items, and M2 produces 5% defective items. Find the probability that a given defective part came from M1.

Let A be the set of all defective items.

P(M1) = 0.6
P(M2) = 0.4
P(A/M1) = 0.03
P(A/M2) = 0.05
By the total probability theorem:
P(A) = P(M1)*P(A/M1) + P(M2)*P(A/M2)
= 0.6*0.03 + 0.4*0.05
= 0.038
By Bayes' theorem:
P(M1/A) = P(A/M1)*P(M1) / P(A)
= (0.03*0.6) / 0.038
≈ 0.47
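Bayes' theorem for the two machines can be sketched in Python with exact fractions. The function `posterior` is a hypothetical helper written for this example.

```python
from fractions import Fraction


def posterior(priors, likelihoods, index):
    """P(H_index | evidence) by Bayes' theorem over a partition of hypotheses."""
    evidence = sum(p * l for p, l in zip(priors, likelihoods))  # total probability
    return priors[index] * likelihoods[index] / evidence


priors = [Fraction(60, 100), Fraction(40, 100)]  # P(M1), P(M2)
defect = [Fraction(3, 100), Fraction(5, 100)]    # P(A/M1), P(A/M2)

p_m1 = posterior(priors, defect, 0)
print(p_m1, round(float(p_m1), 2))  # 9/19 0.47
```

Although M1 makes 60% of the items, it accounts for slightly less than half of the defectives because its defect rate is lower than that of M2.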
