
Introduction to statistics

CHAPTER - FIVE
ELEMENTARY PROBABILITY
Probability is the chance that a particular outcome of an experiment will occur.
It is a measure of how likely an outcome is to occur.
5.1. Definitions of Basic Probability Terms
• Experiment: any process that generates well-defined outcomes.
• Outcome: the result of a single trial of an experiment.
• Sample space (S): the set of all possible outcomes of an experiment.
• Event: a subset of the sample space. Remark: if S (the sample space) has n members, then there are exactly $2^n$ subsets, or events.
• Equally likely events: events that have the same chance of occurrence.
• Complement of an event: the complement of an event A means the non-occurrence of A. It is denoted by $A^c$ and contains those points of the sample space that do not belong to A.
• Elementary event: an event having only a single element (sample point).
• Mutually exclusive (ME) events: two events that cannot occur simultaneously (that is, cannot happen at the same time).
• Independent events: two events are independent if the occurrence of one does not affect the occurrence or non-occurrence of the other. Otherwise, they are dependent events.
Example: What is the sample space for each of the following experiments?
1. Toss a die once.
2. Toss a die twice.
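The two sample spaces asked for in this example can be listed mechanically. The sketch below is only an illustration, assuming a standard six-sided die; the variable names are chosen here and are not part of the notes.

```python
from itertools import product

# Faces of a single six-sided die (assumed standard die).
faces = range(1, 7)

# 1. One toss: the sample space is just the six faces.
sample_space_one = list(faces)

# 2. Two tosses: every ordered pair (first toss, second toss).
sample_space_two = list(product(faces, repeat=2))

print(sample_space_one)        # the six single-toss outcomes
print(len(sample_space_two))   # number of ordered pairs, 6 * 6
```

Each element of the second list is an ordered pair such as (3, 5), so (3, 5) and (5, 3) are counted as different outcomes.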
5.2. Counting Rules
In order to calculate probabilities, we have to know the number of elements of an event and the number of elements of the sample space. To determine the number of outcomes, one can use the following rules of counting:
• Addition rule
• Multiplication rule
• Permutation and combination rules
Example: A student goes to the nearest snack bar to have breakfast. He can take tea, coffee, or milk with bread, cake, or a sandwich. How many possibilities does he have?
Solution:
Tea with bread, cake, or sandwich
Coffee with bread, cake, or sandwich
Milk with bread, cake, or sandwich
There are 3*3 = 9 possibilities.
The Multiplication rule:
If a choice consists of k steps, of which the 1st can be made in $n_1$ ways, the 2nd in $n_2$ ways, …, and the kth in $n_k$ ways, then the whole choice can be made in $n_1 \times n_2 \times \cdots \times n_k$ ways.
Example: The digits 0, 1, 2, 3 and 4 are to be used in a 4-digit identification card. How many different cards are possible if:
a. repetitions are permitted?
b. repetitions are not allowed?
Solution: a) There are 4 steps:
Selecting the 1st digit, which can be done in 5 ways.
Selecting the 2nd digit, which can be done in 5 ways.
Selecting the 3rd digit, which can be done in 5 ways.
Selecting the 4th digit, which can be done in 5 ways.
⇒ 5*5*5*5 = 625 different cards are possible.

b) There are 4 steps:
Selecting the 1st digit, which can be done in 5 ways.
Selecting the 2nd digit, which can be done in 4 ways.
Selecting the 3rd digit, which can be done in 3 ways.
Selecting the 4th digit, which can be done in 2 ways.
⇒ 5*4*3*2 = 120 different cards are possible.

Permutation: an arrangement of n objects in a specified order. The number of permutations of n distinct objects taken all together is n!, where
$n! = n(n-1)(n-2)\cdots 3 \cdot 2 \cdot 1$
• The arrangement of n objects in a specified order using r objects at a time is called a permutation of n objects taken r at a time.
The formula is given as:
$_nP_r = \dfrac{n!}{(n-r)!}$

The number of permutations of n objects in which $k_1$ are alike, $k_2$ are alike, etc.
In all the enumeration methods introduced so far, the objects under consideration are assumed to be different. However, there are times when some of the objects are alike. That is, out of n objects, $k_1$ are of the first kind, $k_2$ are of the second kind, …, and $k_m$ are of the mth kind, where $k_1 + k_2 + \dots + k_m = n$. Then the number of permutations of these objects is given by:
$\dfrac{n!}{k_1!\,k_2!\cdots k_m!}$

Example 1:
1. Suppose we have the letters A, B, C, D. How many permutations are there taking all four?
2. How many permutations are there taking two letters at a time?
3. How many different permutations can be made from the letters in the word “CORRECTION”?
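The three questions above are left without worked answers; one way to check an answer is the sketch below. It assumes Python 3.8+ (for `math.perm`), and the helper `distinct_arrangements` is a name introduced here for illustration.

```python
from collections import Counter
from math import factorial, perm

# 1. All four letters A, B, C, D taken together: 4!
all_four = factorial(4)

# 2. Two letters at a time: 4P2 = 4! / (4 - 2)!
two_at_a_time = perm(4, 2)

# 3. Arrangements of a word with repeated letters: n! / (k1! * k2! * ...)
def distinct_arrangements(word):
    total = factorial(len(word))
    for count in Counter(word).values():
        total //= factorial(count)   # divide out each group of alike letters
    return total

correction = distinct_arrangements("CORRECTION")
print(all_four, two_at_a_time, correction)
```

In “CORRECTION” the letters C, O and R each appear twice, so the count is 10!/(2!·2!·2!).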

Exercise 2:
Six different statistics books, seven different physics books, and three different economics books are arranged on a shelf. How many different arrangements are possible if:
a) there is no restriction?
b) the books in each particular subject must all stand together?
c) only the statistics books must stand together?

Combination: the selection of objects without regard to order.

Example: Given the letters A, B, C, and D, list the permutations and combinations for selecting two letters.
Solution:
Permutations            Combinations
AB  BA  CA  DA          AB  BC
AC  BC  CB  DB          AC  BD
AD  BD  CD  DC          AD  DC
Note: In a permutation AB is different from BA, but in a combination AB is the same as BA, because a combination pays no attention to order.
Combination rule:
The number of combinations of r objects selected from n objects is given as follows:
$_nC_r = \dfrac{n!}{(n-r)!\,r!}$

Example 1: In how many ways can a committee of 5 people be chosen out of 9 people?

Solution:
n = 9, r = 5 ⇒ $_nC_r = \dfrac{n!}{(n-r)!\,r!} = \dfrac{9!}{4!\,5!} = 126$ ways.
Example 2: Among 15 clocks there are two defectives. In how many ways can an inspector choose three of the clocks for inspection so that:
a) there is no restriction?
b) none of the defective clocks is included?
c) only one of the defective clocks is included?
d) both of the defective clocks are included?
Solution: n = 15, of which 2 are defective and 13 are non-defective; r = 3.
a) If there is no restriction, select three clocks from the 15 clocks. This can be done in
$_{15}C_3 = \dfrac{15!}{12!\,3!} = 455$ ways.
b) If none of the defective clocks is included: this is equivalent to zero defective and three non-defective, which can be done in $_2C_0 \times {}_{13}C_3 = 286$ ways.
c) If only one of the defective clocks is included: this is equivalent to one defective and two non-defective, which can be done in $_2C_1 \times {}_{13}C_2 = 156$ ways.
d) If both of the defective clocks are included: this is equivalent to two defective and one non-defective, which can be done in $_2C_2 \times {}_{13}C_1 = 13$ ways.
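The four counts in the clock example can be reproduced with `math.comb` (Python 3.8+). The helper name `ways` is an assumption made for this sketch.

```python
from math import comb

defective, good, chosen = 2, 13, 3

# (a) No restriction: choose any 3 of the 15 clocks.
no_restriction = comb(defective + good, chosen)

def ways(k):
    """Ways to pick exactly k defective clocks and (chosen - k) good ones."""
    return comb(defective, k) * comb(good, chosen - k)

print(no_restriction, ways(0), ways(1), ways(2))
```

The cases k = 0, 1, 2 are mutually exclusive and cover every selection, so their counts add up to the unrestricted total.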
Exercise 1: Out of 5 mathematicians and 7 statisticians, a committee consisting of 2 mathematicians and 3 statisticians is to be formed. In how many ways can this be done if:
a) there is no restriction?
b) one particular statistician should be included?
c) two particular mathematicians cannot be included on the committee?

Exercise 2: If 3 books are picked at random from a shelf containing 5 novels, 3 books of poems, and a dictionary, in how many ways can this be done if:
a) there is no restriction?
b) the dictionary is selected?
c) 2 novels and 1 book of poems are selected?
The classical approach of measuring probability:
Definition: If a random experiment with N equally likely outcomes is conducted, and out of these $N_A$ outcomes are favorable to the event A, then the probability that event A occurs, denoted P(A), is defined as:
$P(A) = \dfrac{N_A}{N} = \dfrac{\text{no. of outcomes favorable to } A}{\text{total no. of outcomes}} = \dfrac{n(A)}{n(S)}$
Example: A fair die is tossed once. What is the probability of getting:
a) the number 4?
b) an odd number?
c) an even number?
d) the number 8?
Solution: First identify the sample space, say S = {1, 2, 3, 4, 5, 6} ⇒ N = n(S) = 6
a) Let A be the event of getting the number 4: A = {4} ⇒ $N_A = n(A) = 1$ ⇒ $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{1}{6}$
b) Let A be the event of getting an odd number: A = {1, 3, 5} ⇒ $N_A = n(A) = 3$ ⇒ $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{3}{6} = 0.5$
c) Let A be the event of getting an even number: A = {2, 4, 6} ⇒ $N_A = n(A) = 3$ ⇒ $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{3}{6} = 0.5$
d) Let A be the event of getting the number 8: A = Ø ⇒ $N_A = n(A) = 0$ ⇒ $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{0}{6} = 0$
Example: A box of 80 candles consists of 30 defective and 50 non-defective candles. If 10 of these candles are selected at random, what is the probability that:
a) all will be defective?
b) 6 will be non-defective?
c) all will be non-defective?
Solution: Total number of selections: $N = n(S) = {}_{80}C_{10}$
a) Let A be the event that all will be defective. The number of ways in which A can occur is $N_A = n(A) = {}_{30}C_{10} \times {}_{50}C_{0}$
⇒ $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{{}_{30}C_{10} \times {}_{50}C_{0}}{{}_{80}C_{10}} = 0.00001825$
b) Let A be the event that 6 will be non-defective. The number of ways in which A can occur is $N_A = n(A) = {}_{30}C_{4} \times {}_{50}C_{6}$
⇒ $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{{}_{30}C_{4} \times {}_{50}C_{6}}{{}_{80}C_{10}} = 0.265$
c) Let A be the event that all will be non-defective. The number of ways in which A can occur is $N_A = n(A) = {}_{30}C_{0} \times {}_{50}C_{10}$
⇒ $P(A) = \dfrac{n(A)}{n(S)} = \dfrac{{}_{30}C_{0} \times {}_{50}C_{10}}{{}_{80}C_{10}} = 0.00624$
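All three parts of the candle example follow the same pattern, so they can be checked with one small function. This is a sketch using `math.comb`; the function name and its default arguments are assumptions introduced here.

```python
from math import comb

def prob_non_defective(k, defective=30, good=50, drawn=10):
    """P(exactly k non-defective among `drawn` candles), by counting
    equally likely selections: n(A) / n(S)."""
    favourable = comb(good, k) * comb(defective, drawn - k)
    total = comb(defective + good, drawn)
    return favourable / total

p_all_defective = prob_non_defective(0)    # part (a)
p_six_good = prob_non_defective(6)         # part (b)
p_all_good = prob_non_defective(10)        # part (c)
print(p_all_defective, p_six_good, p_all_good)
```

Summing the function over k = 0, …, 10 gives 1, which is a quick sanity check that the counting covers the whole sample space.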
Exercise 1: What is the probability that a waitress will refuse to serve alcoholic beverages to only three minors if she randomly checks the IDs of five students from among ten students, of whom four are not of legal age?

Exercise 2: If 3 books are picked at random from a shelf containing 5 novels, 3 books of poems, and a dictionary, what is the probability that:
a) the dictionary is selected?
b) 2 novels and 1 book of poems are selected?

Basic Rules of Probabilities:

Let E be a random experiment, S the sample space associated with E, and A an event. Then:
a) P(A) ≥ 0.
b) P(S) = 1. This means that S is the sure event.
c) For any two events A and B, the probability that either A or B occurs is given by:
P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
d) Remark: if A and B are mutually exclusive, then P(A ∩ B) = 0, so
P(A ∪ B) = P(A) + P(B)
e) $P(A^c) = 1 - P(A)$
f) 0 ≤ P(A) ≤ 1
g) P(Ø) = 0. That is, Ø is the impossible event.

Conditional probability:
The conditional probability of an event A given that B has already occurred, denoted P(A/B), is
$P(A/B) = \dfrac{P(A \cap B)}{P(B)}, \quad P(B) \neq 0$

Remark: $P(A^c/B) = 1 - P(A/B)$ and $P(B^c/A) = 1 - P(B/A)$


Ex.1. For a student enrolling as a freshman at a certain university, the probability is 0.25 that he/she will get a scholarship and 0.75 that he/she will graduate. The probability is 0.2 that he/she will get a scholarship and will also graduate. What is the probability that a student who gets a scholarship will graduate?

Solution: Let A = the event that a student will get a scholarship and B = the event that a student will graduate.
Given: P(A) = 0.25, P(B) = 0.75, P(A ∩ B) = 0.2
Required: P(B/A) = ?
$P(B/A) = \dfrac{P(A \cap B)}{P(A)} = \dfrac{0.2}{0.25} = 0.8$

Ex.2. If the probability that a research project will be well planned is 0.6 and the probability that it will be well planned and well executed is 0.54, what is the probability that it will be well executed given that it is well planned?

Solution: Let A = the event that a research project will be well planned and B = the event that a research project will be well executed.

Given: P(A) = 0.6, P(A ∩ B) = 0.54

Required: P(B/A) = ?

$P(B/A) = \dfrac{P(A \cap B)}{P(A)} = \dfrac{0.54}{0.6} = 0.9$
Exercise: A box consists of 20 defective and 80 non-defective items, from which two items are selected one after the other without replacement.
a) What is the probability that the first is defective and the second is non-defective?
b) What is the probability that both items are defective?
Probability of independent events:
Two events A and B are said to be independent if and only if P(A ∩ B) = P(A)*P(B). That is,
P(A/B) = P(A), or P(B/A) = P(B).
Ex. A box contains four black and six white balls. What is the probability of getting two black balls when drawing one after the other under the following conditions?
a) The first ball drawn is not replaced.
b) The first ball drawn is replaced.

Solution: Let A = the first ball drawn is black and B = the second ball drawn is black.
Required: P(A ∩ B) = ?

a) P(A ∩ B) = P(A)*P(B/A) = (4/10)(3/9) = 2/15

b) P(A ∩ B) = P(A)*P(B) = (4/10)(4/10) = 4/25
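The difference between the two cases is only whether the second draw sees a changed box. A short sketch with exact fractions (variable names are chosen here for illustration) makes that visible:

```python
from fractions import Fraction

black, white = 4, 6
total = black + white

# (a) Without replacement: the second draw depends on the first,
#     so P(A and B) = P(A) * P(B/A).
p_without = Fraction(black, total) * Fraction(black - 1, total - 1)

# (b) With replacement: the draws are independent,
#     so P(A and B) = P(A) * P(B).
p_with = Fraction(black, total) * Fraction(black, total)

print(p_without, p_with)
```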

CHAPTER - SIX
RANDOM VARIABLE AND PROBABILITY DISTRIBUTIONS

For the definition of a random variable, refer to Chapter One.
Random variables are of two types: discrete and continuous.
Discrete random variables: variables that can assume only a specific number of values. Their values can be counted.
Examples: the number of heads when tossing a coin n times; the number of children in a family; the number of car accidents per week; the number of defective items produced by a given company.

Continuous random variables: variables that can assume all values between any two given values.
Examples: the height of students in a certain college; the mark of a student; the lifetime of light bulbs; the length of time required to complete a given training.
A probability distribution consists of the values a random variable can assume and the corresponding probabilities of those values.
Example: Consider the experiment of tossing a coin three times. Let X be the number of heads. Construct the probability distribution of X.
Solution: First identify the possible values that X can assume. Then calculate the probability of each possible distinct value of X and express the result in the form of a frequency distribution.
X=x 0 1 2 3
P(X=x) 1/8 3/8 3/8 1/8
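The table above can be rebuilt by listing all eight equally likely outcomes and counting heads. The following sketch assumes a fair coin; the names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2*2*2 = 8 equally likely outcomes of three tosses.
outcomes = list(product("HT", repeat=3))

# Count how many outcomes give each number of heads.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# Classical probability: favourable outcomes / total outcomes.
distribution = {x: Fraction(count, len(outcomes))
                for x, count in sorted(head_counts.items())}

for x, p in distribution.items():
    print(x, p)
```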

A probability distribution is denoted by P for a discrete variable and by f for a continuous variable.

Properties of a probability distribution:
P(x) ≥ 0 if X is discrete; f(x) ≥ 0 if X is continuous.
$\sum_x P(X = x) = 1$ if X is discrete; $\int f(x)\,dx = 1$ if X is continuous.

Common Discrete Probability Distributions:


Binomial Distribution: A binomial experiment is a probability experiment that satisfies the following four requirements:
a) The experiment consists of n identical trials.
b) Each trial has only one of two possible mutually exclusive outcomes, success or failure.
c) The probability of each outcome does not change from trial to trial.
d) The trials are independent; thus we must sample with replacement.
Examples of binomial experiments:
• Tossing a coin 20 times to see how many tails occur.
• Asking 200 people if they watch BBC news.
• Registering a newly produced product as defective or non-defective.
Notation:
X ~ Bin(n, p)

Definition: Let p = the probability of success and q = 1 - p = the probability of failure. Then the probability of getting x successes in n trials is:
$P(X = x) = {}_nC_x\, p^x q^{n-x}, \quad x = 0, 1, 2, \dots, n$

Ex.1. What is the probability of getting three heads when tossing a fair coin four times?
Solution: Let X be the number of heads in four tosses of a fair coin. X ~ Bin(n = 4, p = 0.5)
$P(X = x) = {}_nC_x\, p^x q^{n-x}, \; x = 0, 1, 2, 3, 4$ ⇒ $P(X = 3) = {}_4C_3 (0.5)^3 (0.5)^1 = 0.25$
Ex.2. Suppose that an examination consists of six true-or-false questions, and assume that a student has no knowledge of the subject matter. The probability that the student will guess the correct answer to the first question is 30%. Likewise, the probability of guessing each of the remaining questions correctly is also 30%.
a) What is the probability of getting more than three correct answers?
b) What is the probability of getting at least two correct answers?
c) What is the probability of getting at most three correct answers?
d) What is the probability of getting fewer than five correct answers?
Solution: Let X be the number of correct answers that the student gets.
X ~ Bin(n = 6, p = 0.3)
$P(X = x) = {}_6C_x (0.3)^x (0.7)^{6-x}, \quad x = 0, 1, 2, 3, 4, 5, 6$
a) P(X > 3) = P(X=4) + P(X=5) + P(X=6) = 0.060 + 0.010 + 0.001 = 0.071
Thus, we may conclude that if each question is answered by guessing with a 30% chance of being correct, the probability is about 0.071 or 7.1% that more than three of the questions are answered correctly by the student.
b) P(X ≥ 2) = P(X=2) + P(X=3) + P(X=4) + P(X=5) + P(X=6) = 0.324 + 0.185 + 0.060 + 0.010 + 0.001 = 0.58
c) P(X ≤ 3) = P(X=0) + P(X=1) + P(X=2) + P(X=3) = 0.118 + 0.303 + 0.324 + 0.185 = 0.93
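The four probabilities asked for in Ex.2, including part (d), for which no worked solution is given, can be computed from the binomial formula directly. This is a minimal sketch; `binom_pmf` is a helper name introduced here, not a library function.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p): nCx * p^x * q^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 6, 0.3
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

more_than_three = sum(pmf[4:])   # (a) P(X > 3)
at_least_two = sum(pmf[2:])      # (b) P(X >= 2)
at_most_three = sum(pmf[:4])     # (c) P(X <= 3)
less_than_five = sum(pmf[:5])    # (d) P(X < 5)

print(round(more_than_three, 4), round(at_least_two, 4),
      round(at_most_three, 4), round(less_than_five, 4))
```

Because the code adds unrounded terms, part (a) comes out near 0.0705, slightly below the 0.071 obtained by adding terms that were each rounded to three decimals.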
Exercise 1: Suppose that 4% of all TVs made by A & B Company in 2002 are defective. If eight of these TVs are randomly selected from across the country and tested, what is the probability that exactly three of them are defective? Assume that each TV is made independently of the others.
Exercise 2: An allergist claims that 45% of the patients she tests are allergic to some type of weed. What is the probability that:
a) exactly three of her next 4 patients are allergic to weeds?
b) none of her next 4 patients are allergic to weeds?
c) Explain why such an experiment may not be binomial.
Remark: If X is a binomial random variable with parameters n and p, then:
E(X) = np and Var(X) = npq

Poisson distribution:
A random variable X is said to have a Poisson distribution if its probability distribution is given by
$P(X = x) = \dfrac{\lambda^x e^{-\lambda}}{x!}, \quad x = 0, 1, 2, \dots$, where λ = the average number of occurrences.
Notation: X ~ Po(λ)
The Poisson distribution is used as a distribution of rare events, such as the number of misprints, natural disasters like earthquakes, accidents, etc.
A process that gives rise to such events is called a Poisson process.
Example: If 1.6 accidents can be expected at an intersection on any given day, what is the probability that there will be 3 accidents on any given day?
Solution: Let X = the number of accidents; X ~ Po(λ), λ = 1.6.
$P(X = x) = \dfrac{1.6^x e^{-1.6}}{x!}$ ⇒ $P(X = 3) = \dfrac{1.6^3 e^{-1.6}}{3!} = 0.1378$
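The accident example can be evaluated numerically with the Poisson formula. The sketch below uses only the standard library; `poisson_pmf` is a helper name assumed for this illustration.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam): lam^x * e^(-lam) / x!"""
    return lam**x * exp(-lam) / factorial(x)

p_three_accidents = poisson_pmf(3, 1.6)
print(round(p_three_accidents, 4))
```

The same function can be summed over x = 0, …, 6 to approach "6 or fewer" style questions such as the exercise that follows.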

Exercise: On average, five smokers pass a certain street corner every ten minutes. What is the probability that during a given 10 minutes the number of smokers passing will be a) 6 or fewer, b) 7 or more, c) exactly 8?

Remark: If X ~ Po(λ), then E(X) = λ.

Common Continuous Probability Distributions

• Normal distribution
Properties of the normal distribution:

1. It is bell shaped.
2. It is symmetrical about its mean and it is mesokurtic. Mean = median = mode = µ.
3. It is asymptotic to the x-axis.
4. It is a continuous distribution.
5. The total area under the curve sums to 1.
6. It is unimodal, i.e. values mound up only in the center of the curve.
7. The mean µ and the variance σ² are the two parameters of a normal distribution.

Notation: X ~ N(µ, σ²)


Standardization:
To facilitate the use of the normal distribution, the following distribution, known as the standard normal distribution, is derived by using the transformation:
$Z = \dfrac{X - \mu}{\sigma}$

Properties of the standard normal (Z) distribution: they are the same as those of a normal distribution, but in addition the mean is zero and the variance is one.

• Given a normally distributed random variable X with mean µ and standard deviation σ:
$P(a < X < b) = P\left(\dfrac{a-\mu}{\sigma} < \dfrac{X-\mu}{\sigma} < \dfrac{b-\mu}{\sigma}\right) = P\left(\dfrac{a-\mu}{\sigma} < Z < \dfrac{b-\mu}{\sigma}\right)$

Note: P(a < X < b) = P(a ≤ X < b) = P(a < X ≤ b) = P(a ≤ X ≤ b)

Examples: Find the area under the standard normal distribution which lies:

1. Between Z = 0 and Z = 0.96
Solution:
⇒ area = P(0 < Z < 0.96) = 0.3315
2. Between Z = -1.45 and Z = 0
Solution:
⇒ area = P(-1.45 < Z < 0) = P(0 < Z < 1.45) = 0.4265
3. To the right of Z = -0.35
Solution:
⇒ area = P(Z > -0.35) = P(-0.35 < Z < 0) + P(Z > 0)
= P(0 < Z < 0.35) + P(Z > 0) = 0.1368 + 0.50 = 0.6368
4. To the left of Z = -0.35
Solution:
⇒ area = P(Z < -0.35) = 1 - P(Z > -0.35)
= 1 - 0.6368 = 0.3632
5. Between Z = -0.67 and Z = 0.75
Solution:
⇒ area = P(-0.67 < Z < 0.75)
= P(-0.67 < Z < 0) + P(0 < Z < 0.75)
= P(0 < Z < 0.67) + P(0 < Z < 0.75)
= 0.2486 + 0.2734 = 0.5220
6. Between Z = 0.25 and Z = 1.25
Solution:
⇒ area = P(0.25 < Z < 1.25) = P(0 < Z < 1.25) - P(0 < Z < 0.25)
= 0.3944 - 0.0987 = 0.2957
7. Find the value of Z if the normal curve area between 0 and Z (positive) is 0.4726.
Solution:
⇒ P(0 < Z < z) = 0.4726, and from the table P(0 < Z < 1.92) = 0.4726, so z = 1.92
8. Find the value of Z if the area to the left of Z is 0.9868.
Solution:
⇒ P(Z < z) = 0.9868 = P(Z < 0) + P(0 < Z < z) = 0.50 + P(0 < Z < z)
⇒ P(0 < Z < z) = 0.9868 - 0.50 = 0.4868, and from the table P(0 < Z < 2.22) = 0.4868, so z = 2.22
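The table look-ups in examples 1 to 8 can be cross-checked without a printed table. The sketch below assumes Python 3.8+, whose standard library provides `statistics.NormalDist`; the helper `area_between` is a name introduced here.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

def area_between(a, b):
    """Area under the standard normal curve between a and b."""
    return Z.cdf(b) - Z.cdf(a)

ex1 = area_between(0, 0.96)        # between Z = 0 and Z = 0.96
ex2 = area_between(-1.45, 0)       # between Z = -1.45 and Z = 0
ex3 = 1 - Z.cdf(-0.35)             # to the right of Z = -0.35
ex4 = Z.cdf(-0.35)                 # to the left of Z = -0.35
ex5 = area_between(-0.67, 0.75)
ex6 = area_between(0.25, 1.25)
ex7 = Z.inv_cdf(0.50 + 0.4726)     # z with area 0.4726 between 0 and z
ex8 = Z.inv_cdf(0.9868)            # z with area 0.9868 to its left

for value in (ex1, ex2, ex3, ex4, ex5, ex6, ex7, ex8):
    print(round(value, 4))
```

Small differences in the fourth decimal place against a printed table are expected, since table entries are themselves rounded.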
9. A random variable X has a normal distribution with mean 80 and standard deviation 4.8. What is the probability that it will take a value:
a) less than 87.2?
b) greater than 76.4?
c) between 81.2 and 86.0?

Solution: X ~ N(µ = 80, σ = 4.8)
a) $P(X < 87.2) = P\left(\dfrac{X-\mu}{\sigma} < \dfrac{87.2-80}{4.8}\right) = P(Z < 1.5)$
= P(Z < 0) + P(0 < Z < 1.5)
= 0.50 + 0.4332 = 0.9332
b) $P(X > 76.4) = P\left(\dfrac{X-\mu}{\sigma} > \dfrac{76.4-80}{4.8}\right) = P(Z > -0.75)$
= P(Z > 0) + P(0 < Z < 0.75)
= 0.50 + 0.2734 = 0.7734
c) $P(81.2 < X < 86.0) = P\left(\dfrac{81.2-80}{4.8} < \dfrac{X-\mu}{\sigma} < \dfrac{86.0-80}{4.8}\right)$
= P(0.25 < Z < 1.25)
= P(0 < Z < 1.25) - P(0 < Z < 0.25)
= 0.3944 - 0.0987 = 0.2957

10. A normal distribution has mean 62.4. Find its standard deviation if 20.05% of the area under the normal curve lies to the right of 72.9.

Solution:
P(X > 72.9) = 0.2005
⇒ $P\left(\dfrac{X-\mu}{\sigma} > \dfrac{72.9-62.4}{\sigma}\right) = 0.2005$
⇒ $P\left(Z > \dfrac{10.5}{\sigma}\right) = 0.2005$
⇒ $P\left(0 < Z < \dfrac{10.5}{\sigma}\right) = 0.50 - 0.2005 = 0.2995$
and from the table P(0 < Z < 0.84) = 0.2995 ⇒ $\dfrac{10.5}{\sigma} = 0.84$ ⇒ σ = 12.5
11. A random variable has a normal distribution with σ = 5. Find its mean if the probability that the random variable will assume a value less than 52.5 is 0.6915.
Solution:
$P(X < 52.5) = P\left(Z < \dfrac{52.5-\mu}{5}\right) = P(Z < z) = 0.6915$ ⇒ P(0 < Z < z) = 0.6915 - 0.50 = 0.1915
But from the table P(0 < Z < 0.5) = 0.1915 ⇒ $z = \dfrac{52.5-\mu}{5} = 0.5$ ⇒ µ = 52.5 - 2.5 = 50
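The last three examples (probabilities for X ~ N(80, 4.8²), solving for σ, and solving for µ) can be checked with `statistics.NormalDist` from the Python 3.8+ standard library. This is a sketch under that assumption; the variable names are illustrative.

```python
from statistics import NormalDist

Z = NormalDist()                     # standard normal
X = NormalDist(mu=80, sigma=4.8)     # mean 80, standard deviation 4.8

# Probabilities for X: less than 87.2, greater than 76.4, between 81.2 and 86.0
p_less = X.cdf(87.2)
p_greater = 1 - X.cdf(76.4)
p_between = X.cdf(86.0) - X.cdf(81.2)

# Unknown sigma: mean 62.4 and P(X > 72.9) = 0.2005
# means (72.9 - 62.4) / sigma = z, where P(Z < z) = 1 - 0.2005.
z_sigma = Z.inv_cdf(1 - 0.2005)
sigma = (72.9 - 62.4) / z_sigma

# Unknown mean: sigma = 5 and P(X < 52.5) = 0.6915
# means (52.5 - mu) / 5 = z, where P(Z < z) = 0.6915.
z_mu = Z.inv_cdf(0.6915)
mu = 52.5 - 5 * z_mu

print(round(p_less, 4), round(p_greater, 4), round(p_between, 4))
print(round(sigma, 1), round(mu, 1))
```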
Exercise: Out of a large group of men, 5 % are less than 60 inches in height and 40 % are between 60 & 65 inches. Assuming a normal distribution,
find the mean and standard deviation of heights.

CHAPTER - SEVEN
Sampling and Sampling Distribution

There are two approaches to obtaining data (information) from a given population: the census approach and the sampling approach.
Census: a process of obtaining information from each and every item of the population.
Sample: a part (subset) of a population, which is drawn to study the population. The process of selecting a sample is known as sampling.
Definitions:
Parameter: a characteristic or measure obtained from a population.
Statistic: a characteristic or measure obtained from a sample.
Sampling: the process or method of selecting a sample from the population.
Sampling unit: the ultimate unit to be sampled, or the elements of the population to be sampled.
Sampling frame: the listing of all elements in a population.
• Advantages of the sampling approach over the census approach are:
- Reduced cost
- Greater speed
- Greater accuracy (better quality data)
- Greater scope (more detailed information can be obtained)

7.1 Methods of Sampling (Sampling Techniques)

Sampling techniques can be broadly divided into two:

• Probability (random) sampling: a method of sampling in which all items in a population have a pre-assigned non-zero probability of being selected in the sample. E.g.
- Simple random sampling (SRS)
- Stratified sampling
- Systematic sampling
- Cluster sampling
Random Sampling
Sampling in which the data is collected using chance methods or random numbers.

Systematic Sampling
Sampling in which data is obtained by selecting every kth object.
Stratified Sampling
Sampling in which the population is divided into groups (called strata) according to some characteristic. Each of these strata is then sampled using one of the other sampling techniques.
Cluster Sampling
Sampling in which the population is divided into groups (usually geographically). Some of these groups are randomly selected, and then all of the elements in those groups are selected.

• Non-probability (non-random) sampling: a method of sampling in which the sample to be selected depends on the personal decision or subjective judgment of the investigator. E.g.
- Judgment sampling
- Convenience sampling
- Quota sampling
Convenience Sampling
Sampling in which data that is readily available is used.

Simple Random Sampling (SRS)

If a sample of size n is drawn from a population of size N in such a way that every possible sample of size n has the same probability of being selected, then the sample is called a simple random sample and the method of sampling is called simple random sampling (SRS). That is, SRS is a sampling technique in which every item of the population has an equal chance of being included in the sample.

There are two types of SRS:

- SRS with replacement: here a unit, after being selected, is put back into the population before the next selection.
- SRS without replacement: here a given unit does not have a chance to be included in the sample more than once.

There are two common methods of selecting an SRS:

• Lottery method
• Using a table of random numbers
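On a computer, the lottery method or a random number table is usually replaced by a pseudo-random generator. The sketch below is only an illustration: the population of 20 numbered units, the sample size and the seed are all hypothetical, and a systematic sample (every kth unit after a random start) is included for comparison.

```python
import random

# Hypothetical sampling frame: N = 20 units numbered 1..20.
population = list(range(1, 21))
rng = random.Random(42)   # fixed seed only so the run is repeatable

n = 5

# SRS without replacement: no unit can appear more than once.
srs_without = rng.sample(population, n)

# SRS with replacement: a unit is "put back", so repeats are possible.
srs_with = rng.choices(population, k=n)

# Systematic sampling: random start, then every k-th unit of the frame.
k = len(population) // n
start = rng.randrange(k)
systematic = population[start::k]

print(srs_without, srs_with, systematic)
```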

CHAPTER - EIGHT
ESTIMATION AND HYPOTHESIS TESTING
Statistical inference: the process of drawing conclusions about the population of interest based on the information (data) obtained from a sample.
The two ways of making inference in statistics are:
• Estimation
• Hypothesis testing
Estimation: this is one way of making inference about a population parameter. There are two ways of making an estimation:
- Point estimation
- Interval estimation (confidence interval)
Point estimation: a procedure that results in a single value as an estimate for a parameter.
Interval estimation (CI): a procedure that results in an interval of values as an estimate for a parameter. It deals with identifying the upper and lower limits of a parameter.
Definitions:
Estimator: a rule or formula used to estimate a parameter.
Estimate: a specific value of an estimator.
Example: $\bar{x} = \dfrac{1}{n}\sum x_i$ is an estimator; $\bar{x} = 20$ is an estimate.
Point estimate: a single value used to estimate a parameter.
Interval estimate: a range of values used to estimate a parameter.

• Point and confidence interval estimation for µ

Point estimation of µ
The point estimator is $\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$
Example: Given that the values 5, 9, 6, 4, 7, 11 are a sample from a given population, the point estimate for µ becomes:
$\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i = \dfrac{1}{6}(5 + 9 + \dots + 11) = 7$
6

Remark: If the variable of interest is continuous, the probability that an estimator exactly equals the population parameter is zero. Hence, we do not usually estimate a population parameter with a point estimator alone, so we go for interval estimation.

Confidence interval estimation for µ

There are different cases to be considered in constructing a confidence interval for µ.

Case I: If the sample size n is large (n > 30), or if the population is normal with known variance:
$\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$ ⇒ $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$

The (1 - α)100% confidence interval for µ is given by:
$(\bar{x}_L, \bar{x}_U) = \left[\bar{x} - Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}},\; \bar{x} + Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right]$
with $P(\bar{x}_L < \mu < \bar{x}_U) = 1 - \alpha$, where (1 - α) is referred to as the confidence coefficient.

Remark: If σ² is unknown, then it is estimated by its point estimator s².


Example: From a normal population a sample of size 25 with mean 32 was drawn. Given the population standard deviation to be 4.2, construct a 95% confidence interval for µ.
Given: n = 25, $\bar{x}$ = 32, σ = 4.2
95% = (1 - α)100% ⇒ 0.95 = 1 - α ⇒ α = 0.05 ⇒ α/2 = 0.025 ⇒ $Z_{\alpha/2} = Z_{0.025} = 1.96$

Then the (1 - α)100% confidence interval for µ is given by:
$(\bar{x}_L, \bar{x}_U) = \left[\bar{x} - Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}},\; \bar{x} + Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right]$
$= \left[32 - 1.96\left(\dfrac{4.2}{\sqrt{25}}\right),\; 32 + 1.96\left(\dfrac{4.2}{\sqrt{25}}\right)\right]$
= [30.35, 33.65]
Interpretation: We are 95% confident that the unknown population mean µ lies in the interval [30.35, 33.65], i.e. there is only a 5% chance that µ is outside the CI [30.35, 33.65].
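The interval in this example can be reproduced with a few lines of code, which also makes it easy to try other confidence levels. This is a sketch assuming Python 3.8+ (`statistics.NormalDist`); `z_interval` is a helper name introduced here.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known:
    xbar -/+ Z_(alpha/2) * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # e.g. about 1.96 for 95%
    margin = z * sigma / sqrt(n)
    return xbar - margin, xbar + margin

low, high = z_interval(xbar=32, sigma=4.2, n=25, confidence=0.95)
print(round(low, 2), round(high, 2))
```

A higher confidence level gives a larger critical value and therefore a wider interval for the same data.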

Exercise: Construct and interpret the 90% and 98% CI for µ of the above example.

Case II: Normal population, n small, and σ² unknown

Use $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} \sim t(n-1)$, where X ~ N(µ, σ²).

The (1 - α)100% confidence interval for µ is given by:
$(\bar{x}_L, \bar{x}_U) = \left[\bar{x} - t_{\alpha/2}(n-1)\dfrac{s}{\sqrt{n}},\; \bar{x} + t_{\alpha/2}(n-1)\dfrac{s}{\sqrt{n}}\right]$
with $P(\bar{x}_L < \mu < \bar{x}_U) = 1 - \alpha$

Example: A drug company is testing a new drug which is supposed to reduce blood pressure. From six people who are used as subjects, it is found that the average drop in blood pressure is 2.28 points, with a standard deviation of 0.95 points. Construct a 95% confidence interval for the mean change in blood pressure.
Given: n = 6 (n < 30, so small), $\bar{x}$ = 2.28, s = 0.95 (σ² unknown)
0.95 = 1 - α ⇒ α = 0.05 ⇒ α/2 = 0.025
$t_{\alpha/2}(n-1) = t_{0.025}(5) = 2.571$
$(\bar{x}_L, \bar{x}_U) = \left[\bar{x} - t_{\alpha/2}(n-1)\dfrac{s}{\sqrt{n}},\; \bar{x} + t_{\alpha/2}(n-1)\dfrac{s}{\sqrt{n}}\right]$
$= \left[2.28 - 2.571\left(\dfrac{0.95}{\sqrt{6}}\right),\; 2.28 + 2.571\left(\dfrac{0.95}{\sqrt{6}}\right)\right]$
Answer: [1.283, 3.277]
Interpretation: We are 95% confident that the mean change in blood pressure is between 1.283 and 3.277.
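The arithmetic of this interval is easy to check in code. Python's standard library has no t-distribution quantile function, so the sketch below simply takes the tabulated critical value 2.571 from the example as an input; `t_interval` is a helper name assumed for illustration.

```python
from math import sqrt

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown and n is small:
    xbar -/+ t_crit * s / sqrt(n). t_crit is read from a t table."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin

# t_crit = t_0.025(5) = 2.571, as given in the worked example.
low, high = t_interval(xbar=2.28, s=0.95, n=6, t_crit=2.571)
print(round(low, 3), round(high, 3))
```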
Case III: When sampling is from a non-normally distributed population

If the sample size n is sufficiently large (n > 30), then we appeal to the central limit theorem, so that approximately
$\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$, and the (1 - α)100% confidence interval for µ becomes:
$(\bar{x}_L, \bar{x}_U) = \left[\bar{x} - Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}},\; \bar{x} + Z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right]$ (when σ² is known)
or
$(\bar{x}_L, \bar{x}_U) = \left[\bar{x} - Z_{\alpha/2}\dfrac{s}{\sqrt{n}},\; \bar{x} + Z_{\alpha/2}\dfrac{s}{\sqrt{n}}\right]$ (when σ² is unknown)

Hypothesis Testing:
This is another way of making inference about a population parameter.
Statistical hypothesis: an assumption (assertion) made about the population under consideration, which may or may not be true. Alternatively, it is a tentative assumption regarding the value of a parameter (some characteristic of a population) whose validity may be checked by statistical methods.
Example: One may assert that the average age of freshman students enrolled in this academic year is 18 years (i.e. Ho: µ = 18 years).

There are two types of hypothesis:

• Null hypothesis (Ho): the hypothesis to be tested. It is also known as the hypothesis of equality (no difference).
• Alternative hypothesis (H1): the available alternative when Ho is rejected. It is also known as the hypothesis of difference.
Example: For Ho: µ = 18 years, the three alternatives are:
H1: µ ≠ 18 years (two-tailed (sided) alternative), or
H1: µ > 18 years (one-tailed (sided) alternative), or
H1: µ < 18 years (one-tailed (sided) alternative).
Test statistic: a statistic whose value is used to decide whether to reject or accept Ho. It can assume different values based on the sample data, which implies that a test statistic is a random variable.
Types and size of Errors
Hypothesis testing is based on sample data, which may involve sampling and non-sampling errors. Therefore, the decision to accept or reject Ho is always subject to error.

                    Our decision
Actual situation    Reject Ho           Don't reject Ho
True Ho             Type I error        Correct decision
False Ho            Correct decision    Type II error

Type I error = rejection of a true Ho. Probability(Type I error) = α

Type II error = acceptance of a false Ho. Probability(Type II error) = β

• Generally, a type I error is considered to be more serious than a type II error.

Remark: α and β are inversely related, i.e. α↓ ⇒ β↑ and vice versa. What we do is keep α at some minimum level and find a test that minimizes β.

Steps in Testing a Hypothesis:

1st: Specify Ho and H1.
2nd: Specify the level of significance α.
3rd: Select an appropriate test statistic.
4th: Based on α, specify the rejection region (critical region).
5th: Calculate the test statistic (from the given data or information).
6th: Decision making.
7th: Conclusion.
Hypothesis testing about µ
Suppose the assumed or hypothesized value of µ is denoted by $\mu_0$; then one can formulate two-sided and one-sided hypotheses as follows:
Ho: µ = $\mu_0$ vs H1: µ ≠ $\mu_0$
Ho: µ = $\mu_0$ vs H1: µ > $\mu_0$
Ho: µ = $\mu_0$ vs H1: µ < $\mu_0$

CASES:
Case I : When sampling is from a normally distributed with known known
The test statistics is Z = x - 
√ n
After specifying  we have the following regions (critical and acceptance):

Table 1: Summary table for deciding whether to reject Ho or not (Ho: μ = μ0).

Alternative (H1)      Reject Ho if        Inconclusive if
H1: μ ≠ μ0            |Zcal| > Zα/2       Zcal = Zα/2 or Zcal = −Zα/2
H1: μ < μ0            Zcal < −Zα          Zcal = −Zα
H1: μ > μ0            Zcal > Zα           Zcal = Zα

Where Zcal = (x̄ − μ0) / (σ/√n)
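The Case I decision rules can be sketched in code as follows. This is only an illustration: the function name z_test_decision, the labels "two-sided", "less" and "greater", and the numbers in the usage line are hypothetical, and the critical values come from Python's standard-library normal distribution rather than a printed table.

```python
from statistics import NormalDist

def z_test_decision(x_bar, mu0, sigma, n, alpha, alternative):
    """Return (z_cal, reject) for the Z test with known sigma.

    alternative: "two-sided" (H1: mu != mu0), "less" (H1: mu < mu0)
    or "greater" (H1: mu > mu0).
    """
    z_cal = (x_bar - mu0) / (sigma / n ** 0.5)
    std_normal = NormalDist()
    if alternative == "two-sided":
        z_crit = std_normal.inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
        reject = abs(z_cal) > z_crit
    elif alternative == "less":
        z_crit = std_normal.inv_cdf(1 - alpha)       # Z_alpha
        reject = z_cal < -z_crit
    elif alternative == "greater":
        z_crit = std_normal.inv_cdf(1 - alpha)       # Z_alpha
        reject = z_cal > z_crit
    else:
        raise ValueError("alternative must be 'two-sided', 'less' or 'greater'")
    return z_cal, reject

# Hypothetical numbers: x_bar = 52, mu0 = 50, sigma = 6, n = 36, alpha = 0.05.
print(z_test_decision(52, 50, 6, 36, 0.05, "two-sided"))
```

With these hypothetical inputs Zcal = 2.0, which exceeds Z0.025 ≈ 1.96, so Ho would be rejected.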
Case II: When sampling is from a normal distribution with σ² unknown and n small
The test statistic is t = (x̄ − μ0) / (s/√n), which follows a t distribution with n − 1 degrees of freedom.

After specifying α we have the following regions (critical and acceptance), for Ho: μ = μ0:

Alternative (H1)      Reject Ho if            Inconclusive if
H1: μ ≠ μ0            |tcal| > tα/2(n−1)      tcal = tα/2(n−1) or tcal = −tα/2(n−1)
H1: μ < μ0            tcal < −tα(n−1)         tcal = −tα(n−1)
H1: μ > μ0            tcal > tα(n−1)          tcal = tα(n−1)

Where tcal = (x̄ − μ0) / (s/√n)
Case III: When sampling is from a non-normally distributed population
If the sample size n is large, apply the central limit theorem (CLT). Then:
Zcal = (x̄ − μ0) / (σ/√n), if σ² is known
Or
Zcal = (x̄ − μ0) / (s/√n), if σ² is unknown

The decision rule is the same as in Case I.


Example 1:
Test the hypothesis that the average content of containers of a certain lubricant is 10 liters if the contents of a random sample of 10 containers are
10.2, 9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, and 9.8 liters.
Use α = 0.01 and assume that the distribution of contents is normal.
Solution: Let μ = population mean content.
Given information: μ0 = 10, n = 10, x̄ = 10.06, s = 0.25
Step 1: Identify the appropriate hypothesis: Ho: μ = 10 vs. H1: μ ≠ 10
Step 2: Specify α. Given α = 0.01.
Step 3: Select an appropriate test statistic: the t-statistic is appropriate because σ² is unknown and n = 10 is small.
Step 4: Identify the critical region: reject Ho at level of significance α if |tcal| > tα/2(n−1), i.e. if
|tcal| > t0.005(9) = 3.2498
Step 5: Calculate the test statistic (s = 0.25 from the raw data):
tcal = (x̄ − μ0) / (s/√n) = (10.06 − 10) / (0.25/√10) = 0.76

Step 6: Decision making: since |tcal| = 0.76 is not greater than t0.005(9) = 3.2498, we do not reject Ho at the 1% level of significance.
Step 7: Conclusion: at the 1% level of significance, we have no evidence that the average content of the containers is different from 10 liters, based on
the given sample data.
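The calculation in Example 1 can be checked from the raw data. The sketch below uses only Python's standard library; the critical value 3.2498 is taken from the example (a t table), not computed, and the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

contents = [10.2, 9.7, 10.1, 10.3, 10.1, 9.8, 9.9, 10.4, 10.3, 9.8]
mu0 = 10.0
t_critical = 3.2498          # t_{0.005}(9), the table value quoted in the example

n = len(contents)
x_bar = mean(contents)       # sample mean
s = stdev(contents)          # sample standard deviation (divides by n - 1)
t_cal = (x_bar - mu0) / (s / sqrt(n))
reject_h0 = abs(t_cal) > t_critical   # two-sided rule: |tcal| > t_{alpha/2}(n-1)

print(f"x_bar = {x_bar:.2f}, s = {s:.3f}, t_cal = {t_cal:.2f}, reject Ho: {reject_h0}")
```

With the unrounded standard deviation (about 0.246) the statistic is about 0.77 rather than 0.76; the decision not to reject Ho is unchanged.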

Example 2:
The mean life time of a sample of 16 fluorescent light bulbs produced by a company is computed to be 1570 hours. The population standard deviation
is 120 hours. Suppose the hypothesized value for the population mean is 1600 hours. Can we conclude that the mean life times of light bulbs are
decreasing?
Use α = 0.05 and assume the normality of the population.

Solution:
Given: n = 16 (small), x̄ = 1570, σ = 120, α = 0.05, μ0 = 1600

Step 1: Identify the appropriate hypothesis: Ho: μ = 1600 vs H1: μ < 1600.
Step 2: Specify α. Given α = 0.05.
Step 3: Select an appropriate test statistic: the Z-statistic is appropriate because σ² is known.
Step 4: Identify the critical region: reject Ho at level of significance α if Zcal < −Zα, i.e. if
Zcal < −Z0.05 = −1.645
Step 5: Calculate the test statistic: Zcal = (x̄ − μ0) / (σ/√n) = (1570 − 1600) / (120/√16) = −1.0
Step 6: Decision making: since Zcal = −1.0 is not less than −Z0.05 = −1.645, we do not reject Ho at the 5% level of significance.
Step 7: Conclusion: at 5 % level of significance, we have no evidence to say that the mean life time of light bulbs is decreasing, based on the given
sample data.
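As a quick check of Example 2, the statistic and the lower-tail critical value can be computed directly. This is a sketch with illustrative variable names; the critical value is obtained from the standard-library normal distribution instead of a table.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, sigma, mu0, alpha = 16, 1570.0, 120.0, 1600.0, 0.05

z_cal = (x_bar - mu0) / (sigma / sqrt(n))
z_lower = NormalDist().inv_cdf(alpha)   # -Z_alpha, the lower-tail critical value
reject_h0 = z_cal < z_lower             # one-sided rule for H1: mu < mu0

print(f"Z_cal = {z_cal:.2f}, critical value = {z_lower:.3f}, reject Ho: {reject_h0}")
```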

Example 3: The mean weekly sales of chocolate bars in candy stores was 146.3 bars per store. After an advertising campaign the mean weekly sales in
22 stores for a typical week increased to 157.7 and showed a standard deviation of 17.2. Is the evidence conclusive that advertising was successful
at the 5% level of significance?
Solution: Given: μ0 = 146.3, n = 22 (small), x̄ = 157.7, s = 17.2 (σ² unknown), normality, α = 0.05

Step 1: Identify the appropriate hypothesis: Ho: μ = 146.3 vs. H1: μ > 146.3
Step 2: Specify α. Given α = 0.05.
Step 3: Select an appropriate test statistic: the t-statistic is appropriate because σ² is unknown and n is small.
Step 4: Identify the critical region: reject Ho at level of significance α if tcal > tα(n−1) = t0.05(21) = 1.721
Step 5: Calculate the test statistic: tcal = (x̄ − μ0) / (s/√n) = (157.7 − 146.3) / (17.2/√22) = 3.11
Step 6: Decision making: since tcal = 3.11 is greater than t0.05(21) = 1.721, we reject Ho at the 5% level of significance.
Step 7: Conclusion: at the 5% level of significance, we have strong evidence that advertising the chocolate was successful in bringing an increase in mean
weekly sales, based on the given data.
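When only summary statistics are given, as in Example 3, the t statistic can be evaluated directly from them. The helper name one_sample_t below is hypothetical, and the critical value 1.721 is the table value quoted in the example.

```python
from math import sqrt

def one_sample_t(x_bar, mu0, s, n):
    """t statistic for Ho: mu = mu0, computed from summary statistics."""
    return (x_bar - mu0) / (s / sqrt(n))

t_cal = one_sample_t(x_bar=157.7, mu0=146.3, s=17.2, n=22)
t_critical = 1.721            # t_{0.05}(21), table value quoted in the example
print(f"t_cal = {t_cal:.2f}, reject Ho: {t_cal > t_critical}")
```

Evaluating (157.7 − 146.3) / (17.2/√22) this way gives a value of about 3.11, which is well above the critical value.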

Exercise: It is known from a pharmacological experiment that rats fed a particular diet over a certain period gain an average of 40 gm in weight. A
new diet was tried on a sample of 20 rats, yielding a mean weight gain of 43 gm with variance 7 gm². Test the hypothesis that the new diet is an
improvement, assuming normality.

• State the appropriate hypothesis.
• What is the appropriate test statistic? Why?
• Identify the critical region.
• On the basis of the given information, test the hypothesis and state your conclusion.

CHAPTER - NINE

CORRELATION AND SIMPLE LINEAR REGRESSION

Here we discuss the methods used to determine whether a relationship exists between two variables and to express that relationship numerically.
Examples: we may need to know whether there is a relationship between:
Income & Expenditure          Fertilizer & Plant yield
Age & Blood pressure          Height & Weight

Correlation analysis: a statistical technique used to describe the degree to which one variable is linearly related to another variable. That is,
correlation is the degree (strength) of the linear relationship between two variables (say X & Y). Two variables X & Y are said to be highly correlated if they have a
strong relationship, i.e. ↑X => ↑Y and ↓X => ↓Y, or ↓X => ↑Y and ↑X => ↓Y.

If higher (lower) values of one variable (say X) are accompanied by higher (lower) values of the other variable (say Y), then we say that there is a
positive (direct) correlation between X & Y, i.e. ↑X => ↑Y, ↓X => ↓Y. Example: the greater the radius of a circle, the greater its circumference.

If lower (higher) values of one variable (say X) are accompanied by higher (lower) values of the other variable (say Y), then we say that there is a
negative (indirect) correlation between X & Y, i.e. ↓X => ↑Y or ↑X => ↓Y. Example: saving vs. expenditure.

The simple correlation coefficient (Pearson's correlation coefficient): a measure used to determine the degree of linear correlation between two
variables.

r = [n∑XY − (∑X)(∑Y)] / √{[n∑X² − (∑X)²] [n∑Y² − (∑Y)²]}
or
r = (∑XY − n X̄ Ȳ) / √{[∑X² − n X̄²] [∑Y² − n Ȳ²]}
Remark: r always lies between −1 and 1 inclusive, and it is symmetric (the correlation of X with Y equals that of Y with X).

Interpretation of r
• Perfect positive linear relationship (if r = 1)
• Perfect negative linear relationship (if r = −1)
• No linear relationship (if r = 0)
• Some positive linear relationship (if r is between 0 and 1)
• Some negative linear relationship (if r is between 0 and −1)

Example: compute the value of Pearson's correlation coefficient based on the study of Age (X) and Blood pressure (Y) of a person.

Age = X     Blood pressure = Y     XY         X²        Y²
43          128                    5504       1849      16384
48          120                    5760       2304      14400
56          135                    7560       3136      18225
61          143                    8723       3721      20449
67          141                    9447       4489      19881
70          152                    10640      4900      23104
∑X = 345    ∑Y = 819               ∑XY = 47,634    ∑X² = 20,399    ∑Y² = 112,443

n = 6 and
r = [n∑XY − (∑X)(∑Y)] / √{[n∑X² − (∑X)²] [n∑Y² − (∑Y)²]}
  = [6(47,634) − (345)(819)] / √{[6(20,399) − (345)²] [6(112,443) − (819)²]}
  = 3249 / √13,128,993 = +0.897 (close to +1)

Interpretation: there is a strong positive linear relationship between age and blood pressure.
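The raw-sums formula for r can be applied to the age and blood pressure data with a few lines of code. This is a minimal sketch; the function name pearson_r is illustrative.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient via the raw-sums formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

age = [43, 48, 56, 61, 67, 70]
blood_pressure = [128, 120, 135, 143, 141, 152]
r = pearson_r(age, blood_pressure)
print(f"r = {r:.3f}")
```

Swapping the two arguments gives the same value, which is the symmetry property of r.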
9.2 Simple Linear Regression (Regression of Y on X)
Regression: the relationship between a dependent (effect) variable and one or more given independent (cause) variables, i.e. in regression the dependent
variable (Y) is expressed as a function of the independent variable or variables (X).
Simple linear regression: a regression in which there is a linear relationship between the dependent variable (Y) and a single independent variable (X).
Example: Study hours (X = independent variable = cause) versus
Grade obtained (Y = dependent variable = effect)
Two variables X and Y are said to be linearly related if their relationship can be expressed by the simple linear model:

Y = α + βX + ε, referred to as the regression of Y on X
Where: Y = dependent variable, X = independent variable,
α = intercept of the regression line, β = slope of the regression line,
ε = error (random disturbance term)
Here α (intercept) and β (slope) are the parameters. They are also known as regression coefficients.
There are different methods of estimating the parameters α and β. The most commonly applied method is the ordinary least squares (OLS) method.

The OLS method estimates the parameters by minimizing the sum of squares of the error term, ∑e².
Y = α + βX + ε is estimated by Ŷ = a + bX, where a and b are the OLS estimates of α and β respectively.

Therefore, the regression line which minimizes ∑e² = ∑(Yi − Ŷi)² is Ŷ = a + bX, where:

b = ∑(Xi − X̄)(Yi − Ȳ) / ∑(Xi − X̄)² = (∑XY − n X̄ Ȳ) / (∑X² − n X̄²) = [n∑XY − (∑X)(∑Y)] / [n∑X² − (∑X)²], and a = Ȳ − b X̄

Here a and b are the least squares estimates for the regression line of Y on X.
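The least squares formulas for a and b translate directly into code. The sketch below uses the deviation form of b; the function name ols_fit and the small data set (chosen to lie exactly on Y = 1 + 2X) are hypothetical.

```python
def ols_fit(x, y):
    """Return (a, b) for the least squares line Y^ = a + bX."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b = s_xy / s_xx            # slope
    a = y_bar - b * x_bar      # intercept
    return a, b

# Hypothetical data lying exactly on Y = 1 + 2X, so the fit should recover a = 1, b = 2.
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]
a, b = ols_fit(x, y)
sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))  # sum of squared errors
print(a, b, sse)
```

Because the points lie exactly on a line, the sum of squared errors is zero, the smallest value it can take.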

Example: the following hypothetical data set shows the monthly income and food expenditure of households, in hundreds of birr. Then,
• a). Fit the least squares regression line to the given data.
• b). Calculate the simple correlation coefficient (r).
• c). Predict the food expenditure for an income of 800 birr (X = 8).

Income = X    Expenditure = Y    XY        X²         Y²
3.8           3.1                11.78     14.44      9.61
4.5           3.6                16.20     20.25      12.96
2.5           2.3                5.75      6.25       5.29
4.8           3.7                17.76     23.04      13.69
7.7           4.6                35.42     59.29      21.16
5.0           4.1                20.50     25.00      16.81
12.6          6.5                81.90     158.76     42.25
8.5           5.1                43.35     72.25      26.01
5.5           4.0                22.00     30.25      16.00
7.1           4.1                29.11     50.41      16.81
3.5           3.2                11.20     12.25      10.24
∑X = 65.5     ∑Y = 44.3          ∑XY = 294.97    ∑X² = 472.19    ∑Y² = 190.83

n = 11, X̄ = 5.95 and Ȳ = 4.03

a). Ŷ = a + bX
b = [n∑XY − (∑X)(∑Y)] / [n∑X² − (∑X)²] = [11(294.97) − (65.5)(44.3)] / [11(472.19) − (65.5)²] = 343.02 / 903.84 = 0.38
and
a = Ȳ − b X̄ = 4.03 − (0.38)(5.95) = 1.77
Therefore, Ŷ = a + bX = 1.77 + 0.38X
Interpretation: when income (X) is zero, the expenditure on food is estimated to be 1.77 (177 birr), and for every additional birr of income
about 38% of it is spent on food.

b). r = [n∑XY − (∑X)(∑Y)] / √{[n∑X² − (∑X)²] [n∑Y² − (∑Y)²]} = 343.02 / √[(903.84)(136.64)] = 0.976

c). When income X = 8 => food expenditure Ŷ = a + bX = 1.77 + 0.38(8) = 4.81 (about 481 birr)

Coefficient of determination (r²): a measure of the proportion of the total variation in Y that is explained by its relationship with X. That is, to know how
far the regression equation has been able to explain the variation in Y we use r²,
i.e. r² measures the goodness of fit of the regression line.

0 ≤ r² ≤ 1, where r² = proportion of explained variation and 1 − r² = proportion of unexplained variation.

Interpretation: If r² is close to 1 => the relationship between X & Y is well explained by the regression equation (line) => good fit.

If r² is close to zero => the relationship between X & Y is not well explained by the regression equation (line) => poor fit.
Example: Compute the coefficient of determination (r²) for the above data.
We have already calculated r = 0.976 => r² = 0.953 ≈ 0.95

Interpretation: 95% of the total variation in Y is explained by the regression line.

Since r² = 0.95 is close to one, the fit is a good one.
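The income and food expenditure example can be recomputed from the eleven data pairs. The sketch below, with illustrative variable names, fits the line, computes r, and obtains r² in two ways: as the square of r and as 1 − SSE/SST (explained share of the total variation in Y).

```python
income = [3.8, 4.5, 2.5, 4.8, 7.7, 5.0, 12.6, 8.5, 5.5, 7.1, 3.5]
food = [3.1, 3.6, 2.3, 3.7, 4.6, 4.1, 6.5, 5.1, 4.0, 4.1, 3.2]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(food) / n

# Least squares estimates of the slope (b) and intercept (a).
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, food))
s_xx = sum((x - x_bar) ** 2 for x in income)
s_yy = sum((y - y_bar) ** 2 for y in food)   # total variation in Y (SST)
b = s_xy / s_xx
a = y_bar - b * x_bar

# r^2 computed two ways: squared correlation and 1 - SSE/SST.
r = s_xy / (s_xx * s_yy) ** 0.5
sse = sum((y - (a + b * x)) ** 2 for x, y in zip(income, food))
r2_from_residuals = 1 - sse / s_yy

prediction_at_8 = a + b * 8
print(f"a = {a:.2f}, b = {b:.2f}, r = {r:.3f}, r^2 = {r ** 2:.3f}")
print(f"1 - SSE/SST = {r2_from_residuals:.3f}, prediction at X = 8: {prediction_at_8:.2f}")
```

Run on these data, the unrounded estimates are roughly a ≈ 1.77, b ≈ 0.38 and r ≈ 0.976, so r² ≈ 0.95; the prediction at X = 8 comes out near 4.80 because the coefficients are not rounded before use.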
