Professional Documents
Culture Documents
CHAPTER ONE
Definintion of Statistics
The term statistics is derived from the Latin word status, meaning
state, and historically statistics referred to the display of facts and
figures relating to the demography of states or countries. Generally, it
can be defined in two senses: plural (as statistical data) and singular
(as statistical methods).
Plural sense: Statistics are collection of facts (figures). This
meaning of the word is widely used when reference is made to facts
and figures on sales, employment or unemployment, accident,
weather, death, education, e.t.c
1 / 83
Introduction to Statistics
Cont’d
2 / 83
Introduction to Statistics
3 / 83
Introduction to Statistics
4 / 83
Introduction to Statistics
Classification of Statistics
Based on the scope of the decision, statistics can be classified into
two; Descriptive and Inferential Statistics.
Descriptive Statistics refers to the procedures used to organize
and summarize masses of data. It is concerned with describing or
summarizing the most important features of the data.
It deals only the characteristics of the collected data without
going beyond it.
That is, this part deals with only describing the data collected
without going any further: that is without attempting to
infer(conclude) anything that goes beyond the data themselves.
Examples of Descriptive statistics
a. The average age of the students in this class is 21 years. b.
Of the students enrolled in HU in this year 74% are male and
26% are female.
5 / 83
Introduction to Statistics
Classiffication..............Con’d
Inferential Statistics is the methods used to find out something
about a population, based on the sample.
It is concerned with drawing statistically valid conclusions about
the characteristics of the population based on information
obtained from sample.
In this form of statistical analysis, inferential statistics is linked
with probability theory in order to generalize the results of the
sample to the population.
Performing hypothesis testing, determining relationships between
variables and making predictions are also inferential statistics.
Examples of inferential statistics
At least 5% of the killings reported last year in city X were due
to tourists.
The chance of winning the Ethiopian National Lottery in any
6 / 83
Introduction to Statistics
Applications of Statistics
7 / 83
Introduction to Statistics
Application————Con’d
8 / 83
Introduction to Statistics
Uses of Statistics
9 / 83
Introduction to Statistics
Limitations of Statistics
10 / 83
Introduction to Statistics
VARIABLE
11 / 83
Introduction to Statistics
Variable......................Con’d
12 / 83
Introduction to Statistics
........Cont’d
Continuous variable: takes any value including decimals. Such
a variable can theoretically assume an infinite number of possible
values. These values are obtained by measuring. Example:
Height, Weight, Time, Temperature, etc.
Classify each of the following as qualitative and
quantitative and if it is quantitative classify as discrete
and continuous.
1. Color of automobiles in a dealers show room.
2. Number of seats in a movie theater.
3. Classification of patients based on nursing care needed
(complete, partial or safer).
4. Number of tomatoes on each plant on a field.
5. Weight of newly born babies.
13 / 83
Introduction to Statistics
....................... CONT’D
16 / 83
Introduction to Statistics
..........................CONT’D
3. Interval Scales of variables are those quantitative variables when
the value of the variables is zero it does not show absence of the
characteristics i.e. there is no true zero.
Zero indicates lower than empty.
For example, for temperature measured in degrees Celsius, the
difference between 5 and 10 is treated the same as the difference
between 10 and 15..
However, we cannot say that 20 is twice as hot as 10., i.e. the
ratio between two different values has no quantitative meaning.
This is because there is no absolute zero on the Celsius scale; 0.
not imply no heat
All mathematical operation are allowed/permissible except
division.
17 / 83
Introduction to Statistics
........................CONT’D
18 / 83
Introduction to Statistics
CHAPTER TWO
19 / 83
Introduction to Statistics
22 / 83
Introduction to Statistics
CONT”D
Each category of the variable represents a single class and the number of times
each category repeats represents the frequency of that class (category).
23 / 83
Introduction to Statistics
24 / 83
Introduction to Statistics
25 / 83
Introduction to Statistics
Basic Terms
Class Limits: the lowest and highest values that can be
included in a class are called class limits. The lowest values are
called lower class limits and the highest values are called upper
class limits.
26 / 83
Introduction to Statistics
29 / 83
Introduction to Statistics
30 / 83
Introduction to Statistics
SOLUTION
32 / 83
Introduction to Statistics
33 / 83
Introduction to Statistics
Data Presentation
34 / 83
Introduction to Statistics
35 / 83
Introduction to Statistics
Bar chart.......................CONT”D
Component Bar Chart: is used when there is a desire to show a
total or aggregate is divided into its component parts. The bars
represent total value of a variable with each total broken into its
component parts and different colors are used for identification.
36 / 83
Introduction to Statistics
37 / 83
Introduction to Statistics
Pie Chart
Pie chart is popularly used in practice to show percentage break down
of data.
It is a circle representing a set of data by dividing the circle into
sectors proportional to the number of items in the categories or it is a
circle representing the total, cut into slices in proportional to the size
of the parts that make up the total.
38 / 83
Introduction to Statistics
Histogram
39 / 83
Introduction to Statistics
Histogram example................cont”d
67 67 64 64 74 61 68 71 69 61 65 64 62 63 59 70 66 66 63 59 64 67
70 65 66 66 56 65 67 69 64 67 68 67 67 65 74 64 62 68 65 65 65 66
67
40 / 83
Introduction to Statistics
41 / 83
Introduction to Statistics
Cont”D
42 / 83
Introduction to Statistics
SUMMARY
In this chapter, there are two sources of data (primary and
secondary). Primary data is the one which is collected by the
investigator himself for the purpose of a specific inquiry or study
while secondary data is the data which has already been
collected by others.
The most common methods of data collection for survey, called
questionnaire,interview (telephone and personnal)
The most convenient way of organizing numerical data is to
construct a frequency distribution which is the organization of
raw data in table form, using classes and frequencies. There are
three types of frequency distributions; categorical, ungrouped
and grouped frequency distribution.
The most common diagrammatic and graphical data
presentations were used.
43 / 83
Introduction to Statistics
CHAPTER SIX
Statistical Inference
The term statistical inference deals with the collection of data
on a relatively small number of cases so as to form conclusion
about the general population from which the sample was taken
CONT’D
1 A summary measure that describes any given characteristic of
the population is known as Parameter. Eg: population mean
(µ), population variance (δ 2 ), population standard deviation (δ),
population proportion (P), population moments are parameters.
2 The summary measure that describes the characteristic of the
sample is known as Statistic.Eg: sample mean , sample variance
, sample standard deviation (S), sample proportion (p), sample
moments(r) are Statistics.
3 Statistical inference generally takes one of the two forms,
namely, estimation of the population parameter and
testing of hypothesis.
4 Estimation:-It is the process of calculating the population
parameters based on the corresponding characteristics from a
random sample estimate. Because it is not possible to make
45 / 83
Introduction to Statistics
CONTINUED
Consistency: It refers to the effect of sample size on the
accuracy of the estimator. A statistic is said to be consistent
estimator of the population parameter if it approaches the
parameter as the sample size increases.
Efficiency: An estimator is considered to be efficient if its value
remains stable from sample to sample. The best estimator would
be the one which would have the least variance from sample to
sample. From the three point estimators of central tendency,
namely the mean, median and mode, the mean is considered the
least variant and hence a better estimator.
Sufficiency: An estimator is said to be sufficient if it uses all
the information about the population parameter contained in the
sample.
47 / 83
Introduction to Statistics
48 / 83
Introduction to Statistics
Interval estimation
continued
continued
51 / 83
Introduction to Statistics
For Example
52 / 83
Introduction to Statistics
53 / 83
Introduction to Statistics
continued
54 / 83
Introduction to Statistics
continued
If the population standard deviation is unknown and the sample size
is small (< 30) ,then the confidence interval for the sample is
55 / 83
Introduction to Statistics
56 / 83
Introduction to Statistics
SOLUTION
57 / 83
Introduction to Statistics
58 / 83
Introduction to Statistics
SOLUTION
59 / 83
Introduction to Statistics
CONTINUED
62 / 83
Introduction to Statistics
63 / 83
Introduction to Statistics
CONTINUED
Case 1 :if the population standard deviation (δ) is known and n
is large(n >30), then the appropriate test of statistic:
64 / 83
Introduction to Statistics
65 / 83
Introduction to Statistics
For Example
1. The mean life of a sample of 200 tires taken from the lot is found
to be 40,000kms. Past experience shows that the standard deviation
for life of tires in the lot is 3200kms.
A. Construct a 95% confidence interval for the mean life of tire in
the lot is expected to lie?
B. Is it reasonable to suppose the mean life of tires in the lot as
41,000kms? (At 5% level of significance
66 / 83
Introduction to Statistics
SOLUTION
67 / 83
Introduction to Statistics
68 / 83
Introduction to Statistics
Another Example
69 / 83
Introduction to Statistics
SOLUTION 2A
70 / 83
Introduction to Statistics
SOLUTION 2B
71 / 83
Introduction to Statistics
HOME WORK
1. A research reports that the average salary of the sociologist is
more than $42000. A sample of 30 sociologists has a mean salary of
$43260. Test the reports claim. Assume the population standard
deviation is $5230.
2. A national magazine claims that the average college students
watches less television than the general public. The national average
is 29.4 hours per week, with a standard deviation 2 hours. A random
sample of 25 college students has a mean of 27 hours. Test the
claim. Assume normality.
3. A merchant believes that the average age of customers who
purchase a certain brand of wears is 13 years of age. A random
sample of 35 customers had an average age of 15.6 years. At
α=0.01, should this conjecture be rejected. The standard deviation
of the population is 1 year.
72 / 83
Introduction to Statistics
CHAPTER SEVEN
Continued
74 / 83
Introduction to Statistics
continued
75 / 83
Introduction to Statistics
76 / 83
Introduction to Statistics
continued
Regression Equation: is a mathematical equation that defines
the relationship between two variables. Regression of y on x is
given by
Continued
78 / 83
Introduction to Statistics
Coefficient of Determination
79 / 83
Introduction to Statistics
continued
80 / 83
Introduction to Statistics
SOLUTION For a
estimated parameters.
81 / 83
Introduction to Statistics
SOLUTION For b
82 / 83
Introduction to Statistics
83 / 83