
INTRODUCTION TO STATISTICS & PROBABILITY

UNIT ONE: INTRODUCTION


1.1 Definition and Classification of Statistics

Definition of Statistics
Statistics can be defined in two senses:
a) In the plural sense: statistics are the raw data
themselves, such as statistics of births, statistics of deaths,
statistics of students, statistics of imports and
exports, etc.
b) In the singular sense: statistics is the subject that
deals with the collection, organization, presentation,
analysis and interpretation of numerical data.
Classification of Statistics

Statistics may be divided into two main branches:
(1) Descriptive Statistics (2) Inferential Statistics
• Descriptive statistics: includes statistical
methods involving the collection, presentation,
and characterization of a set of data in order to
describe the various features of the data.
Descriptive statistics do not, however, allow us to
make conclusions beyond the data we have
analyzed. They are simply a way to describe data.
• Inferential statistics: consists of methods used to
generalize from a sample to a population.
Example 1. The average income of all families
(the population) in Ethiopia can be estimated
from figures obtained from a few hundred
families (the sample).
Example 2. A biologist collected blood samples from 10
students in the biology department to study blood types.
The following data were obtained:
O O AB A A O O B A O
• A summary measure, such as the proportion of students
with blood type O in the sample (5 out of 10, or 50%), is an
example of descriptive statistics.
• However, if he/she wants information on the
proportion of students with blood type O in the entire
class, he/she may use the sample proportion (50%) as
an estimate of the corresponding value for the entire
class. This is an example of inferential statistics.
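The two steps in Example 2 can be sketched in a few lines of Python. This is only an illustration: the variable names are hypothetical, and treating the sample proportion as the estimate for the class assumes the 10 students are representative of it.

```python
from collections import Counter

# Blood types of the 10 sampled students, as listed in Example 2.
sample = ["O", "O", "AB", "A", "A", "O", "O", "B", "A", "O"]

# Descriptive step: summarise the sample itself.
counts = Counter(sample)
proportion_o = counts["O"] / len(sample)

# Inferential step: reuse the sample proportion as a point estimate
# for the whole class (assumes the sample is representative).
estimated_class_proportion_o = proportion_o

print(dict(counts))
print(proportion_o)
```

The descriptive result describes only the 10 students; the inferential statement is a claim about students who were never tested.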
1.2 Stages in statistical investigation

A statistical study might involve the following stages:
Stage 1: Data collection: this stage involves
acquiring data related to the problem at hand.
Stage 2: Organizing and presenting data: this stage
involves classifying or sorting the collected
data based on some characteristics or attributes
such as age, sex, marital status, etc. Further, we
may use tables, graphs, charts and so on to present the
data.
Stage 3: Data analysis: a thorough analysis of the data is
necessary in order to reach conclusions or provide
answers to a problem. The analysis might require
simple or sophisticated statistical tools depending on
the type of answers that may have to be provided.
Stage 4: Interpretation of the result: logically a
statistical analysis has to be followed by conclusions in
order to be able to make a decision. The technical
terminology used to describe this last process of a
statistical study is referred to as interpretation.
1.3 Definition of some terms

• Population: consists of all elements, individuals,
items or objects whose characteristics are being studied.
The population that is being studied is called the target
population.
• Sample: A portion of the population selected for study.
• Sample survey: The technique of collecting
information from a portion of the population.
• Census survey: A survey that includes every member
of the population.
• Variable: a characteristic under study that assumes
different values for different elements.
• Quantitative variable: a variable that can be measured
numerically. The data collected on a quantitative variable
are called quantitative data.
Examples include weight, height, number of students in a
class, number of car accidents, etc.
Quantitative variables can be classified into two types:
Discrete variable: a variable whose values are countable.
Examples include the number of patients in a hospital,
the number of white blood cells in a droplet of blood,
the number of female students in a class of a
certain school, etc.
Continuous variable: a variable that can assume any
numerical value over a certain interval or intervals.
Examples include weight of newborn babies, weight
of students, temperature measurements, etc.
• Qualitative variable: a variable that cannot assume a
numerical value but can be classified into two or more
non-numerical categories. The data collected on such a
variable are called qualitative or categorical data.
Examples include sex, blood type, marital status, religion,
etc.
• Parameter: a statistical measure obtained
from population data. Examples include
population mean, population standard
deviation, population variance and so on.
• Statistic: a statistical measure obtained from
sample data. Examples include sample mean,
sample standard deviation, sample variance
and so on.
• Sampling: the process or method of selecting
a sample from the population.
• Sample size: the number of elements or
observations to be included in the sample.
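The difference between a parameter and a statistic can be illustrated with a small sketch. The population below is hypothetical (weekly study hours of all 8 students in a small class), and the sample is picked by hand for illustration rather than by a proper sampling method.

```python
import statistics

# Hypothetical population: weekly study hours of all 8 students in a class.
population = [4, 6, 7, 9, 10, 12, 14, 18]

# Parameters: computed from every element of the population.
population_mean = statistics.mean(population)
population_sd = statistics.pstdev(population)   # population standard deviation

# Sample of size 4, chosen by hand purely for illustration.
sample = [6, 9, 12, 18]
sample_size = len(sample)

# Statistics: computed from the sample only.
sample_mean = statistics.mean(sample)
sample_sd = statistics.stdev(sample)            # sample standard deviation

print(population_mean, sample_mean)
```

The sample mean differs from the population mean; in practice only the statistic is available and it serves as an estimate of the parameter.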
Applications, Uses and Limitations of statistics.

• Applications of statistics:
Statistics has become a very important subject area, and its various tools are used to solve problems
in almost all fields of human activity, such as:
• everyday life,
• research,
• marketing,
• planning,
• production and quality control, and other areas.


Uses of statistics:

• The main function of statistics is to enlarge our knowledge of
complex phenomena.
The following are some uses of statistics:
1. Presenting facts in a definite and precise form.
2. Data reduction.
3. Measuring the magnitude of variations in data.
4. Furnishing a technique of comparison.
5. Estimating unknown population characteristics.
6. Formulating and testing hypotheses.
7. Studying the relationship between two or more variables.
8. Forecasting future events.
Limitations of statistics

As a science, statistics has its own limitations. The
following are some of the limitations:
• It deals only with quantitative information.
• It deals only with aggregates of facts and not with
individual data items.
• Statistical data are only approximately, not
mathematically, correct.
• Statistics can easily be misused and therefore
should be used by experts.
Scales of measurement

• Proper knowledge about the nature and type of
data to be dealt with is essential in order to
specify and apply the proper statistical method for
their analysis and inference.
A measurement scale refers to the properties of the
values assigned to the data, based on
order, distance and a fixed (true) zero.
Types of Measurement Scale
1. Nominal scale: it is the simplest measurement scale.
Values of a nominal scale are used purely to categorize
the quantity being measured, and hence there is no
natural ordering of the levels or values of the scale.
Example: sex of an individual may be male or female.
There is no natural ordering of the two sexes. Other
examples include religion, blood type, eye colour,
marital status, etc. The values of a nominal scale can be
coded using numerical values; however, we cannot
perform any meaningful mathematical operations on the
numbers used as codes.
2. Ordinal scale:
This measurement scale is similar to the nominal scale, but the levels
or categories can be ranked or ordered. That is, we can compare
levels or categories of the scale. Therefore, this scale of
measurement gives better information on the quantities being
measured than the nominal scale.
Example 1: living standard of a family can be poor, medium or high.
These categories can be ordered, as poor is less than medium and
medium is less than high. However, the distance or
magnitude between the levels, say between poor and medium, is
not clearly known.
Example 2: Rating scales (Excellent, Very good, Good, Fair, Poor).
Example 3: Military rank.
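Because ordinal categories can be ranked, a median category can be found even though distances between categories are unknown. The sketch below uses the rating scale from Example 2; the seven responses are hypothetical.

```python
# Rating categories from Example 2, listed from lowest to highest.
levels = ["Poor", "Fair", "Good", "Very good", "Excellent"]
rank = {level: position for position, level in enumerate(levels)}

# Hypothetical responses from seven people.
responses = ["Good", "Poor", "Excellent", "Fair", "Good", "Very good", "Fair"]

# Sort by rank, not alphabetically, then take the middle response.
ordered = sorted(responses, key=lambda response: rank[response])
median_response = ordered[len(ordered) // 2]   # valid for an odd number of responses

print(ordered)
print(median_response)
```

Note that the ranks are used only for ordering; averaging them would assume equal distances between categories, which an ordinal scale does not provide.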
3. Interval scale
This measurement scale shares the ordering or ranking and
labeling properties of the ordinal scale of measurement.
Besides, the distance or magnitude between two values is
clearly known (meaningful). However, it lacks a true
zero point (i.e., the zero point is not meaningful).
Example 1: temperature of an object in degrees centigrade. If
the temperature of an object is zero degrees centigrade, it
doesn’t mean that the object lacks heat. Hence zero is an
arbitrary point on the scale. It doesn’t make sense to say that
80 °C is twice as hot as 40 °C. We can do subtraction and
addition on interval-level data, but multiplication and
division (ratios) are not meaningful.
Example 2: IQ test scores.
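One way to see why ratios are not meaningful for temperature is to express the same two temperatures in a different unit. The sketch below uses the standard Celsius-to-Fahrenheit conversion; the function and variable names are illustrative.

```python
def c_to_f(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32

a_c, b_c = 40.0, 80.0
a_f, b_f = c_to_f(a_c), c_to_f(b_c)

# The ratio depends on the unit, so "twice as hot" has no fixed meaning.
ratio_c = b_c / a_c
ratio_f = b_f / a_f

# Differences stay consistent: they are only rescaled by the factor 9/5.
diff_c = b_c - a_c
diff_f = b_f - a_f

print(ratio_c, ratio_f)
print(diff_c, diff_f)
```

The ratio is 2 in Celsius but about 1.69 in Fahrenheit, while the difference of 40 Celsius degrees always corresponds to 72 Fahrenheit degrees.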
4. Ratio scale
It is the highest level of measurement scale. It shares the ordering,
labeling and meaningful distance properties of the interval scale. In
addition, it has a true or meaningful zero point. The existence of
a true zero makes the ratio of two measures meaningful.
Example 1: if your salary is 1000 birr and your wife’s is 2000 birr, we can
say that your wife earns twice as much as you. If you don’t have any source
of income, your income is zero on this scale, and that is a
meaningful assignment. Other examples include weight, height,
volume measurements, etc. We can do subtraction, addition,
multiplication and division on ratio-level data.
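By contrast with an interval scale, a ratio computed on a ratio scale does not change when the unit changes. The sketch below restates the salary example in santim, using 100 santim per birr; the variable names are illustrative.

```python
your_salary_birr = 1000
wife_salary_birr = 2000

# Ratio in the original unit.
ratio_birr = wife_salary_birr / your_salary_birr

# Same salaries expressed in santim (100 santim = 1 birr).
SANTIM_PER_BIRR = 100
your_salary_santim = your_salary_birr * SANTIM_PER_BIRR
wife_salary_santim = wife_salary_birr * SANTIM_PER_BIRR
ratio_santim = wife_salary_santim / your_salary_santim

print(ratio_birr, ratio_santim)
```

Both ratios equal 2 because a change of unit only multiplies every value by a constant and leaves zero at zero.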
• The most precise scale is the ratio scale and the least precise is
the nominal scale. Ratio and interval level data are classified
as quantitative variables, and nominal and ordinal level data are
classified as qualitative variables.
Nominal, Ordinal, Interval, Ratio Scales
The four data measurement scales

Offers                                          Nominal  Ordinal  Interval  Ratio
The sequence of variables is established        –        Yes      Yes       Yes
Mode                                            Yes      Yes      Yes       Yes
Median                                          –        Yes      Yes       Yes
Mean                                            –        –        Yes       Yes
Difference between variables can be evaluated   –        –        Yes       Yes
Addition and Subtraction of variables           –        –        Yes       Yes
Multiplication and Division of variables        –        –        –         Yes
Absolute zero                                   –        –        –         Yes
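The table can also be read as a lookup: each summary or operation becomes meaningful from a certain scale upward. The sketch below encodes a few rows that way; the dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Scales listed from lowest to highest level of measurement.
SCALE_ORDER = ["nominal", "ordinal", "interval", "ratio"]

# Lowest scale at which each summary or operation is meaningful,
# taken from the rows of the table above.
MINIMUM_SCALE = {
    "mode": "nominal",
    "median": "ordinal",
    "mean": "interval",
    "addition/subtraction": "interval",
    "multiplication/division": "ratio",
}

def is_meaningful(operation, scale):
    """Return True if the operation is meaningful for data on the given scale."""
    required = SCALE_ORDER.index(MINIMUM_SCALE[operation])
    return SCALE_ORDER.index(scale) >= required

print(is_meaningful("median", "nominal"))
print(is_meaningful("mean", "interval"))
```

For example, blood type (nominal) supports only the mode, while salary (ratio) supports every row of the table.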
Questions?
Thank you!
