
statistics and probability

CHAPTER 1

Introduction
Chapter Contents
1.1. Origin of statistics
1.2. Definition of statistics
1.3. Basic terms
1.4. Stages in inquiry
1.5. Uses of statistics
1.6. Level of measurement
1.7. Review questions

This chapter introduces the subject of statistics. It covers the definition
and classification of statistics, along with some statistical terms used
throughout the course. In addition, the uses, applications and limitations of
the subject are indicated. Finally, the stages of a statistical investigation,
the types of variables and the scales of measurement for variables are
presented. A brief explanation of each topic is given so that the concepts are
easy to master. This chapter is therefore a prerequisite for the chapters that
we shall study later.
After completing this chapter of part I, successful students will be able to
answer the questions listed under the learning outcomes.

Chapter objectives
• To define statistics
• To describe basic terms
• To highlight application areas
• To measure variables

1.1 Origin and History of Statistics
The word statistics is derived from the Latin word ’status’, meaning state,
which indicates the historical importance of governmental data gathering. In
political science the interest was the numerical description of political units
such as provinces, states, cities and towns, where the main concern was
the collection of information on revenue (tax collection), population,
potential manpower for military service (military recruitment), areas of land
under cultivation, births and deaths. This was the main task of statistics in
ancient times.

Statistics, like many other sciences, is a developing discipline. It has
developed gradually during the last few centuries, and at different times it
has been defined in different ways.

Learning outcomes
1. Define data
2. Define statistics
3. Mention and define the two branches of statistics
4. Write commonly used terms in statistics
5. Compare and contrast:
(a) Experiment and survey
(b) Sample and population
(c) Statistic and parameter
6. Tell some uses of statistics

Some earlier definitions of statistics


An earlier definition means a definition given at the time when the subject
was born and first put to use. Defining a subject has always been a
difficult task, because the subject may develop gradually, and a good
definition of today may be discarded in the future.

- The kings and rulers of ancient times were interested in their manpower. They conducted
censuses to get information about their populations and used this information to assess
their strength and capacity for war. In those days statistics was defined as

“The science of kings, political and science of statecraft”

- A.L. Bowley defined statistics as

“Statistics is the science of counting.”

This definition places the entire stress on counting. The common man also tends to think that
statistics is nothing but counting. That was the situation a very long time ago. Statistics today is
not the mere counting of people, animals, trees or fighting forces.
It has grown into a rich set of methods for data analysis and interpretation.
- A.L. Bowley also defined statistics as

“Science of averages.”

This definition is very simple, but it covers only part of statistics. Averages are simple and
very important in statistics. Experts are interested in average death rates, average birth rates, average
increase in population, average increase in per capita income, average increase in standard of
living and cost of living, average development rate, average inflation rate, average production of
rice per acre, average literacy rate and many other averages from different fields of practical life. But
statistics is not limited to averages. There are many other statistical tools, such as measures of
variation, measures of correlation and measures of independence. Thus this definition is weak and
incomplete and has been buried in the past.
- Prof. Boddington defined statistics as:

“Science of estimates and probabilities.”

This definition covers a major part of statistics and is close to modern statistics. But it is not
complete, because it stresses only probability. There are some areas of statistics in which probability
is not used.

1.2 Modern definition of statistics


Currently the word “statistics” has two definitions.

Definition 1.1. [In the plural sense] Statistics is an aggregate or
collection of numerical or quantitative expressions of facts that
express characteristics of interest.

Example 1.1. Statistics on industrial production, statistics on the
population growth of a country in different years, etc.

Definition 1.2. [As a branch of scientific method / in the singular sense]
Statistics is the science, or body of principles and methods, used
in the collection, presentation, analysis and interpretation of numerical
data in any field of inquiry, and in drawing conclusions based on the
data. These methods are used to draw conclusions about population
parameters.
In other words, statistics as a subject is the study of making sense of
data.

Example 1.2. Suppose we want to study the distribution of
weights of students in a certain college. First, we collect the
information on the weights, which may be obtained from the records
of the college or directly from the students. A large
number of weight figures will confuse the mind. In this situation we
may arrange the weights in groups such as “50 kg to 60 kg”, “60 kg
to 70 kg” and so on, and find the number of students that fall in each group.
This step is called presentation of data. We may go further and
compute averages and other measures that give a more
complete description of the original data.
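
The grouping and averaging steps of Example 1.2 can be sketched in a few lines of Python. This is only an illustration: the twelve weights and the helper name `class_label` are assumptions made for the sketch, and each class is taken to include its lower limit but not its upper limit, a convention the example does not specify.

```python
from collections import Counter
from statistics import mean

# Hypothetical weights (kg) for 12 students; not data from these notes.
weights = [52.5, 58.0, 61.2, 64.8, 67.5, 70.1,
           55.3, 73.4, 66.0, 59.9, 62.7, 78.2]


def class_label(weight, width=10):
    """Return the class containing `weight`, e.g. 61.2 -> '60-70'.

    Assumption: the lower limit is included and the upper limit is not.
    """
    lower = int(weight // width) * width
    return f"{lower}-{lower + width}"


# Presentation: the number of students that fall in each class.
frequency = Counter(class_label(w) for w in weights)
for label in sorted(frequency):
    print(label, "kg:", frequency[label], "students")

# Analysis: one summary measure that describes the whole data set.
average_weight = mean(weights)
print("average weight:", round(average_weight, 2), "kg")
```

The frequency table plays the role of the presentation step, and the mean is one of the "other measures" that summarize the raw figures.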
As noted in both Definition 1.1 and Definition 1.2 above, data is the main ingredient
of statistics. At this point you may ask what data is. Before looking at the
classification of statistics, it is better to first define the term data.

Definition 1.3. (Data): Data is the information we gather with
experiments and surveys; that is, the collection of raw numerical facts or
factual information (measurements, observations or values) recorded
for each element and used as a basis for reasoning, discussion or
calculation.

Example 1.3. Data include records on weight, height, breaking
strength of wire, age, marital status, income, yield, etc.
Example 1.4. The set of monthly incomes collected from 25 teachers.

Classification of statistics
Broadly speaking, applied statistics can be divided into two areas based on
how the data are used, namely descriptive and inferential statistics.

Definition 1.4. (Descriptive statistics): the part of statistics that
consists of methods or procedures for collecting, organizing, presenting,
describing and summarizing sample data in a meaningful form,
using statistical tools such as tables, charts, graphs and
summary measures.

Lecture notes (set by: Tesfaye )



Note 1.1. In descriptive statistics one tries to describe the situation as it
is.
Example 1.5. 1) The average number of students in a class at WSU is 56 in
the year 2002.

2) The average age of students at college M is 20.1 years.

Definition 1.5. (Inferential or inductive statistics): refers to
techniques or methods of making decisions or predictions about a
population based on data obtained from a sample drawn from that population.
It consists of performing hypothesis tests, determining relationships
among variables and making predictions. It also uses the concept of
probability.

Note 1.2. Here one tries to make inferences from a sample to a population.
For instance, suppose we want to have an idea about the percentage of
illiterates in our country. We take a sample from the population and find
the proportion of illiterates in the sample. This sample proportion, with the
help of probability, enables us to make some inferences about the population
proportion. This study belongs to inferential statistics.
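
The idea in Note 1.2 can be sketched with a small simulation. Everything numerical here is an assumption made for illustration: the population of 100,000 people, the 30% illiteracy rate and the sample size of 1,000 are invented, and the approximate 95% interval uses the normal approximation, a technique that is not developed in this chapter.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: True marks an illiterate person.
# The 30% rate is an assumption made only for this illustration.
population = [i < 30_000 for i in range(100_000)]
population_proportion = sum(population) / len(population)  # parameter

# Draw a simple random sample without replacement.
n = 1_000
sample = random.sample(population, n)
sample_proportion = sum(sample) / n  # statistic

# Approximate 95% interval for the population proportion
# (normal approximation; an assumption beyond this chapter).
standard_error = math.sqrt(sample_proportion * (1 - sample_proportion) / n)
lower = sample_proportion - 1.96 * standard_error
upper = sample_proportion + 1.96 * standard_error

print(f"sample proportion: {sample_proportion:.3f}")
print(f"approximate 95% interval: ({lower:.3f}, {upper:.3f})")
```

In a real study the population proportion is unknown; it is computed here only so that the sample-based estimate can be compared with it.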

Example 1.1. 1) A recent study showed that eating garlic can lower blood
pressure. 2) The chance that a person will be robbed in a certain city is 15%.
3) There is a relationship between smoking tobacco and an increased risk
of developing cancer.

Activity 1.1. Test yourself


1. Write the difference between descriptive and inferential statistics.

1.3 Some basic statistical terminologies


Basic terms
• Population
• Sample
• Variable
• Constant
• Sample size
• Experiment
• Survey
• Parameter
• Statistic

In this section of chapter 1, we introduce some terms commonly used when
dealing with statistics.

Definition 1.6. Statistical population: the collection of all
elements (individuals, objects or items) whose characteristics are being
studied.

Example 1.2. The total population of a country or village, the total number of
plants in a field, all university students of Ethiopia, all staff members of
Wolaita Sodo University.

Definition 1.7. Sample: a subgroup, part or portion of the
population selected by some method in order to estimate the characteristics
of the population. Hence a sample should be representative of the
population.
Figure 1 illustrates the selection of a sample from the population.

Definition 1.8. Variable: a characteristic of interest
about each individual element of a population or sample; that is, a
characteristic or attribute that can assume different values. In other words,
a quantity that can vary from one individual or object to another is
called a variable. It is usually denoted by one of the last letters of the
alphabet.

Example 1.3. Heights and weights of students, income, temperature,
number of children in a family, age, marks, etc.

Definition 1.9. Constant: a quantity that can assume only one
value. It is usually denoted by one of the first letters of the alphabet: a, b, c, ...

Example 1.4. The value of π ≈ 3.14 (often approximated by 22/7) and the value of e ≈ 2.718.

Definition 1.10. Sampling: the process or method of selecting a sample
from the population; specifically, the act, process or technique of selecting
a representative part of a population for the purpose of determining
parameters or characteristics of the whole population.

Definition 1.11. Sample size: the number of elements or observations
to be included in the sample. It is usually denoted by n.

Definition 1.12. Experiment: a planned activity whose results yield
a set of data. This includes both the activities for selecting the elements
and those for obtaining the data values.

Definition 1.13. Parameter: a numerical value that describes a
characteristic of the population (calculated from the population). Population
characteristics may include the population mean, variance, range, etc.

Definition 1.14. Statistic: a numerical value calculated from sample
data. It describes a characteristic of the sample. Sample characteristics may
include the sample mean, sample variance, range, etc.

Note 1.3. For every parameter, there is a corresponding sample statistic.
The statistic describes the sample in the same way the parameter describes
the population. The parameter is fixed, but the statistic is not: it varies
from sample to sample.
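
A short simulation may help to show the contrast drawn in Note 1.3. The population of student ages below is invented for the sketch, as are the sample size and the number of samples; only the distinction between the fixed parameter and the varying statistic comes from the note.

```python
import random
from statistics import mean

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical population: ages of 5,000 students (illustrative values).
population = [random.randint(18, 30) for _ in range(5_000)]
parameter = mean(population)  # population mean: a single fixed value

# Three different samples of size n = 50 from the same population.
n = 50
sample_means = [mean(random.sample(population, n)) for _ in range(3)]

print("population mean (parameter):", round(parameter, 2))
for k, value in enumerate(sample_means, start=1):
    print(f"sample {k} mean (statistic):", round(value, 2))
```

Each run of `random.sample` selects different students, so the sample mean generally changes from sample to sample while the population mean stays the same.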

1.4 Stages in statistical investigation


There are five stages in any statistical investigation, followed by an optional sixth:
• Collection of data
• Organization of data
• Presentation of data
• Analysis of data
• Inference from data
• Recommendation (optional)

1.5 Application, use, and limitations of statistics


The application or use of statistics is almost unlimited: it is used in
almost all fields of human activity. Statistical concepts and methods are
widely applied for gathering data in various fields (such as agriculture,
biology, business, economics, education, psychology, engineering, medicine,
sociology and computer science) and for drawing valid conclusions.

1.5.1 Application areas of statistics


Statistics can be applied in the following areas.
• In science, In planning, In industry, In auditing,
• In public administration, In areas of research.

Note 1.4. To sum up, statistics can be applied in any field of study that
needs quantitative evidence.

Application areas
• In government
• In economics
• In business
• In education
• In audit
• In sociology
• In planning
• In engineering
• In banking
• In agriculture
• In health

Example 1.5. Some areas
(a) In government: In state affairs, statistics is useful in the following
ways.
- To collect information and study the economic condition of
people in the states.
- To assess the resources available in the states.
- To help the state decide whether to accept or reject a policy
based on statistics.
- To provide information and analysis on various factors of the state,
such as wealth, crimes, agriculture experts, education, etc.
(b) In economics: In economics, statistics is useful in the following ways:
- Helps in the formulation of economic laws and policies.

- Helps in studying economic problems.


- Helps in compiling the national income accounts.
- Helps in economic planning.
(c) Statistics and business
- Helps to take decisions on location and size.
- Helps to study demand and supply.
- Helps in forecasting and planning.
- Helps in controlling the quality of the product or process.
- Helps in making marketing decisions.
- Helps in production planning and inventory management.
- Helps in business risk analysis.
- Helps in estimating long-term resource requirements and
consumers' preferences, and helps in business research.
(d) Education
- Statistics is necessary to formulate policies on starting
new courses and on the facilities available for proposed
courses.
- To describe test results.
(e) Accounts and audits
- Helps to study the correlation between profits and dividends,
which makes it possible to know the trend of future profits.
- In auditing, sampling techniques are followed.
(f) In sociology: Sociology is one of the social sciences. It aims to discover
the basic structure of human society, to identify the main forces that
hold groups together or weaken them, and to learn the conditions
that transform social life. The sociologist may be called upon for
help with a special problem such as social conflict, urban plight or
the war on poverty or crime. The sociologist's practical contribution lies in
the ability to clarify the underlying nature of social problems, to estimate
their dimensions more exactly, and to identify aspects that seem most
amenable to remedy with the knowledge and skills at hand.
- Statistics collects materials or data for sociological research studies.
- Sociologists seek the help of statistical tools to study cultural
change in society, family patterns, prostitution, crime, the
marriage system, etc.
- They also study statistically the relation between prostitution
and poverty, crime and poverty, drunkenness and crime, illiteracy
and crime, etc.
(g) In planning: The modern age is an age of planning, and statistics are
indispensable for planning.
- Proper planning can be made only on the basis of a correct assessment
of the various resources (human and material) of the country.
- A study of data relating to population, agriculture, industry,
prices, employment, health and education enables the planners to fix
time-bound targets on the social and economic fronts. Such economic
and social programs are also evaluated at different stages, by
means of related data gathered continuously and systematically,
to decide whether the programs are on track towards the
goals or targets set.
(h) Statistics and engineering:
- To compare the breaking strength of two types of materials.
- To determine the reliability of a product.
- To control the quality of products in a given production process.
- To compare the improvement of a field due to certain additives
(fertilizers, herbicides, etc.).
(i) In banking
- Statistics plays an important role in banking. Banks make
use of statistics for a number of purposes. Banks work on
the principle that not all the people who deposit their money
withdraw it at the same time. The bank earns
profits out of these deposits by lending to others at interest.
Bankers use statistical approaches based on probability to
estimate the number of depositors and their claims for a certain
day.

1.5.2 Uses or functions of statistics


The main function of statistics is to enlarge our knowledge of complex
phenomena. Some important functions or uses of statistics are:
1. It presents facts in a definite and precise form.
2. It simplifies masses of complex data (data reduction).
3. It helps in measuring the magnitude of variation in data.
4. It furnishes techniques of comparison.
5. It helps in estimating unknown population parameters (characteristics).
6. It helps in formulating and testing hypotheses.
7. It helps in studying the relationship between two or more variables.
8. It helps in predicting future trends.
9. It helps government to take decisions.
10. It helps to formulate policies.


1.5.3 Limitations of statistics


Although statistical methods have many uses, there are also many
potential errors and limitations in carrying out and interpreting statistical
studies.
(1) It deals only with those subjects of inquiry that can be
quantitatively measured and numerically expressed. That means it
studies quantitative characteristics directly and qualitative
characteristics only indirectly.
Example 1.6. Beauty, honesty, poverty, and standard of living.
(2) It deals with aggregates of facts, and no importance is attached to
individual items. It is suitable only when the properties or characteristics
of a group are to be studied.
(3) Statistical data are only approximately, not mathematically,
correct. That means statistical results are true only on average.
Example 1.7. The probability of getting a head in tossing a coin is
1/2.
The germination percentage of a given variety of seed is 80%.
(4) It is sensitive to misuse; i.e., an expert is needed to apply
statistical methods.
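
Limitation (3) above, that statistical results hold only on average, can be illustrated with a simulated coin. This is a sketch under stated assumptions: the coin is simulated with Python's pseudo-random generator, and the function name `proportion_of_heads` and the numbers of tosses are chosen only for illustration.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable


def proportion_of_heads(tosses):
    """Simulate `tosses` fair coin tosses; return the share of heads."""
    heads = sum(random.random() < 0.5 for _ in range(tosses))
    return heads / tosses


results = {n: proportion_of_heads(n) for n in (10, 100, 10_000)}
for n, share in results.items():
    print(f"{n:>6} tosses: proportion of heads = {share:.3f}")
```

A short run can differ noticeably from 1/2, while a long run tends to lie close to it; no single toss is described by the figure 1/2.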

Examples that show misuse of statistics


• Errors of context: facts, otherwise true, may be presented or quoted
out of context in such a way as to misrepresent the real state of affairs.
Politicians, journalists and others may sometimes do so.
• Errors of generalization: a generalization or conclusion based on
incomplete, inadequate or unrepresentative data can lead to wrong
conclusions.
Example 1.8. On the basis of poor marks obtained by two or three
students we cannot conclude that all the students from a college are
equally bad.
• Errors of deduction: a general result may be wrongly applied to a
specific case.
Example 1.9.
(i) If the students of a particular class have been showing good results
every year in the past, it does not mean they will necessarily do so this
year too.
(ii) The number of car accidents in a city in a particular year involving
women drivers is 10, while the number involving men drivers is 40. It does
not follow that women drivers are safer drivers, since the number of
drivers in each group is not taken into account.
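
The flaw in Example 1.9 (ii) becomes visible once the accident counts are set against the number of drivers in each group. The counts 10 and 40 come from the example; the numbers of drivers below are hypothetical and were chosen only to show that the comparison can reverse.

```python
# Accident counts from Example 1.9 (ii).
accidents = {"women": 10, "men": 40}

# Hypothetical numbers of drivers; the notes do not give these figures.
drivers = {"women": 1_000, "men": 8_000}

# Accidents per driver in each group.
rates = {group: accidents[group] / drivers[group] for group in accidents}

for group, rate in rates.items():
    print(f"{group}: {accidents[group]} accidents, "
          f"{rate * 1_000:.1f} per 1,000 drivers")
```

With these assumed figures the group with fewer accidents has the higher accident rate, so raw counts alone cannot support a conclusion about safety.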

Note 1.5. There is nothing wrong with statistical tools. The fault lies
with the user of the science and not with the science.

1.5.4 Characteristics of statistical data


Statistical data refers to numerical descriptions (counts or measurements) of
things. For numerical descriptions to be called statistics, they
must possess the following characteristics. They must be
• In aggregates and expressed in numbers. This means that statistics
are “a number of facts”. A single fact, even though stated numerically,
cannot be called statistics.
• Affected to a marked extent by a multiplicity of causes. This means
that statistics are aggregates of such facts only as grow out of a variety
of circumstances. It is difficult to assess the individual contribution
of each cause.
• Estimated or enumerated according to a reasonable standard of
accuracy.
• Collected in a systematic manner for a predetermined purpose.
• Placed in relation to each other. That means they must be comparable
with each other in terms of time, space or condition.

Note 1.6. Even though statistical data always denote figures or numerical
descriptions, it must be remembered that not all numerical descriptions are
statistical data.

1.6 Types of variables and measurement scales


Types of variables

The definition of a variable is given in section 1.3; let us restate it here.
A variable is any characteristic recorded for the subjects in a study. The
term variable highlights that data values vary. The data values that we
observe for a variable are referred to as the observations. There are
basically two kinds of variables, based on the values they assume.
• Qualitative variable: a variable that results in qualitative
information, and
• Quantitative variable: a variable that results in quantitative
information.

Definition 1.15. Qualitative / attribute / categorical variable: a variable
is called categorical if each observation belongs to one of a set of
categories. Arithmetic operations and averaging are not meaningful for
data resulting from a qualitative variable. A key feature to describe is
the relative number of observations (%) in the various categories.


Example 1.10. Gender (with categories male and female), belief in life
after death (yes, no), eye color (brown, black, ...)

Definition 1.16. Quantitative / numerical variable: a variable is called
quantitative if observations on it take numerical values that represent
different magnitudes of the variable; that is, it assumes values of a
measurable quantity. Key features to describe are the
center and the spread (variability) of the data. Observations obtained through
such a process are called quantitative data.

Quantitative variables can be further classified into two: discrete and
continuous variables.

Definition 1.17. Discrete variable: a quantitative variable that can
assume a countable number of values. That means the outcomes of the
variable are counts such as 0, 1, 2, .... There is a gap between any two values.
Any variable phrased as “the number of ...” is discrete.

Definition 1.18. Continuous variable: a quantitative variable that
can assume any measurable value within a specific range. In short, a
quantitative variable is continuous if its possible values form an
interval.

Example 1.11. Weight, length, time


Note 1.7. To determine whether a variable is discrete or continuous, look at the
variable and think about the values that might occur. Do not look at the data values that have been
recorded; they can be misleading. Why do we care whether a variable is qualitative or
quantitative, or whether a quantitative variable is discrete
or continuous? In practice, the method used to analyze data depends on the type of variable
the data represent.
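
The point of Note 1.7, that the analysis depends on the type of variable, can be sketched as follows. The six observations are invented for illustration. The qualitative variable is summarized by the percentage in each category, and the quantitative variable by its center and spread, here taken to be the mean and the sample standard deviation (one possible choice among several).

```python
from collections import Counter
from statistics import mean, stdev

# Hypothetical observations for six students (illustrative values only).
eye_color = ["brown", "black", "brown", "brown", "black", "blue"]  # qualitative
height_cm = [162.0, 175.5, 168.2, 181.0, 158.4, 170.9]            # quantitative

# Qualitative variable: relative number (%) of observations per category.
counts = Counter(eye_color)
percentages = {color: 100 * k / len(eye_color) for color, k in counts.items()}
print(percentages)

# Quantitative variable: center and spread of the data.
center = mean(height_cm)
spread = stdev(height_cm)
print(f"mean = {center:.1f} cm, standard deviation = {spread:.1f} cm")
```

Averaging the eye colors would be meaningless, whereas the mean height is a sensible summary, which is why the two variables are handled differently.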

Measurement scales for variables

Definition 1.19. Measurement is the process of assigning numbers to
objects according to a set of rules. The level or scale of measurement shows
the amount of information contained in a variable of interest.

There are 4 types of measurement scale for variables: the nominal
level, ordinal level, interval level and ratio level. Now it is time to study
the properties of each scale in detail.

Levels of measurement
1. Nominal
2. Ordinal
3. Interval
4. Ratio

(1) Nominal scale: nominal comes from the Latin word for “name”. It is a scale
for grouping individuals into different categories or names. There is no
order, only a qualitative difference among categories. The operations
+, -, *, / cannot be applied to data measured at this level. Comparisons
of order or magnitude are also impossible.

Example 1.12.
• Gender (with categories male and female)
• Belief in life after death (yes, no)
• Religion (Christianity, Islam, Hinduism, ...)
• Eye color (brown, black, ...)
• Blood type (A, B, AB, O)
(2) Ordinal scale: ordinal comes from the Latin word for “order”. It is a
scale for grouping individuals into different categories and ordering
them. The intervals or gaps between the categories are not
necessarily equal. Comparison of order is possible, but not of quantity.

Example 1.13.
• Letter grades (A, B, C, D, F)
• Educational level (diploma, degree, master's, PhD, ...)
• Socio-economic status (low, medium, high)
• Rating (Strongly Agree, Somewhat Agree, Undecided, Somewhat
Disagree, Strongly Disagree)
(3) Interval scale: has the properties of the first two scales, and in addition
specifies the amount of distance; i.e., the intervals between values are the same.
There is no true zero.

Example 1.14.
• Temperature (°C)
• IQ score (95, 110, 125)
• Calendar time (day, week, month)
(4) Ratio scale: has all the properties of the other scales, plus a true zero.

Example 1.15.
• Age is ratio data, because someone who is 40 years old is twice as
old as someone who is 20.
• Monthly income (Birr)
• Height (cm)

Note 1.8. Qualitative variables are measured at the nominal or ordinal level,
whereas quantitative variables, discrete or continuous, are measured at the
interval or ratio level.
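
The difference between the interval and ratio levels can be checked numerically. In the sketch below the helper names are hypothetical; the Celsius to Fahrenheit conversion is the standard one. A ratio of two temperatures changes when the unit changes, because 0 °C is not a true zero, while a ratio of two ages is the same in years and in months.

```python
def celsius_to_fahrenheit(celsius):
    """Standard conversion from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def years_to_months(years):
    return years * 12


# Interval scale: the ratio depends on the unit, so "40 degrees is twice
# as hot as 20 degrees" is not a meaningful statement.
ratio_celsius = 40 / 20
ratio_fahrenheit = celsius_to_fahrenheit(40) / celsius_to_fahrenheit(20)
print("temperature ratios:", ratio_celsius, round(ratio_fahrenheit, 3))

# Ratio scale: the ratio is the same in any unit, so "40 years is twice
# as old as 20 years" is meaningful.
ratio_years = 40 / 20
ratio_months = years_to_months(40) / years_to_months(20)
print("age ratios:", ratio_years, ratio_months)
```

Differences remain meaningful on both scales; only ratios require a true zero.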

1.7 Review Exercises


I Write true if the statement is correct and false if the statement is incorrect.
1. Data on eye color are an example of nominal level data.
2. Statistics deals with collections or aggregates of facts.
3. The highest level of measurement for variables is the ordinal level.
4. Data measured on a ratio scale convey more information than data measured on an interval scale.
5. A.L. Bowley defined statistics as the “science of averages.”

II Fill in the blank spaces


1. The number of absences per year a worker has is an example of what type of data? __________
2. The three types of frequency distributions are __________, __________
and __________.
3. In a frequency distribution the number of classes should be between ______ and ______.
4. Indicate which of the following variables are quantitative and which are qualitative.

a) Number of persons in a family
b) Colors of cars
c) Marital status of people
d) Time to commute from home to varsity
e) Number of errors in a person’s credit report
f) Number of typographical errors in newspapers
g) Monthly TV cable bills
h) Spring break locations favored by college students
i) Number of cars owned by families
j) Lottery revenues of states
5. Fill in the following table (by writing Yes or No).

Table 1.1. Summary of characteristics of the 4 levels of data measurement scales

Level      Categorized   Ranked   Differences measured   True zero   Example
Nominal    ......        ......   ......                 ......      ......
Ordinal    ......        ......   ......                 ......      ......
Interval   ......        ......   ......                 ......      ......
Ratio      ......        ......   ......                 ......      ......

III Essay: read and search different materials, and then write down your ideas.
1. Define data, population, sample and variable in a statistical context, providing appropriate
examples.
2. What are the applications of statistics in your field of study?
3. Write the limitations of statistics with examples.
4. Why study statistics? Write a paragraph on it.
5. Suppose that your friend tells you that building height is a qualitative variable. Is she
correct? Why?
6. What have you learnt from chapter 1?
7. Write comments on what you observed in the learning and teaching of chapter 1. This
will be an input for the coming chapter.
