
Chapter 1

Introduction to Statistics
Rohana Yusoff
Faculty of Computer and
Mathematical Sciences
(UiTM Terengganu)
TOPICS
1.1 Descriptive and Inferential
Statistics
1.2 Types of Data
1.3 Types of Variables
1.4 Sampling Techniques
1.5 Data Collection Method
Objectives
By the end of this topic, you should be able to:

• Understand why people study statistics
• Understand some statistical terms
• Distinguish between descriptive and inferential statistics
• Distinguish between qualitative variable and quantitative variable
• Distinguish among nominal, ordinal, interval and ratio scales
• Distinguish between primary data and secondary data
• Distinguish some of the sampling techniques: Random and Non-Random Sampling
• Understand various data collection methods
Introduction
“People often use the word statistics to refer to a group of data. For
example, they may say that they gathered statistics from their business operation.
What they are referring to is measured facts and figures. We often hear the use of the
word statistics to refer to counts of certain happenings, e.g. death statistics, statistics on
accidents.”

What is Statistics?
“Statistics is a science dealing with the collection, analysis, interpretation, and
presentation of numerical data”. (Webster’s Third New International Dictionary)
“Statistics is also a branch of mathematics.
In the field of statistics, the word statistics is used in at least two important ways.
First, statistics can be descriptive measures (e.g. mean, median, mode, standard
deviation, etc) computed from a sample, called sample statistics. Second, statistics
can be the distributions used in the analysis of data, e.g. a researcher using the t
distribution to analyse data might refer to the use of t statistic in analysing data.”
Terms and definitions
“Statistics, like many areas of study, has its own language. It is important
to begin our study with an introduction of some basic concepts in order to
understand and communicate about the subject.
Two types of statistics:
DESCRIPTIVE STATISTICS
INFERENTIAL STATISTICS
To understand the difference between descriptive and inferential statistics,
definitions of population and sample should be dealt with first.

Population – A population is a collection of persons, objects, or items of
interest. (Webster’s Third New International Dictionary)
A researcher defines the population to be whatever he or she
is studying.”
Examples of populations
• “All vehicles
This is a widely defined population because it includes all types of
vehicles anywhere in the world.
• All Daihatsu Charade cars produced from 2000 to 2010.
This is a more narrowly defined population compared to the above.
• All Proton Wira cars produced on 3rd March, 2012, by
Proton Company at the Behrang plant.
This is the best: a specific and clearly stated population.”
More terms and definitions
Census /Population survey
“A census is a survey that includes every member of the population.
For example, the Malaysian government conducts a ‘Banci Penduduk’ (population census) every
ten years.

The reason for doing a census is to eliminate the possibility that a randomly
selected sample might not represent the population.

Even when all the proper sampling techniques are implemented, a sample
that is not representative of the population can be selected by chance.”
More terms and definitions
Sample
“A portion of the population selected for study.
For various reasons researchers often prefer to work with a sample of the
population instead of the entire population.
For example, in conducting quality control experiments to determine the
average life of light bulbs, a light bulb manufacturer might randomly sample
only 75 light bulbs during a production run.
Sample survey – A survey conducted only on a sample.
Activity
Most of the time, surveys are conducted by using samples and not a census
of the population. Find out why by listing the advantages
and disadvantages of sampling.”
Types of statistics

• Descriptive Statistics
• Inferential Statistics
Descriptive Statistics
• “A study of the procedures of data collection, classification,
summarization and presentation, by using tables, graphs
and summary measures.
• These procedures only describe and reach conclusions on
the sample being surveyed.
• For example, if a lecturer collects data on his class
examination results, summarizes and reaches conclusions
about the class only, he is doing descriptive statistics.”
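The lecturer example can be sketched in Python. The marks below are made-up values used only for illustration, and the choice of summary measures (mean, median, mode, standard deviation) is an assumption based on the measures named earlier in this chapter. The conclusions apply to this class only.

```python
import statistics

# Hypothetical examination marks for one class (illustrative values only).
marks = [45, 52, 58, 60, 60, 67, 71, 75, 80, 92]

# Summary measures that describe this class and nothing beyond it.
summary = {
    "n": len(marks),
    "mean": statistics.mean(marks),
    "median": statistics.median(marks),
    "mode": statistics.mode(marks),
    # pstdev divides by n: the class is the whole group being described.
    "std_dev": statistics.pstdev(marks),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

No claim is made here about students outside the class, which is what keeps this descriptive rather than inferential.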
Inferential Statistics
• “A study of the process of arriving at conclusions about the
population under study based on the data collected from the
sample.
• Inferential statistics are sometimes referred to as inductive
statistics.
• One application of inferential statistics is in pharmaceutical
research. Some new drugs are expensive to produce,
therefore tests must be limited to a small sample of patients.
Researchers design experiments with small randomly selected
samples of patients and attempt to reach conclusions and
make inferences about the population.”
Difference in notations between Population
parameters and Sample statistics

Population Parameters        Sample Statistics
i. Population Size, N        i. Sample Size, n
ii. Mean, μ                  ii. Mean, x̄
iii. Variance, σ²            iii. Variance, s²

When we conduct a sample survey, we will get
sample statistics. With these sample statistics,
we can estimate the population parameters.
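The notation can be illustrated with a short Python sketch. The population here is simulated (1000 hypothetical light bulb lifetimes), which is an assumption made only so that both the parameters and the statistics can be computed side by side. In a real survey the population values are usually unknown. The sample size of 75 follows the light bulb example given earlier.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: lifetimes (hours) of N = 1000 light bulbs.
population = [random.gauss(1000, 50) for _ in range(1000)]

# Population parameters (normally unknown in practice).
N = len(population)
mu = statistics.fmean(population)
sigma2 = statistics.pvariance(population)  # divides by N

# Sample statistics from a simple random sample of n = 75 bulbs.
sample = random.sample(population, 75)
n = len(sample)
x_bar = statistics.fmean(sample)
s2 = statistics.variance(sample)           # divides by n - 1

print(f"N = {N}, mu = {mu:.1f}, sigma^2 = {sigma2:.1f}")
print(f"n = {n}, x_bar = {x_bar:.1f}, s^2 = {s2:.1f}")
```

The sample mean and sample variance serve as estimates of μ and σ². They will generally be close to, but not equal to, the population values.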
More terms and definitions
Sampling Frame
• A list of all population elements from which the sample is taken. It
should be comprehensive, complete and up-to-date.
• Examples of sampling frame: Electoral Register; Postcode Address
File; telephone book, school lists, trade association lists etc.
Example 1
Example 1 - continued
c. “The Registrar of College Excel conducted a survey to study their students’
perceptions towards the Student Leadership course. All students staying in
the hostels had attended this course. There were 15 blocks of hostels and
each block consisted of 4 levels. In this survey, the registrar chose 5 blocks
of hostels randomly and then selected only 2 levels randomly from the
selected blocks. Finally, he collected information from the sample of
students in the selected levels.
i. State the population for this survey.
All College Excel students who stay in the hostels and had attended the
Student Leadership course.
ii. State a possible sampling frame for this survey.
A list of all College Excel students who stay in the hostels and had
attended the Student Leadership course.”
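The two-stage random selection described in this example can be sketched as follows. The block and level numbering is hypothetical, and the sketch assumes that 2 levels are chosen within each of the 5 selected blocks, which is one reading of the example.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

blocks = list(range(1, 16))  # 15 hostel blocks, numbered 1 to 15
levels = list(range(1, 5))   # 4 levels per block, numbered 1 to 4

# Stage 1: choose 5 blocks at random, without replacement.
chosen_blocks = random.sample(blocks, 5)

# Stage 2: within each chosen block, choose 2 levels at random.
chosen_levels = {block: random.sample(levels, 2) for block in chosen_blocks}

for block in sorted(chosen_levels):
    print(f"Block {block}: levels {sorted(chosen_levels[block])}")
```

Students living on the selected levels would then form the sample from which information is collected.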
Example 2
“State the type of statistics suitable for the following situations.
i. Out of 6000 students at UiTMT, 700 were surveyed to find out if they entered
the courses of their first choice. 31% of them entered the courses of their first
choice, 40% entered the courses of their second choice, and the rest entered
the courses of their third, fourth, and fifth choices.
ii. Out of 6000 students at UiTMT, 700 were surveyed to find out if they entered
the courses of their first choice. 31% of them entered the courses of their first
choice, 40% entered the courses of their second choice, and the rest entered
the courses of their third, fourth, and fifth choices.
From the above results, it was assumed that 31% of UiTMT students entered
the courses of their first choice, 40% entered the courses of their second
choice, and the rest entered the courses of their third, fourth, and fifth
choices.
iii. The process of using sample statistics to draw conclusions about the
parameters of the true population is called ………………………..”
Types of data
Primary Data
• Data obtained by researchers themselves. They initially gather the
data and first publish them. How/Where can you get primary data?
The answer is to conduct your own research.

Secondary Data
• Secondary data can be obtained from other sources, for example,
newspapers, economic reports, statistical abstracts and economic journals,
Statistics Department, Labour Department, Police Station, internet etc.

Activity
• Find three advantages of primary and secondary data.
• Find three disadvantages of primary and secondary data.
• (Reminder: Will come out in quiz!)
Variables
What is a Variable?
“A variable is a characteristic under study or investigation. It assumes
different values for different population elements.
Examples:
• The Registrar of College Excel conducted a survey to study their
students’ perceptions towards the “Students Leadership Course”.
• MARA wishes to conduct a survey on their sponsored students who
are currently staying overseas. The objective of the study is to obtain
information regarding the social problems faced by their students.”
Types of variables
VARIABLES

Quantitative variable
• Discrete quantitative variable
A discrete quantitative variable is a variable obtained by counting,
e.g. the number of cars, houses etc.
• Continuous quantitative variable
A variable obtained by measuring that can assume any numerical value
over a certain interval. The accuracy depends on the instrument used,
e.g. length, height etc.

Qualitative variable
• A qualitative variable is a variable that cannot assume a numerical
value but can be classified into two or more categories.
E.g. race, colour, brand name, etc.
Scales of measurement
Nominal level
• “A nominal level is one that allows the researcher to assign subjects to certain
categories or groups. E.g. male, female or coded as 1, 2.
• The order of the selection of answers is of no importance. The categories can
be arranged in any manner. In this case, a male respondent is not superior
compared to a female respondent or vice versa.
• The groups or numbers are just labels which are non-overlapping or mutually
exclusive categories.
• Note that the categories are also collectively exhaustive, i.e. there is no third
category into which respondents would fall.
• The information that can be generated from nominal scaling is the
percentage or frequency of each category in our sample of respondents. It only gives some
basic, categorical, gross information.
• Data of this nature have a limited number of statistical
techniques applicable.”
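The frequency and percentage calculation mentioned above can be sketched in Python. The coded responses are made-up values. The codes 1 and 2 are labels only, so they are counted but never added or averaged.

```python
from collections import Counter

# Hypothetical responses coded 1 = male, 2 = female (labels only, no order).
responses = [1, 2, 2, 1, 2, 2, 1, 2, 1, 2]
labels = {1: "male", 2: "female"}

counts = Counter(responses)  # frequency of each category
total = len(responses)
percentages = {
    labels[code]: 100 * count / total for code, count in counts.items()
}

for category, pct in percentages.items():
    print(f"{category}: {pct:.0f}%")
```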
Ordinal level
• “An ordinal level not only categorizes the variables in such a way as to
denote differences among the various categories, it also rank-orders
the categories in some meaningful way.
• E.g. level of education, which can be categorized as primary, secondary,
university and postgraduate or coded as 1, 2, 3, 4.
• This variable looks quite similar to the nominal one in the sense that
they are categorical (qualitative), the difference is in the ordering of the
categories. The normal sequence of pursuing one’s education is from
primary to secondary to university and finally to postgraduate.
• However, ordinal scaling doesn’t provide an indication of the
magnitude of preference between the different characteristics.
• Nominal and ordinal data are sometimes called qualitative
data and allow similar statistical analyses.”
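A minimal Python sketch of working with ordinal codes follows. The responses are made-up values. The median is used here as an assumption, since it is a summary that relies only on the ordering of the categories and not on the size of the gaps between them.

```python
import statistics

# Codes follow the ordering primary < secondary < university < postgraduate.
levels = {1: "primary", 2: "secondary", 3: "university", 4: "postgraduate"}

# Hypothetical coded responses from seven respondents.
responses = [2, 3, 1, 3, 4, 2, 3]

# Ranking is meaningful because the codes carry an order.
ranked = sorted(responses)

# median_low always returns a value that is actually in the data,
# so the result maps back to a real category.
median_code = statistics.median_low(ranked)
print("Median level:", levels[median_code])
```

The sketch does not compute differences such as 4 − 2, because the distance between "postgraduate" and "secondary" has no defined magnitude on an ordinal scale.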
Interval level
• “An interval level is the next highest level of data measurement.
• It includes all the characteristics of the ordinal level, but in addition, the distance
between values is of constant size. It has an arbitrary zero (0).
• It allows us to perform more statistical analyses such as finding the means and
the standard deviations of the responses on the variables.
• E.g. a thermometer used to measure temperature is an interval level scale.

Caution: There is an ongoing debate on whether the rating scales
commonly used in the behavioural sciences, such as the Likert scale, are
interval scales. If you are interested, you can search online to find out more.”
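The effect of an arbitrary zero can be illustrated with temperature. The sketch below assumes readings in degrees Celsius and Fahrenheit (the source only says "thermometer"), and the two temperatures are made-up values. Differences stay comparable across the two units, but ratios do not, so "twice as hot" is not a meaningful statement on an interval scale.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

low_c, high_c = 10.0, 20.0
low_f = celsius_to_fahrenheit(low_c)    # 50.0
high_f = celsius_to_fahrenheit(high_c)  # 68.0

# Ratios depend on where the arbitrary zero sits.
ratio_c = high_c / low_c  # 2.0
ratio_f = high_f / low_f  # 1.36

# Differences are meaningful: 10 Celsius degrees is always 18 Fahrenheit degrees.
diff_c = high_c - low_c   # 10.0
diff_f = high_f - low_f   # 18.0

print(f"Ratio in Celsius: {ratio_c:.2f}, ratio in Fahrenheit: {ratio_f:.2f}")
print(f"Difference: {diff_c:.0f} C degrees = {diff_f:.0f} F degrees")
```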
Ratio level
• “The ratio level overcomes the deficiency of the arbitrary origin point of
the interval level in that it has an absolute zero point which is a
meaningful measurement point. Ratio data is flexible; all descriptive and
inferential techniques are applicable.
For example:
• The weighing machine – absolute zero point, equal distance between
points on scale.
A person weighing 60 kilograms is twice as heavy as one who weighs
30 kilograms.
• Actual age, sales, costs, income and the number of
organizations individuals have worked for are some other
examples.”
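The weighing example can be checked with a short Python sketch. The conversion to pounds is an assumption added for illustration. Because the scale has an absolute zero, the ratio between the two weights is the same in any unit.

```python
KG_TO_LB = 2.20462  # approximate conversion factor, used for illustration

heavy_kg, light_kg = 60.0, 30.0

ratio_kg = heavy_kg / light_kg
ratio_lb = (heavy_kg * KG_TO_LB) / (light_kg * KG_TO_LB)

print(f"Ratio in kilograms: {ratio_kg:.1f}")
print(f"Ratio in pounds:    {ratio_lb:.1f}")
```

Both ratios come out as 2, which is why "twice as heavy" is a valid statement on a ratio scale.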