
BMS 3244

Ethics, Research Methodology and Statistics

Dr. Tharaka Ranathunge


Introduction to Statistics
What is Statistics?

• Statistics involves planning, collecting and analysing data, and reporting and interpreting results.
• Information gained from data enables sound analytical decision
making.
• Descriptive statistics – collecting, summarising and describing data
sets.
• Inferential statistics – estimate characteristics of data to discover
patterns and make inferences about the population (or the future).

Why do we need to know about statistics?
• How to properly present and describe
information
• How to draw conclusions about large
population based only on information
obtained from samples
• How to improve processes
• How to obtain reliable forecasts
Who Uses Statistics?
Statistical techniques are used extensively by marketers, accountants, quality control staff, consumers, hospital administrators, educators, politicians, physicians, etc.
Sources of Statistical Data
● Researching a problem usually requires published data. Statistics on these problems can be found in published articles, journals, and magazines.
● Published data is not always available on a given subject. In such cases,
information will have to be collected and analyzed.
● One way of collecting data is via questionnaires.

What are the other data collection methods?


Types of Statistics
A population is a collection of all possible individuals, objects, or measurements of interest. A parameter is a descriptive measure of the entire population of all observations of interest.

A sample is a portion, or part, of the population of interest. A statistic is a descriptive measure of a sample (a characteristic of the sample) and serves as an estimate of the corresponding population parameter.
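
The difference between a parameter and a statistic can be sketched in a few lines of Python. The population here is made-up data (hypothetical exam marks for 1,000 students), and the sample size of 50 and the seed are arbitrary assumptions chosen only to make the illustration repeatable.

```python
import random
import statistics

# Hypothetical population: exam marks of 1,000 students (made-up data).
rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = [rng.randint(30, 100) for _ in range(1000)]

# Parameter: a descriptive measure computed from the entire population.
population_mean = statistics.mean(population)

# Statistic: the same measure computed from a sample of the population,
# used as an estimate of the parameter.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"Population mean (parameter): {population_mean:.2f}")
print(f"Sample mean (statistic):     {sample_mean:.2f}")
```

The two values will usually be close but not identical; the gap between them is the sampling error that inferential statistics has to account for.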
Types of Statistics

Descriptive Statistics Inferential Statistics

Descriptive Statistics:
Methods of organizing, summarizing, and presenting data
in an informative way.

Inferential Statistics:
A decision, estimate, prediction, or generalization about a
population, based on a sample.

Types of Statistics
(examples of inferential statistics)

E.g. 1: The accounting department of a large firm selects a sample of invoices and uses it to check the accuracy of all the company's invoices.

E.g. 2: Wine tasters sip a few drops of wine to make a decision with respect to all the wine waiting to be released for sale.

Types of Variables
DATA
● Qualitative or attribute (e.g. type of car owned)
● Quantitative or numerical
⚪ Discrete (e.g. number of children)
⚪ Continuous (e.g. time taken for an exam)

Types of Variables
For a qualitative or attribute variable, the characteristic being studied is nonnumeric.
Examples: gender, religious affiliation, type of automobile owned, state of birth, eye color.

In a quantitative variable, information is reported numerically.
Examples: balance in your checking account, minutes remaining in class, or number of children in a family.

Levels of Measurement

● Categorical variables can be:
⚪ Nominal – there is no particular order to the categories (e.g. male/female, yes/no)
⚪ Ordinal – there is an order to the categories (e.g. first year at uni, second year at uni, third year at uni, etc.)
● Numerical variables can be:
⚪ Interval – there is no true zero point (e.g. temperature: degrees Celsius and Fahrenheit have different zero points)
⚪ Ratio – there is a true zero point (e.g. a person's height, weight, etc.)

Levels of Measurement
1) Nominal level:
Data that are classified into categories and cannot be arranged in any particular order.
Examples: eye color, gender, religious affiliation.

Mutually exclusive:
An individual, object, or measurement is
included in only one category.
Exhaustive:
Each individual, object, or measurement must
appear in one of the categories.
Levels of Measurement

2) Ordinal level:
Involves data arranged in some order, but the differences between data values cannot be determined or are meaningless.

Example: During a taste test of 4 soft drinks, Mellow Yellow was ranked number 1, Sprite number 2, Seven-up number 3, and Orange Crush number 4.

Levels of Measurement (Cont.)
3) Interval level:
Similar to the ordinal level, with the additional property that meaningful amounts of differences between data values can be determined. There is no natural zero point.

Example: temperature on the Fahrenheit scale.

4) Ratio level:
The interval level with an inherent zero starting point. Differences and ratios are meaningful for this level of measurement.
Examples: monthly income of surgeons, or distance traveled by manufacturer's representatives per month.
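
A short worked example may help show why ratios are meaningful only at the ratio level. The specific values (10 and 20 degrees Celsius, 100 and 200 km) are illustrative assumptions, not figures from the lecture.

```python
def celsius_to_fahrenheit(c):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return c * 9 / 5 + 32

# Interval level: temperature has no natural zero, so a ratio depends on
# which scale is used and therefore carries no real meaning.
c_low, c_high = 10.0, 20.0
f_low = celsius_to_fahrenheit(c_low)    # 50.0
f_high = celsius_to_fahrenheit(c_high)  # 68.0

ratio_celsius = c_high / c_low          # 2.0
ratio_fahrenheit = f_high / f_low       # 1.36, not 2.0

# Ratio level: distance has a true zero, so the ratio is the same
# whatever unit is used.
KM_TO_MILES = 0.621371
km_low, km_high = 100.0, 200.0
ratio_km = km_high / km_low
ratio_miles = (km_high * KM_TO_MILES) / (km_low * KM_TO_MILES)

print(ratio_celsius, ratio_fahrenheit)
print(ratio_km, ratio_miles)
```

The difference of 10 Celsius degrees is meaningful on either temperature scale, but "twice as hot" is not; "twice as far" holds in kilometres and miles alike.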
Level of data   What the level allows                          Example
Nominal         Data may only be classified                    Classification of students by district
Ordinal         Data are ranked                                Your rank for this course module
Interval        Meaningful difference between values           Temperature
Ratio           Meaningful 0 point and ratio between values    Number of study hours

Exercise 1
For each of the following random variables, determine whether the variable is categorical or numerical.
If the variable is numerical, determine whether the variable of interest is discrete or continuous.
In addition, determine the level of measurement.

Variable    Qualitative / Quantitative    Discrete / Continuous    Level of Data

Solution 1
b) Mobile phone service provider
c) Number of text messages sent per month
e) Length (in minutes) of longest call made during month
f) Colour of mobile phone
g) Monthly charge (in dollars and cents) for calls made
h) Ownership of a car charge kit
i) Number of calls made per month
j) Whether there is a telephone line connected to a
computer modem in the household
k) Whether there is a fax machine in the household

SAMPLING TECHNIQUES
Sample
• A sample is a tool for inferring something about a population: a sample is selected from that population and studied.

Therefore, sampling is often the only practical way to determine something about the population.
A sample is a subset of a larger population of objects, individuals, households, businesses, organizations and so forth.

Sampling enables researchers to make estimates of some unknown characteristics of the population in question.

A finite group is called a population, whereas a non-finite (infinite) group is called a universe.

A census is an investigation of all the individual elements of a population.

[Figure: sampling draws a sample (described by a statistic) from a population (described by a parameter)]

Why Sampling?
1. The physical impossibility of checking all items in the population.
2. The adequacy of sample results in most cases. When sampling is done with care, the results are expected to be accurate and a good representation of the population.
3. Budget and time constraints. The time-consuming aspect of contacting the whole population.
4. The destructive nature of certain tests. For some elements, sampling is the only way to test, since the tests destroy the element itself.
5. The cost of studying all the items in a population.
The Sampling Process
1. Define the target population
2. Select a sampling frame
3. Determine if a probability or non-probability sampling method will be chosen
4. Plan procedure for selecting sampling units
5. Determine sample size
6. Select actual sampling units
7. Conduct fieldwork
Graphical Depiction of Sampling Errors
[Figure: total population, sampling frame, planned sample and respondents (actual sample), with the gaps between them labelled sampling frame error, random sampling error and non-response error]
Probability and Non-Probability Sampling
Probability Sampling – Every element in the population under study has a known, non-zero probability of selection to a sample. In simple random sampling, every member of the population has an equal probability of being selected.

Non-Probability Sampling – An arbitrary means of selecting sampling units based on subjective considerations, such as personal judgment or convenience. It is generally less preferred than probability sampling.
Methods of Probability Sampling
• Simple Random Sample

• Systematic Random Sampling

• Stratified Random Sampling

• Cluster Sampling
Non-probability sampling

Four types of non-probability sampling techniques:

⚫ Very simple types, based on subjective criteria
1. Convenience sampling
2. Judgmental sampling

⚫ More systematic and formal
3. Quota sampling

⚫ Special type
4. Snowball sampling
Probability Sampling

1. Simple Random Sampling
This is a technique which ensures that each element in the population has an equal chance of being selected for the sample. Individuals are picked at random for inclusion in the sample.
▪ Examples: choosing raffle tickets from a drum, computer-generated selections, random-digit telephone dialing

Probability Sampling
Simple random sampling
• The major advantage of simple random sampling is its simplicity.
• As sample size increases, the sample becomes more and more representative of the population.
• Sampling is generally without replacement.
• Problem: it can be very costly if the population is large. Choices come from a list; who makes the list?
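
A computer-generated simple random sample can be sketched with Python's standard `random` module. The sampling frame below (ticket numbers 1 to 500) and the sample size of 20 are hypothetical values chosen for illustration.

```python
import random

# Hypothetical sampling frame: ticket numbers 1..500 in a raffle drum.
sampling_frame = list(range(1, 501))

rng = random.Random(2024)  # fixed seed only so the sketch is repeatable

# random.sample draws without replacement, so no ticket is picked twice
# and every ticket has the same chance of being selected.
sample = rng.sample(sampling_frame, k=20)

print(sorted(sample))
```

Note that the code needs the complete list before it can draw anything, which is exactly the practical difficulty raised in the last bullet above.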

Probability Sampling

2. Systematic Sampling
This is a technique in which an initial starting point is selected by a random process, after which every nth item on the list is selected to form the sample.

▪ Example: From a list of 1500 name entries, a name on the list is randomly selected and then (say) every 25th name thereafter. The sampling interval in this case would equal 25.

▪ For systematic sampling to work best, the list should be random in nature and not have some underlying systematic pattern.
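
The 1500-name example can be sketched as follows. One assumption is added here that the slide does not state: the random starting point is drawn from the first 25 entries, a common convention that lets the sample run through the whole list. The names themselves are placeholders.

```python
import random

def systematic_sample(items, interval, rng):
    """Pick a random start within the first interval, then every
    interval-th item after it."""
    start = rng.randrange(interval)  # random index in 0..interval-1
    return items[start::interval]

# Hypothetical list of 1500 name entries.
names = [f"Name {i}" for i in range(1, 1501)]

rng = random.Random(7)  # fixed seed so the sketch is repeatable
sample = systematic_sample(names, 25, rng)

print(len(sample))   # 1500 / 25 = 60 names
print(sample[:3])
```

If the list had a hidden pattern with a period of 25 (for example, every 25th entry being a team leader), this procedure would select only one kind of entry, which is why the list should have no underlying systematic order.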

Probability Sampling

3. Stratified Sampling
This is a technique in which simple random subsamples are drawn from within different strata that share some common characteristic.

▪ Example: The student body of BSC-MIT is divided into two groups (management, IT), and students are selected from each group using simple random sampling, whereby the size of the sample for each group is determined by that group's overall strength (its number of students).

▪ Stratified sampling has the advantage of giving more representative samples and less random sampling error; the disadvantage is that it is more complex, and information on the strata may be difficult to obtain.

• A population is first divided into subgroups (strata), and a sample is selected from each subgroup.
Example:
Suppose it is necessary to study the advertising expenditures of the 352 largest companies in Sri Lanka. We are asked to use a sample of 50 companies. The objective of the study is to determine whether the larger companies with higher profits spent more on promotional activities.

Subgroup   Profitability    # of Firms   Percent of Total   Number Sampled
1          Over 30%                  8                  2                1
2          20 up to 30%             35                 10                5
3          10 up to 20%            189                 54               27
4          0 up to 10%             115                 33               16
5          Deficit                   5                  1                1
Total                              352                100               50

• Advantage: the sample more accurately reflects the characteristics of the population.
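
The "Number Sampled" column in the table above follows from proportional allocation: each subgroup receives a share of the 50 sampled companies equal to its share of the 352 firms. A minimal sketch, using the firm counts from the table and assuming simple rounding to the nearest whole company:

```python
# Number of firms in each profitability subgroup (from the table).
strata = {
    "Over 30%": 8,
    "20 up to 30%": 35,
    "10 up to 20%": 189,
    "0 up to 10%": 115,
    "Deficit": 5,
}

total_firms = sum(strata.values())  # 352
sample_size = 50

# Proportional allocation: sample size times the subgroup's share.
allocation = {
    name: round(sample_size * count / total_firms)
    for name, count in strata.items()
}

for name, n in allocation.items():
    percent = round(100 * strata[name] / total_firms)
    print(f"{name:<14} {strata[name]:>4} firms  {percent:>3}%  sample {n}")
```

Here the rounded allocations happen to add up to exactly 50. With other counts, rounding each subgroup separately can leave the total one or two off, in which case the allocation has to be adjusted by hand.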
Probability Sampling

4. Cluster Sampling
A population is first divided into primary units (clusters), and then a sample of these primary units is selected. All observations in these selected clusters are included in the sample.
• Example:
Let's say that a researcher is studying the academic performance of high school students in Sri Lanka and wants to choose a cluster sample based on geography. First, the researcher would divide the entire population of Sri Lanka into clusters (different provinces or districts). Then, the researcher would select either a simple random sample or a systematic random sample of those clusters. Let's say he or she chose a random sample of 15 districts and wanted a final sample of 5,000 students. The researcher would then select those 5,000 high school students from those 15 districts through either simple or systematic random sampling. This would be an example of a cluster sample; because students are sampled within the chosen districts rather than all being included, it is a two-stage design.
• Cluster Sampling: Advantages and Disadvantages

❖ Cluster sampling generally provides less precision than either simple random sampling or stratified sampling.

❖ The cost per sample point is less for cluster sampling than for other sampling methods. Given a fixed budget, the researcher may be able to use a bigger sample with cluster sampling than with the other methods.

❖ When the increased sample size is sufficient to offset the loss in precision, cluster sampling may be the best choice.
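
The district example can be sketched in two stages: first a random sample of clusters, then a random sample of students from the chosen clusters only. The population below is entirely hypothetical (25 districts with 1,000 students each, and made-up student labels); only the figures of 15 districts and 5,000 students come from the example.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 25 districts (clusters), 1,000 students each.
districts = {
    f"District {d}": [f"D{d}-S{s}" for s in range(1, 1001)]
    for d in range(1, 26)
}

# Stage 1: simple random sample of 15 clusters.
chosen_districts = rng.sample(sorted(districts), k=15)

# Stage 2: simple random sample of 5,000 students, drawn only from
# the students who live in the chosen districts.
pool = [student for name in chosen_districts for student in districts[name]]
students = rng.sample(pool, k=5000)

print(len(chosen_districts), len(students))
```

Fieldwork is confined to 15 districts instead of 25, which is where the lower cost per sample point comes from; the price is that students in the 10 unselected districts have no chance of appearing in this particular sample.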