Module1-Basic Statistical Concepts

Module 1
Basic Statistical Concepts
At the end of the lesson, the students are expected to:
1. define statistics and cite its importance;
2. differentiate descriptive and inferential statistics;
3. differentiate discrete and continuous variables;
4. become familiar with and differentiate the sampling techniques; and
5. determine the sample size for a given population.
Introduction
Most areas of human endeavor make use of statistics, which makes it a very
important tool in research and other studies.
The study of statistics requires, first of all, an understanding of basic concepts,
symbols and mathematical notions.
Statistical designs and experiments are used to gather more information from a
limited body of observations. Various statistical techniques are used in laboratories,
experimental fields and other controlled conditions. These statistical tools are
needed to obtain accurate and reliable results.
BASIC CONCEPTS IN STATISTICS
Statistics – the science of conducting studies to collect, organize, present, analyse
and interpret data and to make decisions. The word "statistics" came from
the following words:
Statisticum Collegium (Latin), which means “council of state”
Statista (Italian), which means “statesman or politician”
Statistik (German, Gottfried Achenwall), which means “science of state” or “political arithmetic”
IMPORTANCE OF STATISTICS
1. Statistics can give a precise description of data.
2. Statistics can be used to predict the outcome of an experiment or the behavior of an individual.
3. Statistics can be used to test a hypothesis.
Two Branches of Statistics
1. Descriptive Statistics
- is concerned with techniques that are used to describe or characterize the obtained
data.
- consists of methods for organizing, displaying and describing data by using
tables, graphs, and summary measures.
Examples:
a. An annual stockholders’ report details the assets of the corporation.
b. A physics instructor tells his class the number of students who received a
passing score on a recent exam.
c. Calculating the mean of a sample set of scores to characterize the sample.
2. Inferential Statistics
- involves techniques that use the obtained sample data to make inferences about populations.
- consists of generalizing from samples to populations, performing
hypothesis testing, determining relationships among variables, and making
predictions.
Examples:
a. Using sample data from a poll to estimate the opinion of the population.
b. Conducting a correlational study on a sample to determine whether educational
level and income in the population are related.
c. Predicting that the average number of automobiles each household owns
will increase next year.
Population – the complete collection to be studied; it contains all subjects of interest.
Example:
1. Scores of all students at the secondary level
2. All children of any age who have older or younger siblings
Parameter – a number calculated from population data that quantifies a
characteristic of the population.
Example:
1. The average score of all employees in higher positions
2. The average number of siblings of all children of any age
Sample – a part of the population of interest; a sub-collection selected from a population.
Example:
1. Satisfactory ratings of the employees in one office
2. The 40 employees who actually participate in one specific study about time
management
Statistic – a number calculated from sample data that quantifies a characteristic
of the sample.
Example:
1. The average score of the students in one class
2. The average number of siblings of the 40 children who actually participate
in one specific study
Example:
In a recent state referendum, a political initiative was introduced to increase the
number of years students need to obtain basic education. An exit poll of 852 voters
conducted by a TV network indicated that 51% of the voters were in favor of the initiative.
When the final result of the referendum was released, only 45% of the voters
supported the initiative.
Population: All of the registered voters of the state
Parameter: The 45% support for the initiative
Sample: The 852 voters who responded to the exit poll
Statistic: The 51% support for the initiative
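The difference between a parameter and a statistic can be illustrated with a small simulation. The sketch below is only illustrative: the population of 100,000 synthetic voters, the seed, and the function name `support_rate` are assumptions and are not data from the referendum. Only the 45% support level and the sample size of 852 are borrowed from the example.

```python
import random


def support_rate(votes):
    """Proportion of 1s (votes in favor) in a list of 0/1 votes."""
    return sum(votes) / len(votes)


rng = random.Random(1)  # fixed seed so the run is repeatable (arbitrary choice)

# Hypothetical population: 100,000 voters, exactly 45% in favor (1 = in favor).
population = [1] * 45_000 + [0] * 55_000

# Parameter: computed from every member of the population.
parameter = support_rate(population)

# Statistic: computed from a simple random sample of 852 voters.
sample = rng.sample(population, 852)
statistic = support_rate(sample)

print(f"parameter = {parameter:.2%}")
print(f"statistic = {statistic:.2%}")  # changes from one sample to another
```

The parameter is fixed, while the statistic changes with every new sample, which is why an exit poll can differ from the final result.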
VARIABLES AND DATA
A Variable is a characteristic under study that assumes different values for different
elements. In contrast to a variable, the value of a constant is fixed.
For example:
1. Incomes of companies
2. Number of houses built in a city per month during the past year
A data set is a collection of observations on one or more variables.

For example:
1. List of the prices of 25 recently sold homes
2. Scores of 15 employees
3. Opinions of 100 voters
Classification of Variables:
Qualitative Variable – comes from the word “quality”, indicating a property, characteristic,
feature or attribute. These represent differences in quality or kind, but not in amount.
Quantitative Variable – comes from the word “quantity”, indicating amount, measure, size,
etc. These are numerical in nature and can be ordered or ranked.
Mathematical Classification of Quantitative Variables:
Discrete Variables - variables whose values can be counted using integral values.
Examples:
1. Number of books
2. Number of departments preferred by high-ranking employees
3. Number of “Yes” and “No” votes
4. Student enrolment in University of Makati
Continuous Variables - variables that can assume any numerical value over
an interval or intervals.
Examples:
1. Height
2. Weight
3. Temperature
4. Time
Determine whether the following are Qualitative or Quantitative. If it is Quantitative,
classify whether Discrete or Continuous.
1. Number of employees in an office
2. Gender of the next customer in a cafe
3. Brand of computer purchased
4. Weight of the chemicals
5. Type of car owned
Classification of Variables According to Relationship
Independent Variable – An independent variable is the variable you have control
over, what you can choose and manipulate. It is usually what you think will affect
the dependent variable. In some cases, you may not be able to manipulate the
independent variable. It may be something that is already there and is fixed,
something you would like to evaluate with respect to how it affects something
else.
Dependent Variable – A dependent variable is what you measure in the
experiment and what is affected during the experiment. The dependent variable
responds to the independent variable. It is called dependent because it "depends"
on the independent variable. In a scientific experiment, you cannot have a
dependent variable without an independent variable.
Example: You are interested in how stress affects heart rate in humans. Your
independent variable would be the stress and the dependent
variable would be the heart rate. You can directly manipulate stress
levels in your human subjects and measure how those stress levels
change heart rate.
LEVELS OF MEASUREMENT
The concept of measurement has been developed in conjunction with the concepts of
numbers and units of measurement. Statisticians categorize measurements according to
levels. Each level corresponds to how this measurement can be treated mathematically.
1. Nominal - Nominal data have no order and thus only give names or labels to
various categories. In nominal measurements the numerical values just "name" the
attribute uniquely. No ordering of the cases is implied.
TAKE NOTE:
The essential point about nominal scales is that they do not imply
any ordering among the responses. For example, when classifying
people according to their favorite color, there is no sense in which
green is placed "ahead of" blue. Responses are merely categorized.
Nominal scales embody the lowest level of measurement.
Examples: gender; handedness; favorite color; religion; jersey numbers in
basketball; marital status; names of schools attended;
telephone numbers; species of flowers
2. Ordinal - In ordinal measurement the attributes can be rank-ordered. Here,
distances between attributes do not have any meaning.
TAKE NOTE:
For example, on a survey you might code Educational Attainment as
0=less than H.S.;
1=some H.S.;
2=H.S. degree;
3=some college;
4=college degree;
5=post college.
In this measure, higher numbers mean more education. But is the distance from
0 to 1 the same as the distance from 3 to 4? Of course not. The interval between
values is not interpretable in an ordinal measure.
Examples: social class or income bracket
student evaluation (excellent, very good, good, poor)
grades (A, B, C, D, F)
customer satisfaction
ranking in a contest (first, second, third places)
3. Interval - At the interval level, numbers represent fixed measurement units but have
no true zero point. However, the distance between numbers does have meaning.
TAKE NOTE:
In interval measurement the distance between attributes does have meaning.
For example, when we measure temperature (in Fahrenheit), the distance from
30 to 40 is the same as the distance from 70 to 80. The interval between values is
interpretable. Because of this, it makes sense to compute an average of an
interval variable, whereas it does not make sense to do so for ordinal scales.
A temperature of zero degrees does not signify an absence of temperature.
Similarly, year zero of a calendar does not signify an absence of time.
Examples: temperature in Fahrenheit; aptitude test scores
4. Ratio - Ratio data have the highest level of measurement. Ratios between
measurements, as well as intervals, are meaningful because there is a true starting
point (zero).
TAKE NOTE:
The ratio scale possesses the characteristics of the interval scale, with the additional
property that its zero position indicates the absence of the quantity being measured.
Examples: amount of money you have in your pocket
weight of a newborn baby
annual salary of a call center agent
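The difference between the interval and ratio levels can be seen with a short calculation. The sketch below is an illustration only; the two temperatures and the function name `fahrenheit_to_kelvin` are assumptions chosen for the example. It shows that differences in Fahrenheit are meaningful, while a ratio of Fahrenheit readings is not, because 0 °F is an arbitrary zero. The Kelvin scale has a true zero, so ratios on it can be interpreted.

```python
def fahrenheit_to_kelvin(temp_f):
    """Convert degrees Fahrenheit to kelvins (a scale with a true zero)."""
    return (temp_f - 32) * 5 / 9 + 273.15


low_f, high_f = 40.0, 80.0  # illustrative readings

# Interval level: the difference between readings is meaningful.
difference_f = high_f - low_f

# The raw ratio suggests "twice as hot", which is misleading on an interval scale.
naive_ratio = high_f / low_f

# Ratio level: after converting to kelvins, the ratio can be interpreted.
kelvin_ratio = fahrenheit_to_kelvin(high_f) / fahrenheit_to_kelvin(low_f)

print("difference in Fahrenheit:", difference_f)
print("naive Fahrenheit ratio:", naive_ratio)
print("ratio in kelvins:", round(kelvin_ratio, 2))
```

In kelvins the warmer reading is only about 8% higher than the cooler one, not 100% higher.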
DATA COLLECTION:
Data are needed whenever we conduct studies or research. They are used to
explain particular problems or to provide a basis on which certain decisions are made.
Once the problem of the study has been defined, the next step is data collection.
There are two types of data according to source: primary and secondary data.
Primary data are data collected directly by the researcher. These are first-hand or
original sources. They can be collected through: (a) direct observation or measurement; (b)
interview; (c) use of questionnaires or rating scales; (d) experimentation; and (e)
registration. Secondary data are information taken from published or unpublished
materials previously gathered by other researchers, such as books, newspapers, magazines,
journals, and published and unpublished theses and dissertations.
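SAMPLING
- The method of choosing a sample from a population.
ADVANTAGES (compared with studying the entire population):
1. It saves time, money and effort.
2. It is more effective.
3. It is faster and cheaper.
4. It can be more accurate, because a smaller group can be measured with greater care.
5. It can give more comprehensive information about each unit studied.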
SAMPLING TECHNIQUES
o Probability or Random Sampling
- Each member of the population is given a chance to be included in the sample.
1. Simple Random Sampling
- Each member of the population has an equal chance of being selected for the
sample.
- Also known as the lottery or fishbowl method.
- Used when the size of the sampling frame is known and all the units of the
frame can be numbered.
2. Systematic Random Sampling
- Members of the population are arranged in some fashion or pattern.
- Every nth member of the population is included in the sample.
3. Stratified Random Sampling
- The population is first divided into homogeneous subsets called strata.
- A few members or representatives are then selected from each stratum.
4. Clustered/Cluster Random Sampling
- Also referred to as area sampling.
- The population is divided into groups (clusters); one or more groups are then
randomly selected to serve as the sample.
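The four probability sampling techniques above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions: the population of 100 numbered members, the two strata ("low" and "high"), the five clusters of 20 members, the seed, and all variable names are hypothetical and are not part of the module.

```python
import random

rng = random.Random(42)  # arbitrary seed so the run is repeatable

# Hypothetical sampling frame: 100 members numbered 1 to 100.
population = list(range(1, 101))

# 1. Simple random sampling: every member has an equal chance of selection.
simple = rng.sample(population, 10)

# 2. Systematic sampling: random start, then every k-th member of the list.
k = len(population) // 10
start = rng.randrange(k)
systematic = population[start::k]

# 3. Stratified sampling: divide into strata, then sample from each stratum.
strata = {
    "low": [m for m in population if m <= 50],
    "high": [m for m in population if m > 50],
}
stratified = [m for members in strata.values() for m in rng.sample(members, 5)]

# 4. Cluster sampling: divide into groups, then take a whole group at random.
clusters = [population[i:i + 20] for i in range(0, len(population), 20)]
cluster = rng.choice(clusters)

print("simple:", sorted(simple))
print("systematic:", systematic)
print("stratified:", sorted(stratified))
print("cluster:", cluster)
```

Stratified sampling takes some members from every group, whereas cluster sampling takes every member from the chosen group.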
o Non–Probability or Non–Random Sampling
- Not all of the members in the population are given an equal chance of being
included in the sample.
1. Convenience Sampling
- This design is used because of the convenience it offers to the researcher.
2. Purposive Sampling
- This design is based on choosing individuals as samples according to the
purposes of the researcher.
- An individual is chosen as part of the sample because of good evidence that
he or she is representative of the total population.
3. Quota Sampling
- This design is popular in the field of opinion research because it is done by
merely looking for the individuals with the requisite characteristics.
Examples:
Identify the sampling technique being used in each of the situations.
_1. Every 21st customer entering a food chain is asked to select his/her favorite
meal for lunch.
_2. Circuit board production associates are selected using random numbers in
order to determine annual salaries.
_3. Every 50th person is selected from a list of registered voters.
_4. Twenty teams are randomly selected per department and each employee in
one team is given a survey to complete.
_5. Barangay officials of Metro Manila are divided into four groups. Thirty are
selected from each group and interviewed.
_6. A hair expert would like to find out the satisfaction of customers of his latest
hair product. He would likely ask participants of the study with long hair
rather than those who are bald.
_7. An interviewer is asked to obtain answers to interview questions for fifty
people. She positions herself in a shopping area and starts interviewing
people one by one.
_8. A manager wants to know their customers’ satisfaction. He stands outside
the main door and asks the first twenty diners who get out of the restaurant.
SAMPLE SIZE
In doing research, if the population is too big to handle, a sufficiently large
sample is acceptable. Determining the sample size is a very important
consideration: a sample that is too large wastes time, resources and
money, while a sample that is too small may lead to inaccurate results.
Using Slovin’s formula:
Determining n
$$n = \frac{N}{1 + Ne^2}$$
where: n = sample size, N = population size and e = margin of error
Determining e
$$e = \sqrt{\frac{N - n}{Nn}}$$
Examples:
I. Find the sample size:
1. Given: N = 1,000; e = 5%
$$n = \frac{N}{1 + Ne^2} = \frac{1{,}000}{1 + (1{,}000)(0.05)^2} = \frac{1{,}000}{3.5} = 285.71 \approx 286$$
2. Given: N = 40,000; e = 10%
$$n = \frac{N}{1 + Ne^2} = \frac{40{,}000}{1 + (40{,}000)(0.1)^2} = \frac{40{,}000}{401} = 99.75 \approx 100$$
3. A researcher is conducting an investigation regarding the factors affecting the
efficiency of 185 faculty members of a certain college with a margin of error of 5%.
$$n = \frac{N}{1 + Ne^2} = \frac{185}{1 + (185)(0.05)^2} = \frac{185}{1.4625} = 126.496 \approx 126$$
4. The population size is 250 and the desired accuracy is 95% (e = 5%).
$$n = \frac{N}{1 + Ne^2} = \frac{250}{1 + (250)(0.05)^2} = \frac{250}{1.625} = 153.85 \approx 154$$
II. Find the margin of error (e), given:
1. N = 10 000 and n = 2 000
$$e = \sqrt{\frac{N - n}{Nn}} = \sqrt{\frac{10\,000 - 2\,000}{(10\,000)(2\,000)}} = 0.02 \text{ or } e = 2\%$$
2. N = 7 250 and n = 379
$$e = \sqrt{\frac{N - n}{Nn}} = \sqrt{\frac{7\,250 - 379}{(7\,250)(379)}} \approx 0.05 \text{ or } e = 5\%$$
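The worked examples for Slovin's formula and for the margin of error can be checked with a short script. The sketch below is one possible way to do it; the function names `slovin_sample_size` and `slovin_margin_of_error` are illustrative choices, and rounding to the nearest whole number follows what the examples in this module do.

```python
import math


def slovin_sample_size(population_size, margin_of_error):
    """Unrounded sample size n = N / (1 + N * e**2)."""
    return population_size / (1 + population_size * margin_of_error ** 2)


def slovin_margin_of_error(population_size, sample_size):
    """Margin of error e = sqrt((N - n) / (N * n))."""
    return math.sqrt(
        (population_size - sample_size) / (population_size * sample_size)
    )


# Part I: (N, e) pairs taken from the sample-size examples.
for N, e in [(1_000, 0.05), (40_000, 0.10), (185, 0.05), (250, 0.05)]:
    n = slovin_sample_size(N, e)
    print(f"N={N}, e={e}: n={n:.3f}, rounded to {round(n)}")

# Part II: (N, n) pairs taken from the margin-of-error examples.
for N, n in [(10_000, 2_000), (7_250, 379)]:
    print(f"N={N}, n={n}: e={slovin_margin_of_error(N, n):.4f}")
```

The rounded sample sizes agree with the answers 286, 100, 126 and 154, and the margins of error agree with 2% and 5%.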
For Systematic Random Sampling
Find the number of employee-respondents to be sampled from a population of 1,000
employees in Archgames.corp using systematic random sampling such that the margin
of error is 10%.
$$n = \frac{N}{1 + Ne^2} = \frac{1{,}000}{1 + (1{,}000)(0.1)^2} = \frac{1{,}000}{11} = 90.91 \approx 91$$
$$k = \frac{N}{n} = \frac{1{,}000}{91} = 10.99 \approx 11$$
This means that every 11th element will be selected for the sample.
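The systematic sampling example can also be carried out in code. The sketch below assumes that the 1,000 employees are numbered 1 to 1,000 and that the first respondent is picked at random from the first k employees. Both are common conventions but are assumptions here, since the module only states that every 11th element is taken. The function name `sampling_interval` and the seed are likewise illustrative.

```python
import random


def sampling_interval(population_size, sample_size):
    """Interval k = N / n, rounded to the nearest whole number."""
    return round(population_size / sample_size)


N = 1_000
e = 0.10
n = round(N / (1 + N * e ** 2))  # Slovin's formula, rounded
k = sampling_interval(N, n)

rng = random.Random(7)  # arbitrary seed so the run is repeatable
start = rng.randint(1, k)  # assumed random start among the first k employees
selected = list(range(start, N + 1, k))  # employee numbers chosen

print("n =", n, "k =", k)
print("first five selected:", selected[:5])
print("number selected:", len(selected))
```

Because k is rounded up from 10.99, the list may contain 90 or 91 employees depending on the starting point.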
For Proportional Stratified Random Sampling
Given the distribution of respondents below, how many samples from each
category will be included in the sample if the margin of error is 5%?
$$n = \frac{N}{1 + Ne^2} = \frac{1{,}500}{1 + (1{,}500)(0.05)^2} = \frac{1{,}500}{4.75} = 315.79 \approx 316$$
Category        Population Size (N)    Number of Samples (n)
Supervisors     900                    (900 / 1,500) × 316 ≈ 190
Team Leaders    500                    (500 / 1,500) × 316 ≈ 105
Agents          100                    (100 / 1,500) × 316 ≈ 21
TOTAL           1,500                  316
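The proportional allocation in the table can be reproduced with a short script. The sketch below is a minimal version; the function name `proportional_allocation` is an illustrative choice, and each stratum is rounded to the nearest whole number as in the table.

```python
def proportional_allocation(strata_sizes, total_sample):
    """Share total_sample among strata in proportion to their population sizes."""
    population_size = sum(strata_sizes.values())
    return {
        name: round(size / population_size * total_sample)
        for name, size in strata_sizes.items()
    }


strata = {"Supervisors": 900, "Team Leaders": 500, "Agents": 100}

N = sum(strata.values())
n = round(N / (1 + N * 0.05 ** 2))  # Slovin's formula with e = 5%

allocation = proportional_allocation(strata, n)

for name, size in allocation.items():
    print(f"{name}: {size}")
print("TOTAL:", sum(allocation.values()))
```

Here the rounded shares add up to 316 exactly. With other figures, rounding each stratum separately may leave the total one or two off, so the total is worth checking.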
