
STATISTICS

MEANING OF STATISTICS

• singular
• plural
• general
Statistics in the singular sense refers to the
branch of mathematics which deals with
the systematic collection, tabulation,
presentation, analysis, and
interpretation of quantitative data which
are collected in a methodical manner
without bias.
Statistics in its plural sense denotes
a set of quantitative data or facts.
Statistics in its general sense is divided into
statistical methods and statistical theory or
mathematical statistics
Statistical methods indicate those procedures
and techniques used in the collection,
presentation, analysis and interpretation of
quantitative data.
Statistical theory or mathematical statistics
deals with the development and exposition of
theories which constitute the bases of the
statistical methods.
QUALITIES OF A GOOD STATISTICIAN

• Scientific • Terrific
• Talented • Innovative
• Active • Creative
• Tenacious • Interpretative
• Inventive • Accurate
• Skillful • Noble

S T A T I S T I C I A N
FUNCTIONS OF STATISTICS

• Statistics provides researchers with the means to scientifically
measure the conditions that may be involved in a given
problem and to evaluate the way in which these conditions
are related.
• Statistics shows the laws underlying facts and events that
cannot be determined by individual observation.
• Statistics shows relations of cause and effect that
otherwise may remain unknown.
• Statistics observes trends and behaviour in related
conditions which otherwise may remain unclear.
IMPORTANCE OF STATISTICS TO RESEARCH

1. Statistics permits the most exact kind of description.
2. Statistics forces the researcher to be definite and exact in
his procedures and in his thinking.
3. Statistics enables the researcher to summarize results in
a meaningful and convenient form.
4. Statistics enables the researcher to draw general
conclusions. The process of extracting conclusions is carried
out according to accepted rules.
5. Statistics enables the researcher to predict “how much”
of a thing will happen under conditions he knows and has
measured.
TWO FIELDS OF STATISTICS

• Descriptive Statistics is concerned with the collection,
classification, presentation, analysis, and interpretation of
data in order to describe the summarized values and group
characteristics of the collected data.
Examples: measures of central tendency, measures of dispersion,
skewness, kurtosis, etc.
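As a rough illustration of the descriptive measures named above, the sketch below computes a few of them with Python's standard `statistics` module. The list `scores` is a made-up set of values used only for illustration; it does not come from the slides.

```python
import statistics

# Hypothetical exam scores (illustrative values, not taken from the slides).
scores = [70, 75, 75, 80, 85, 90, 95]

# Measures of central tendency
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Measures of dispersion
value_range = max(scores) - min(scores)   # highest value minus lowest value
std_dev = statistics.stdev(scores)        # sample standard deviation

print(f"mean={mean:.2f} median={median} mode={mode}")
print(f"range={value_range} stdev={std_dev:.2f}")
```

Each of these numbers describes only the values at hand; none of them makes a claim about a larger group.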
• Inferential Statistics aims to give information about large
groups of data (the population) without dealing with each and every
element of these groups. It uses only a small but representative
portion (a sample) of the total set of data in order to draw
conclusions or judgments regarding the entire set of data.
Examples: sampling/sampling distribution, estimation, testing of
hypothesis using z-test, t-test, chi-square test, F-test, and the like.
EXAMPLE
A recent study examined the math and verbal SAT
scores of high school seniors across the country.
WHICH OF THE FOLLOWING STATEMENTS ARE
DESCRIPTIVE IN NATURE AND WHICH ARE INFERENTIAL?

• The mean math SAT score was 492. DESCRIPTIVE


• The mean verbal SAT score was 475. DESCRIPTIVE
• Students in the Northeast scored higher in math but lower in
verbal. INFERENTIAL
• 80% of all students taking the exam were headed for college. DESCRIPTIVE
• 32% of the students scored above 610 on the verbal SAT. DESCRIPTIVE
• The math SAT scores are higher than they were 10 years ago.
INFERENTIAL
MATHEMATICAL PRELIMINARIES

Population: A collection, or set, of individuals or objects or events
whose properties are to be analyzed.
Two kinds of populations: finite or infinite.

Sample: A subset of the population.


POPULATION
SAMPLE
• Mrs. Jara wants to know the nutritional status of the
first year students in her school so she got 150 first
year students to represent the year level.
• When Sandra bought a sack of rice, she examined a
handful from the sack to check if it is the variety she
wants.
• A doctor wants to know what causes the infection in a
patient so he requested for the patient’s blood
examination. The medical technologist extracted only
10 cubic centimeters of blood from the patient for
examination.
• The chef wants to check if the food being cooked
tastes as he wants it to be so he tasted a spoonful of it.
Variable: A characteristic about each individual element of a population
or sample.
Data (singular): The value of the variable associated with one element
of a population or sample. This value may be a number, a word, or a
symbol.
Data (plural): The set of values collected for the variable from each of
the elements belonging to the sample.
Experiment: A planned activity whose results yield a set of data.
Parameter: A numerical value summarizing all the data of an entire
population.
Statistic: A numerical value summarizing the sample data.
Example: A college dean is interested in learning about the average age of faculty. Identify the basic terms
in this situation.

population: The population is the age of all faculty members at the college.
sample: A sample is any subset of that population. For example, we might select 10
faculty members and determine their age.
variable: The variable is the “age” of each faculty member.
data (singular): One data would be the age of a specific faculty member.
data (plural): The data would be the set of values in the sample.
experiment: The experiment would be the method used to select the ages forming the
sample and to determine the actual age of each faculty member in the sample.
parameter: The parameter of interest is the “average” age of all faculty at the college.
statistic: The statistic is the “average” age for all faculty in the sample.
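The difference between a parameter and a statistic in the faculty example can be sketched in a few lines of Python. The ages below are randomly generated placeholders (the slides give no actual ages), and the population size of 40 is an assumption made only for this sketch.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: the ages of all 40 faculty members (made-up values).
population_ages = [rng.randint(28, 65) for _ in range(40)]

# Parameter: a numerical value summarizing the entire population.
parameter = statistics.mean(population_ages)

# Sample: the ages of 10 faculty members chosen at random.
sample_ages = rng.sample(population_ages, 10)

# Statistic: a numerical value summarizing only the sample.
statistic = statistics.mean(sample_ages)

print(f"parameter (population mean age): {parameter:.2f}")
print(f"statistic (sample mean age):     {statistic:.2f}")
```

The two averages are usually close but not identical, which is why a statistic is treated as an estimate of the parameter.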
Two kinds of variables:
Qualitative, or Attribute, or Categorical, Variable: A variable
that categorizes or describes an element of a population.
Note: Arithmetic operations, such as addition and averaging,
are not meaningful for data resulting from a qualitative
variable.
Quantitative, or Numerical, Variable: A variable that
quantifies an element of a population.
Note: Arithmetic operations such as addition and averaging,
are meaningful for data resulting from a quantitative variable.
Example: Identify each of the following examples as attribute (qualitative) or
numerical (quantitative) variables.

1. The residence hall for each student in a statistics class. QUALITATIVE


2. The amount of gasoline pumped by the next 10 customers at the local
Unimart. QUANTITATIVE
3. The amount of radon in the basement of each of 25 homes in a new
development. QUANTITATIVE
4. The color of the baseball cap worn by each of 20 students. QUALITATIVE
5. The length of time to complete a mathematics homework assignment.
QUANTITATIVE
6. The state in which each truck is registered when stopped and inspected at a
weigh station.
QUALITATIVE
Example: Identify each of the following as examples of qualitative or quantitative variables:

1. The temperature in Barrow, Alaska at 12:00 pm on any given
day. QUANTITATIVE
2. The automobile driven by each faculty member. QUALITATIVE
3. Whether or not a 6 volt lantern battery is defective. QUALITATIVE

4. The weight of a lead pencil. QUANTITATIVE


5. The length of time billed for a long distance telephone call.
QUANTITATIVE
6. The brand of cereal children eat for breakfast.
QUALITATIVE
7. The type of book taken out of the library by an adult.
QUALITATIVE
Qualitative and quantitative variables may be further subdivided:

VARIABLE

QUALITATIVE QUANTITATIVE

NOMINAL ORDINAL DISCRETE CONTINUOUS


Nominal Variable: A qualitative variable that categorizes (or describes, or
names) an element of a population.
Ordinal Variable: A qualitative variable that incorporates an ordered
position, or ranking.
Discrete Variable: A quantitative variable that can assume a countable
number of values. Intuitively, a discrete variable can assume values
corresponding to isolated points along a line interval. That is, there is a
gap between any two values.
Continuous Variable: A quantitative variable that can assume an
uncountable number of values. Intuitively, a continuous variable can
assume any value along a line interval, including every possible value
between any two values.
Note:
In many cases, a discrete and continuous variable may be distinguished by
determining whether the variables are related to a count or a
measurement.
1. Discrete variables are usually associated with counting. If the variable
cannot be further subdivided, it is a clue that you are probably dealing
with a discrete variable.
2. Continuous variables are usually associated with measurements. The
values of continuous variables are only limited by your ability to measure
them.
Example: Identify each of the following as examples of (1) nominal, (2)
ordinal, (3) discrete, or (4) continuous variables:
1. The length of time until a pain reliever begins to work. CONTINUOUS
2. The number of chocolate chips in a cookie. DISCRETE
3. The number of colors used in a statistics textbook. DISCRETE

4. The brand of refrigerator in a home. NOMINAL

5. The overall satisfaction rating of a new car. ORDINAL


6. The number of files on a computer’s hard disk. DISCRETE
7. The pH level of the water in a swimming pool. CONTINUOUS
8. The number of staples in a stapler. DISCRETE
IMPORTANT STATISTICAL TERMS

Population:
a set which includes all
measurements of interest
to the researcher
(The collection of all responses,
measurements, or counts that are of
interest)

Sample:
A subset of the population
Statistics is a tool for converting data into
information:

Data → STATISTICS → Information

But where then does data come from?


How is it gathered?
How do we ensure it is accurate?
Is the data reliable?
Is it representative of the population from which it was drawn?
SOURCES OF DATA

• PRIMARY DATA

• SECONDARY DATA
• PRIMARY DATA – Original data collected specifically for the purpose in mind, gathered
first-hand from the original source.

• SECONDARY DATA – Data that has already been collected by, and is readily available from,
other sources.
PRIMARY DATA
• SURVEY
▪ It is the most commonly used method in the social sciences, management,
marketing and, to some extent, psychology.
• Interview
▪ It is a face-to-face conversation with the respondent. Interviews are slow and expensive, and
they take people away from their regular jobs, but they allow in-depth questioning
and follow-up questions.
• Observation
▪ It can be made in a natural setting as well as in an artificially created environment.
• Questionnaire
▪ It is a list of questions, either open-ended or closed-ended, to which the
respondent gives answers.
SECONDARY DATA

• Published Printed Resources
▪ Books
▪ Journals/Periodicals
▪ Magazines/Newspapers

• Published Electronic Resources
▪ E-journals
▪ General websites
▪ Weblogs
SAMPLING

Sampling is the process of selecting the
elements of a sample from the population
being studied.

Two Types of Sampling


• Non-probability samples

• Probability samples
NON PROBABILITY SAMPLES

• Probability of being chosen is unknown.


• Cheaper, but results cannot be generalised; potential for
bias
▪ Convenience sampling (ease of access): the sample is selected from elements of a
population that are easily accessible
▪ Snowball sampling (friend of a friend, etc.)
▪ Purposive sampling (judgemental)
• You choose who you think should be in the study
▪ Quota sampling
• A method in which researchers create a sample involving individuals that
represent a population
PROBABILITY SAMPLES

• Random sampling
• Each subject has a known probability of being selected
• Allows application of statistical sampling theory to results
to:
• Generalise
• Test hypotheses
METHODS USED IN PROBABILITY SAMPLES

▪ Simple random sampling
▪ Systematic sampling
▪ Stratified sampling
▪ Cluster sampling
SIMPLE RANDOM SAMPLING

A simple random sample is a sample selected in such a way that
every possible sample of the same size is equally likely to be
chosen.

Drawing three names from a hat containing all the names of the
students in the class is an example of a simple random sample:
any group of three names is as likely to be picked as any other
group of three names.
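Drawing names from a hat can be imitated with `random.sample` from Python's standard library, which selects without replacement so that every group of the requested size is equally likely. The class list below is a set of placeholder names, not data from the slides.

```python
import random

# Hypothetical class list (placeholder names).
students = ["Ana", "Ben", "Carla", "Dino", "Ella", "Felix", "Gina", "Hugo"]

rng = random.Random(42)  # fixed seed only so the example is repeatable

# Draw three different names; no name can be picked twice.
chosen = rng.sample(students, 3)

print(chosen)
```

Removing the fixed seed (using `random.sample(students, 3)` directly) gives a fresh draw on every run, which is what an actual lottery would need.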
STRATIFIED RANDOM SAMPLING

A stratified random sample is obtained by separating the population into
mutually exclusive sets, or strata, and then drawing simple random samples from
each stratum.
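One common way to decide how many elements to draw from each stratum is proportional allocation: each stratum receives (stratum size / N) × n of the sample. The sketch below applies that rule to the 75 male / 25 female figures used in this section's example. The function names and the member labels (`M1`, `F1`, ...) are hypothetical, and `round()` is an assumed rounding rule.

```python
import random

def proportional_allocation(strata_sizes, sample_size):
    """Return each stratum's share of the sample: (stratum size * n) / N."""
    total = sum(strata_sizes.values())
    return {name: size * sample_size / total for name, size in strata_sizes.items()}

def stratified_sample(strata, sample_size, rng):
    """Draw a simple random sample from every stratum, sized proportionally."""
    sizes = {name: len(members) for name, members in strata.items()}
    shares = proportional_allocation(sizes, sample_size)
    # Shares are rounded to whole numbers; with fractional shares the
    # rounded parts may need a manual check that they still add up to n.
    return {name: rng.sample(members, round(shares[name]))
            for name, members in strata.items()}

# Placeholder member labels for the two strata.
strata = {
    "male": [f"M{i}" for i in range(1, 76)],
    "female": [f"F{i}" for i in range(1, 26)],
}

sample = stratified_sample(strata, 20, random.Random(1))
print({name: len(members) for name, members in sample.items()})
```

With these sizes the shares come out as whole numbers (15 and 5), so no rounding adjustment is needed.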
EXAMPLE:

Male: 75
Female: 25
Total population: 100
Sample size: 20

\[ \text{Male} = \frac{75}{100} \times 20 = \frac{1500}{100} = 15 \]

\[ \text{Female} = \frac{25}{100} \times 20 = 5 \]
Systematic sampling

Sampling fraction
Ratio between sample size and population size

• The sampling interval is
\[ k = \frac{N}{n} \]
where
k = Interval
N = Population Size
n = sample size
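A minimal sketch of systematic selection, assuming the interval is computed as k = N / n and rounded to the nearest whole number: starting from a given position, every k-th element of the numbered list is taken. The function name `systematic_sample` and its parameters are hypothetical.

```python
def systematic_sample(population_size, sample_size, start):
    """Return the 1-based positions picked by systematic sampling."""
    k = round(population_size / sample_size)  # sampling interval k = N / n, rounded
    positions = [start + i * k for i in range(sample_size)]
    # Drop any position that runs past the end of the numbered list.
    return [p for p in positions if p <= population_size]

picked = systematic_sample(550, 50, start=5)
print(picked[:5])
```

When N / n is not a whole number, rounding k can push the last few positions past the end of the list, so fewer than n elements are returned; the slides do not say how to handle that case.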
CLUSTER SAMPLING

A cluster sample is a simple random sample of groups or clusters of elements
(vs. a simple random sample of individual objects).

This method is useful when it is difficult or costly to develop a complete list of
the population members or when the population elements are widely dispersed
geographically.

Cluster sampling may increase sampling error due to similarities among cluster
members, e.g. members crowding together in the same area or neighborhood.

[Figure: cluster sampling, with the population divided into Section 1 to Section 5]
EXAMPLE
The clinic teacher wants to determine the average height
and weight of the first year students. How can she
randomly select 50 students from a population of 250 male
students and 300 female students using (a) the simple random
technique? (b) the systematic random technique (start at 5)?
(c) the stratified random technique?

Given
Male students: 250
Female students: 300
Sample size (n): 50
Total population (N): 550

a. Simple random technique
Randomly select the sample by drawing lots.

b. Systematic random technique (start at 5)
Number the students 1, 2, 3, ..., 550. The interval is k = N/n = 550/50 = 11,
so starting at 5 the selected students are 5, 16, 27, 38, 49, ...

c. Stratified random technique
\[ \text{Male} = \frac{250}{550} \times 50 = \frac{12500}{550} = 22.73 \approx 23 \]

\[ \text{Female} = \frac{300}{550} \times 50 = \frac{15000}{550} = 27.27 \approx 27 \]
A researcher wants to know the average age of
teachers in a certain community. Fifty teachers
from the elementary and 25 teachers from the
secondary levels were interviewed for the
purpose. How will the researcher choose a
sample size of 20 using:
a. simple random sampling
b. systematic random sampling (start at 3)
c. stratified random sampling

Given
Elementary: 50
Secondary: 25
Sample size (n): 20
Total population (N): 75

a. Simple random technique
Randomly select the sample by drawing lots.

b. Systematic random technique (start at 3)
Number the teachers 1, 2, 3, ..., 75. The interval is k = N/n = 75/20 = 3.75 ≈ 4,
so starting at 3 the selected teachers are 3, 7, 11, 15, 19, ...

c. Stratified random technique
\[ \text{Elementary} = \frac{50}{75} \times 20 = \frac{1000}{75} = 13.33 \approx 13 \]

\[ \text{Secondary} = \frac{25}{75} \times 20 = \frac{500}{75} = 6.67 \approx 7 \]
PRESENTATION OF
DATA
THREE FORMS

• TEXTUAL FORM
• TABULAR FORM
• GRAPHICAL FORM
TEXTUAL FORM
• This is the simplest method of presenting data when there
are few numbers to be presented.

The performance of instructors and professors at the State
Universities and Colleges in Region I (Ilocos Region) is as
follows: 15 or 8.33 percent have outstanding performance; 80 or
44.44 percent have very satisfactory performance; 55 or 30.56
percent, satisfactory performance; and 30 or 16.67 percent, fairly
satisfactory performance.
TABULAR FORM
• Data are presented by means of statistical tables, systematically
arranged in rows and columns. Each category in the table is placed in a
row or column and the data are placed in their respective cells.

Table 2.2. Performance of Instructors and Professors at the State Universities and Colleges in Region 1
(Ilocos Region)

Instructors and Professors Performance    Frequency    Percentage (%)
Outstanding                               15           8.33
Very Satisfactory                         80           44.44
Satisfactory                              55           30.56
Fairly Satisfactory                       30           16.67
Total                                     180          100.00


FOUR ESSENTIAL PARTS

• Table caption
• Stub
• Box heads
• Body of the table
TABLE CAPTION
This includes the table number and heading. The researcher usually uses a double number for tables,
wherein the first number refers to the chapter number and the second number refers to the number of
the table within the chapter.

Table 2.2. Performance of Instructors and Professors at the State Universities and Colleges in Region 1
(Ilocos Region)
STUB
This refers to the row labels of the table, which are found at the left.

Outstanding
Very Satisfactory
Satisfactory
Fairly Satisfactory
BOX HEADS

These are the headings within the box of the table wherein the data are emphasized.

Instructors and Professors Performance Frequency Percentage (%)


BODY OF THE TABLE

This refers to the main part of the table containing the figures, which are placed in columns aligned with the
box heads.

15 8.33
80 44.44
55 30.56
30 16.67
180 100.00
FREQUENCY DISTRIBUTION

• It is a tabulation or grouping of data
into appropriate categories showing
the number of observations in each
group or category.
CATEGORICAL FREQUENCY DISTRIBUTION

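CLASS    FREQUENCY    PERCENTAGE

TOTAL

\[ \text{Percentage}(\%) = \frac{n}{N} \times 100 \]

where n is the frequency of the class and N is the total number of observations.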
EXAMPLE
Given data: Marital status of certain individuals.

Single Married Married Married Single


Widow Single Married Single Single
Married Married Single Single Married
Married Single Single Married Single
Single Single Widow Widow Married
Single Single Married Single Single
Widow Widow Single Married Widow
Married Single Single Married Single
Single Single Single Single Single
CLASS      FREQUENCY    PERCENTAGE
SINGLE     25           55.56
WIDOW      6            13.33
MARRIED    14           31.11
TOTAL      45           100

\[ \text{Percentage}(\%) = \frac{n}{N} \times 100 \]

\[ \text{Single} = \frac{25}{45} \times 100 = 55.56 \]

\[ \text{Widow} = \frac{6}{45} \times 100 = 13.33 \]

\[ \text{Married} = \frac{14}{45} \times 100 = 31.11 \]
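The tally and percentage steps of the marital-status example can be sketched with `collections.Counter`. To keep the block short, the 45 responses are rebuilt from the class counts in the example (25 Single, 14 Married, 6 Widow) rather than typed out one by one; the variable names are illustrative.

```python
from collections import Counter

# The 45 responses, rebuilt from the counts tallied in the example.
responses = ["Single"] * 25 + ["Married"] * 14 + ["Widow"] * 6

counts = Counter(responses)      # frequency of each class
total = sum(counts.values())     # N, the total number of observations

# Percentage (%) = n / N * 100, rounded to two decimal places.
table = {category: (n, round(n / total * 100, 2)) for category, n in counts.items()}

for category, (frequency, percentage) in table.items():
    print(f"{category:<8} {frequency:>3} {percentage:>6.2f}")
print(f"{'TOTAL':<8} {total:>3}")
```

Passing the raw list of answers to `Counter` instead of the rebuilt one gives the same table and removes the need to tally by hand.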
EXAMPLE
Given data: Year level of students in certain school.
Sophomore Junior Freshman Senior
Sophomore Senior Freshman Senior
Freshman Freshman Sophomore Junior
Junior Sophomore Junior Sophomore
Junior Senior Senior Junior
Senior Junior Sophomore Freshman
Sophomore Senior Senior Sophomore
Junior Sophomore Junior Freshman
Freshman Freshman Freshman Freshman
CLASS        FREQUENCY    PERCENTAGE
FRESHMAN     10           27.78
SOPHOMORE    9            25
JUNIOR       9            25
SENIOR       8            22.22
TOTAL        36           100
NUMERICAL FREQUENCY TABLE
Class Limit (Integral) | Class Boundaries (Real Limit) | Frequency | Class Mark/Midpoint | Cumulative Frequency (<) | Cumulative Frequency (>) | RF | %

TOTAL
PARTS OF FREQUENCY TABLE

1. Class Limits – groupings or categories defined by lower and upper limits
2. Class Size – width of each class interval
3. Class Boundaries – the numbers used to separate classes without the gaps created by class limits
4. Class Marks – the midpoints of the lower and upper class limits
STEPS IN CONSTRUCTING A FREQUENCY
DISTRIBUTION TABLE

•  
CUMULATIVE FREQUENCY DISTRIBUTION

The “less than” cumulative frequency distribution (<cf) is
obtained by adding frequencies successively from the
lowest to the highest class interval, while the “more than” cumulative
frequency distribution (>cf) is obtained by adding
frequencies successively from the highest class interval to the lowest
class interval.
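Both running totals can be sketched with `itertools.accumulate`. The class frequencies used here are the ones from the ages-of-children table in this deck, listed from the lowest class (2–4) to the highest (17–19); the variable names are illustrative.

```python
from itertools import accumulate

# Class frequencies, ordered from the lowest to the highest class interval.
frequencies = [3, 9, 14, 18, 10, 6]

# "Less than" cumulative frequency: add from the lowest class upward.
less_than_cf = list(accumulate(frequencies))

# "More than" cumulative frequency: add from the highest class downward,
# then flip the result back so it lines up with the class order.
more_than_cf = list(accumulate(frequencies[::-1]))[::-1]

print(less_than_cf)
print(more_than_cf)
```

The last entry of the less-than column and the first entry of the more-than column both equal the total number of observations, which is a quick check on the arithmetic.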
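RELATIVE FREQUENCY DISTRIBUTION

• The relative frequency (RF) of a class is its frequency divided by the total number of observations,
\[ RF = \frac{f}{N} \]
and the percentage is RF × 100.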
Ages of children in a community

5 13 8 6 13 10 5 13 15 16
8 12 15 10 12 16 12 9 3 7
11 15 11 7 15 2 13 5 9 12
13 9 12 9 9 14 12 11 19 13
16 18 3 13 18 10 15 14 18 11
10 12 6 9 5 17 9 6 9 18

Stem, LEAF
0 | 2,3,3,5,5,5,5,6,6,6,7,7,8,8,9,9,9,9,9,9,9,9
1 | 0,0,0,0,1,1,1,1,2,2,2,2,2,2,2,3,3,3,3,3,3,3,4,4,5,5,5,5,5,6,6,6,7,8,8,8,8,9

RANGE = HV – LV = 19 – 2 = 17
Class Width: k = 3

\[ \text{Class Mark} = \frac{\text{lower class limit} + \text{upper class limit}}{2}, \quad \text{e.g.} \ \frac{2 + 4}{2} = 3 \]

\[ \% = \frac{f}{N} \times 100 \]
Class Limit    Class Boundaries    Frequency    Class Mark/    Cumulative Frequency    RF      %
(Integral)     (Real Limit)                     Midpoint       <cf        >cf

2 – 4          1.5 – 4.5           3            3              3          60           0.05    5
5 – 7          4.5 – 7.5           9            6              12         57           0.15    15
8 – 10         7.5 – 10.5          14           9              26         48           0.23    23
11 – 13        10.5 – 13.5         18           12             44         34           0.30    30
14 – 16        13.5 – 16.5         10           15             54         16           0.17    17
17 – 19        16.5 – 19.5         6            18             60         6            0.10    10
TOTAL          ------              60                                                  1       100
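The table above can be rebuilt from the raw ages with a short loop. This is a sketch under assumptions read off the table itself: a class width of 3, the first class starting at the lowest value, and class boundaries placed 0.5 below and above the integer class limits. The dictionary keys are illustrative names.

```python
ages = [
    5, 13, 8, 6, 13, 10, 5, 13, 15, 16,
    8, 12, 15, 10, 12, 16, 12, 9, 3, 7,
    11, 15, 11, 7, 15, 2, 13, 5, 9, 12,
    13, 9, 12, 9, 9, 14, 12, 11, 19, 13,
    16, 18, 3, 13, 18, 10, 15, 14, 18, 11,
    10, 12, 6, 9, 5, 17, 9, 6, 9, 18,
]

width = 3            # class width, as used in the table
total = len(ages)    # N

rows = []
lower = min(ages)    # the first class starts at the lowest value
while lower <= max(ages):
    upper = lower + width - 1
    frequency = sum(lower <= age <= upper for age in ages)
    rows.append({
        "limits": (lower, upper),
        "boundaries": (lower - 0.5, upper + 0.5),  # assumes integer-valued data
        "frequency": frequency,
        "class_mark": (lower + upper) / 2,
        "rf": frequency / total,
    })
    lower += width

for row in rows:
    print(row["limits"], row["boundaries"], row["frequency"],
          row["class_mark"], round(row["rf"], 2))
```

Changing `width` regroups the same data into wider or narrower classes, which is a quick way to see how the choice of class width changes the shape of the distribution.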
GRAPHICAL
REPRESENTATION
GRAPHICAL FORM

• It is a geometric image or a mathematical
picture of a set of data. Presenting the
data in this form gives a clearer picture to
the readers.
LINE GRAPH

• It is made by plotting the data with a dot
and connecting the plotted points by
means of straight lines.
LINE GRAPH

[Line graph. Vertical axis: FREQUENCY (0 to 90). Plotted points: Outstanding 15, Very Satisfactory 80, Satisfactory 55, Fairly Satisfactory 30.]

Figure 2.1. Performance of Instructors and Professors at the State Universities and Colleges in Region 1
(Ilocos Region)
BAR GRAPH

Bar Graph is another way of presenting data in
graphical form. It represents data by areas in the
form of vertical rectangles or bars. The bars are
drawn with bases equal to each other, and the
height of each bar corresponds to the value of the category shown on the X-axis.
BAR GRAPH

[Bar graph. Vertical axis: FREQUENCY (0 to 90). Bars: Outstanding 15, Very Satisfactory 80, Satisfactory 55, Fairly Satisfactory 30.]

Figure 2.2. Performance of Instructors and Professors at the State Universities and Colleges in Region 1
(Ilocos Region)
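As a rough stand-in for the bar graph, the sketch below prints a text chart of the same performance data, with one `#` for every 5 observations. The bars run horizontally here for convenience in plain text, unlike the vertical bars described in the slide, and the function name and the `unit` parameter are hypothetical.

```python
# Frequencies from Table 2.2.
performance = {
    "Outstanding": 15,
    "Very Satisfactory": 80,
    "Satisfactory": 55,
    "Fairly Satisfactory": 30,
}

def text_bar_chart(data, unit=5):
    """Return one line per category; each '#' stands for `unit` observations."""
    label_width = max(len(label) for label in data)
    return [
        f"{label:<{label_width}} | {'#' * (value // unit)} {value}"
        for label, value in data.items()
    ]

chart_lines = text_bar_chart(performance)
print("\n".join(chart_lines))
```

Because every bar uses the same scale, the lengths can be compared directly, which is the property the bar graph relies on.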
CIRCLE GRAPH

• Circle Graph or Pie Graph is a way of presenting
data in circular form. The data divide the circle into
parts and are represented in percent or in actual
figures.
CIRCLE GRAPH

[Circle graph with four sectors: Outstanding 15, Very Satisfactory 80, Satisfactory 55, Fairly Satisfactory 30.]

Figure 2.3. Performance of Instructors and Professors at the State Universities and Colleges in Region 1
(Ilocos Region)
PICTOGRAPH

It is a kind of graph which uses pictures or
symbols to represent information.

DAILY PROFIT OF FISH VALUE-ADDED PRODUCTS FOR FIVE DAYS

[Pictograph with one row of symbols each for MONDAY, TUESDAY, WEDNESDAY, THURSDAY, and FRIDAY. Each symbol stands for 100 pesos.]

Figure 2.5. Daily Profit of Fish Value-Added Products for Five Days
HISTOGRAM

• Frequently used to graphically present interval
and ratio data
• The adjacent bars indicate that a numerical
range is being summarized by indicating the
frequencies in arbitrarily chosen classes
FREQUENCY POLYGON
• Another common method for graphically presenting
interval and ratio data
• To construct a frequency polygon mark the frequencies on
the vertical axis and the values of the variable being
measured on the horizontal axis, as with the histogram.
• If the purpose of presenting the data is comparison with other
distributions, the frequency polygon provides a good
summary of the data
OGIVE
• A graph of a cumulative frequency distribution
• Ogive is used when one wants to determine how many
observations lie above or below a certain value in a
distribution.
• First, a cumulative frequency distribution is constructed
• The cumulative frequencies are then plotted at the upper class
limit of each category
• Ogive can also be constructed for a relative frequency
distribution.
