FUNCTIONS OF STATISTICS
Branches of Statistics
1. Descriptive Statistics. This type of statistics is used to describe a group of
individuals or describe the data that have been collected. In short, this type of
statistics is devoted to summarization and description of data sets.
Statistical tools used: frequencies, percentages, measures of central
tendency, graphs, measures of variability
2. Inferential Statistics. This type of statistics is used when one makes a decision,
estimate, prediction, or generalization about a population based on a sample.
a. Parametric Test
- a test of significance appropriate when the data represent an interval or
ratio scale of measurement
- it is more powerful; the sample size is large (n > 30)
- the distribution is normal
- sampling is done at random
b. Non-parametric Test
- a test of significance appropriate when the data represent an ordinal or
nominal scale
- the sample size is small
- no particular distribution is assumed (distribution-free)
- the samples are not randomized (purposive)
Scales of Measurement
1. Nominal
- involves naming or labeling; that is, placing cases into categories and
counting their frequency of occurrence
- distinguishes responses into attributes or categories
ex. religion, gender (real dichotomy), nationality, aggression (either active
or passive; artificial dichotomy)
2. Ordinal
- distinguishes among categories arranged in rank order, grouped
according to rank or range
Examples of ordinal scale: military rank; rank-ordering of
socio-economic status (high, middle, low); state of happiness (very happy, not so
happy, unhappy, very unhappy); rank in an oratorical contest (it cannot be
concluded that the 1st placer is twice as good as the 2nd placer).
4. Ratio - like interval measurements, ratio measurements are expressed in numbers and the
differences between any two successive numbers are consistent
- it has a true zero, meaning measurement starts at zero
ex. number of children, height, speed, capacity, years of experience
Sources of Data
1. Primary - from eyewitnesses or ear-witnesses of past events
2. Secondary - information furnished by a person who was not a direct observer or
participant of the event
3. Documentary data - data obtained from records of offices, hospitals, etc.
Slovin’s Formula
\[ n = \frac{N}{1 + N e^{2}} \]
where: n = sample size
N = population size
e = desired margin of error
Lynch Formula
\[ n = \frac{N Z^{2} p (1 - p)}{N d^{2} + Z^{2} p (1 - p)} \]
where: n = sample size
N = population size
Z = the standard value (2.58) at the 1% level of probability with
0.99 reliability (1.96 at the 5% level)
d = margin of error (0.05)
p = largest possible proportion (0.50), used to get the correct
number of samples from the population
Solve for the sample size for N = 3590 using Slovin’s formula and the Lynch
formula. Compare the results.
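Using Slovin’s formula with e = 0.05,
\[ n = \frac{3590}{1 + 3590 (0.05)^{2}} = \frac{3590}{9.975} \]
n = 360
Using the Lynch formula with Z = 1.96, d = 0.05, and p = 0.50,
\[ n = \frac{3447.836}{9.9354} \]
n = 347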
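The two computations above can be checked with a short script. This is a minimal sketch: the function names `slovin` and `lynch` are illustrative, and it assumes e = 0.05 for Slovin's formula and Z = 1.96, d = 0.05, p = 0.50 for the Lynch formula, which are the values that reproduce the results shown.

```python
def slovin(N, e):
    """Slovin's formula: n = N / (1 + N e^2)."""
    return N / (1 + N * e ** 2)


def lynch(N, Z, d, p):
    """Lynch formula: n = N Z^2 p(1-p) / (N d^2 + Z^2 p(1-p))."""
    spread = Z ** 2 * p * (1 - p)
    return N * spread / (N * d ** 2 + spread)


N = 3590
n_slovin = slovin(N, e=0.05)                 # assumed 5% margin of error
n_lynch = lynch(N, Z=1.96, d=0.05, p=0.50)   # assumed 5% level, p = 0.50

print(round(n_slovin), round(n_lynch))  # 360 347
```

With these inputs the Lynch formula gives a slightly smaller sample than Slovin's formula.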
Using stratified random sampling, get the sample size for each group of
respondents using Slovin’s formula and the Lynch formula.
a. Using Slovin’s formula, the multiplier is n/N, which is 360/3590.
1. 40 x (360/3590) = 4
2. 385 x (360/3590) = 39
DMMMSU Community
b. Using the Lynch formula, the multiplier is n/N, which is 347/3590.
1. 40 x (347/3590) = 4
2. 385 x (347/3590) = 37
DMMMSU Community
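The proportional allocation above can be sketched as follows. The helper name `allocate` is illustrative, and only the two group sizes that appear in the worked items (40 and 385) are used; the remaining groups are not listed in the text.

```python
def allocate(stratum_size, n, N):
    """Share of the total sample n given to one stratum: stratum_size x (n / N)."""
    return round(stratum_size * n / N)


N = 3590
group_sizes = [40, 385]  # the two groups shown in the worked items

slovin_shares = [allocate(size, 360, N) for size in group_sizes]
lynch_shares = [allocate(size, 347, N) for size in group_sizes]

print(slovin_shares)  # [4, 39]
print(lynch_shares)   # [4, 37]
```

Because each share is rounded separately, the shares of all strata may not add up to exactly n; a small manual adjustment is sometimes needed.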
Exercise: 1. Solve for the sample size using slovin’s and lynch formula
Exercise: 2. Solve for the sample size using slovin’s and lynch formula
SAMPLING TECHNIQUE
Sampling - selecting a part of the population to represent the whole
population
Non-probability sampling (selective) - not all members are given an equal chance of
being selected
TOOLS
1. Questionnaire
- the researcher puts his questions on paper and asks the respondents to
answer them
- survey form
- paper-and-pencil data gathering method
Guidelines:
a. Make all directions clear.
b. Use correct grammar.
c. Make all questions unequivocal.
d. Avoid asking biased questions.
e. Objectify the responses.
f. Relate all questions to the topic under study.
g. Create categories or classes for approximate answers.
h. Group the questions in logical sequence.
i. Create a sufficient number of response categories.
j. Word carefully or avoid questions that deal with confidential or
embarrassing information.
k. Explain and illustrate difficult questions.
l. State all questions affirmatively.
m. Make as many questions as would supply adequate information for the
study.
n. Add a catch-all word or phrase to the options of multiple-response
questions.
o. Place all spaces for replies at the left side.
p. Make the respondents anonymous.
For discussion: What are the advantages and disadvantages of using a
questionnaire?
2. Interview
- a purposeful face-to-face relationship between two persons: the
interviewer, who asks questions to gather information, and the interviewee
or respondent, who supplies the information asked for.
3. Observation
- a means of gathering information for research; it may be defined as
perceiving data through the senses: sight, hearing, taste, touch, and smell. It is
widely used in studying behavior.
4. Tests
- a specific type of measuring instrument whose general characteristic is
that it forces responses from a person, and the responses are considered to be
indicative of the person’s skill, knowledge, attitudes, etc.
Classification
A. According to Standardization
1. Standard test - prepared by specialists; norms are established
2. Non-standard test - prepared by teachers to measure the achievement of
their students.
B. According to Function
1. Psychological tests, such as intelligence, aptitude, and personality tests and
vocational and professional interest inventories
Characteristics:
1. Validity
2. Reliability
a. Adequacy
b. Objectivity
c. Same procedure and condition
3. Usability
5. Registration
- is a process of listing down items of the same kind in some systematic
manner for record purposes.
-registered matter may be classified alphabetically, chronologically,
quantitatively, qualitatively or otherwise.
Reliability
1. Test-retest method
The same measuring instrument is administered twice to the same group
of subjects. The scores of the first and second administrations of the test are
correlated, and the resulting correlation coefficient estimates the reliability.
The disadvantages are:
1. When the time interval is short, memory effects may operate. The subjects may
recall their previous responses, which tends to make the correlation of the test high.
2. When the interval is long, factors such as unlearning and forgetting, among others,
may occur and may result in a low correlation of the test.
3. Regardless of the time interval separating the two administrations, other
varying environmental conditions such as noise, temperature, lighting, and other
factors may affect the correlation of the test.
\[ r = 1 - \frac{6 \sum D^{2}}{N^{3} - N} \]
where: \sum D^{2} = the sum of the squared differences between ranks
N = the total number of cases
For example, 10 students in second year high school are used as a pilot sample to
test the reliability of an achievement test in Biology. Determine the reliability
coefficient given their scores in the two administrations of the test.
Students s1 s2
1 89 90
2 85 85
3 77 76
4 80 81
5 83 83
6 87 85
7 90 90
8 73 72
9 85 85
10 80 83
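The example can be worked through in code. This is a sketch under one stated assumption: tied scores are given the average of the ranks they occupy, a common convention that the text does not spell out. With ties present, the formula r = 1 - 6ΣD²/(N³ - N) is an approximation. The function names are illustrative.

```python
def average_ranks(scores):
    """Rank scores from highest (rank 1) to lowest; tied scores share the mean rank."""
    ordered = sorted(scores, reverse=True)
    ranks = {}
    for value in set(scores):
        first = ordered.index(value) + 1        # best position held by this value
        count = ordered.count(value)            # number of tied cases
        ranks[value] = first + (count - 1) / 2  # mean of the tied positions
    return [ranks[s] for s in scores]


def rank_correlation(x, y):
    """r = 1 - 6 * sum(D^2) / (N^3 - N), where D is the difference between ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    n = len(x)
    return 1 - 6 * sum_d2 / (n ** 3 - n)


s1 = [89, 85, 77, 80, 83, 87, 90, 73, 85, 80]  # first administration
s2 = [90, 85, 76, 81, 83, 85, 90, 72, 85, 83]  # second administration

r = rank_correlation(s1, s2)
print(round(r, 3))  # 0.979
```

Under this tie-handling assumption ΣD² is 3.5, so r is about 0.98, which indicates highly consistent scores across the two administrations.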
2. Split-half Method
The test in this method may be administered once, but the test items are
divided into two halves. The common procedure is to divide the test into odd and
even items. The two halves of the test must be similar but not identical in content,
difficulty, means and standard deviations. Each student obtains two scores, one
on the odd and the other on the even items, in one test. The scores obtained in
the two halves are correlated. The result is the reliability coefficient for a half test.
Since this reliability holds only for a half test, the reliability coefficient for the whole
test may be estimated by using the Spearman-Brown formula. This formula is:
\[ r_{wt} = \frac{2 r_{ht}}{1 + r_{ht}} \]
where:
r_{wt} = reliability of the whole test
r_{ht} = reliability of the half test
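As a quick illustration of the Spearman-Brown step, the sketch below applies the formula to a hypothetical half-test correlation of 0.60. This value is made up for the example and does not come from the text.

```python
def spearman_brown(r_half):
    """Whole-test reliability from the half-test correlation: 2r / (1 + r)."""
    return 2 * r_half / (1 + r_half)


r_half = 0.60  # hypothetical correlation between odd-item and even-item scores
r_whole = spearman_brown(r_half)
print(round(r_whole, 2))  # 0.75
```

For positive correlations below 1, the whole-test estimate is larger than the half-test correlation, which reflects the general tendency of longer tests to be more reliable.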
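3. Kuder-Richardson 21 Formula
\[ r = \frac{n \sigma^{2} - M (n - M)}{(n - 1) \sigma^{2}} \]
where:
\[ \sigma^{2} = \frac{\sum (x - M)^{2}}{N} \]
\[ M = \frac{\sum x}{N} \]
n = number of items in the test
N = number of respondents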
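The formula can be sketched in code as below. The scores and the 20-item test length are hypothetical values chosen only to make the arithmetic easy to follow; n is taken to be the number of test items, as is standard for the Kuder-Richardson 21 formula.

```python
def kr21(scores, n_items):
    """Kuder-Richardson 21: r = (n*var - M*(n - M)) / ((n - 1)*var)."""
    N = len(scores)                              # number of respondents
    M = sum(scores) / N                          # mean score
    var = sum((x - M) ** 2 for x in scores) / N  # population variance
    return (n_items * var - M * (n_items - M)) / ((n_items - 1) * var)


scores = [8, 12, 14, 16, 10]  # hypothetical total scores of five respondents
r = kr21(scores, n_items=20)  # hypothetical 20-item test
print(round(r, 3))  # 0.421
```

Here M = 12 and the variance is 8, so r = (160 - 96) / 152, or about 0.42.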
Presentation of Data
The mode is used with nominal data or with any distribution when a quick
estimate is needed.
The median is used with ordinal data or higher, especially when the data
depart from normality or the distribution is badly skewed.
The mean is used with interval or ratio data. It is associated with a
symmetrical or normal distribution.
Mean
The arithmetic mean or simply mean (popularly called the average) is the
sum of all the scores divided by the number of scores.
The formula is
\[ \bar{x} = \frac{\sum x}{N} \]
where \sum x is the sum of all the scores
N is the number of scores
Solve for the mean of the following numbers:
2, 4, 6, 8, 10, 15, 20, 25, 30, 45, 50
Score (x)    Frequency (f)
30           2
20           3
26           3
25           20
23           18
21           6
19           3
18           2
total (N)    57
Use
\[ \bar{x} = \frac{\sum f x}{N} \]
where \sum f x is the sum of the products of the frequency and each score.
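The frequency-table mean can be sketched as follows, using the eight score and frequency pairs listed above, whose frequencies add up to the stated total of 57. The values are taken as printed; the variable names are illustrative.

```python
# (score, frequency) pairs as printed in the table
table = [(30, 2), (20, 3), (26, 3), (25, 20), (23, 18), (21, 6), (19, 3), (18, 2)]

N = sum(f for _, f in table)          # total number of scores
sum_fx = sum(x * f for x, f in table) # sum of frequency times score
mean = sum_fx / N

print(N, sum_fx, round(mean, 2))  # 57 1331 23.35
```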
Weighted Mean
Research Question: What is the perceived level of instructional competence
along teaching skills of Mathematics Instructors in the HEIs of La Union?
Level of Instructional Competence                   Student  Teacher  Head  Mean  DE
                                                    (345)    (23)     (15)
1. Substantiality of teaching                       3.83     4.29     4.54  3.89  VG
2. Quality of faculty member’s explanation          3.71     4.15     4.42  3.76  VG
3. Receptivity to students’ ideas and contribution  3.68     3.95     4.18  3.72  VG
4. Quality of questioning procedure                 3.70     4.10     4.43  3.75  VG
5. Selection of teaching methods                    3.56     4.11     4.15  3.62  VG
6. Quality of information and communication         2.70     3.20     3.01  2.74  VG
   technology utilized
Mean                                                3.53     3.97     4.12  3.58  VG
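The Mean column appears to be a weighted mean of the three group ratings, with the group sizes (345 students, 23 teachers, 15 heads) as weights. This is an inference from the table rather than something the text states, but it reproduces the tabulated values, as the sketch below shows for items 1 and 6. The function name is illustrative.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)


group_sizes = [345, 23, 15]  # students, teachers, heads

item1 = weighted_mean([3.83, 4.29, 4.54], group_sizes)
item6 = weighted_mean([2.70, 3.20, 3.01], group_sizes)

print(round(item1, 2), round(item6, 2))  # 3.89 2.74
```

Because the students far outnumber the other two groups, each weighted mean stays close to the student rating.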
The median is the midpoint of the distribution. Half of the values in the
distribution fall below the median and the other half above it. For distributions
having an even number of arrayed observed values, the median is the average of
the two middlemost values. For an odd number of arrayed observations, it is the
middlemost value.
THE MODE
The mode of a given set of data is the value that appears with the highest
(greatest) frequency. That is, it is the value that appears most often.
A distribution with more than two modes is multimodal. It is also possible that a
distribution may not have any mode at all.
Mo = 3(Median) - 2(Mean)
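The median and mode rules can be illustrated with Python's standard `statistics` module. The small data sets below are hypothetical, and `empirical_mode` is an illustrative name for the relation Mo = 3(Median) - 2(Mean), which is generally treated as an estimate rather than an exact value.

```python
import statistics

odd_count = [3, 5, 7, 9, 11]  # hypothetical data, odd number of values
even_count = [3, 5, 7, 9]     # hypothetical data, even number of values

median_odd = statistics.median(odd_count)    # middlemost value
median_even = statistics.median(even_count)  # average of the two middlemost values

# multimode returns every value tied for the highest frequency
modes = statistics.multimode([2, 3, 3, 4, 5, 5])


def empirical_mode(median, mean):
    """Estimate of the mode from the median and the mean."""
    return 3 * median - 2 * mean


estimate = empirical_mode(median=24, mean=25)  # hypothetical summary values

print(median_odd, median_even, modes, estimate)
```

When every value occurs once, `multimode` returns all of them, which corresponds to a distribution with no mode in the sense used above.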
MEASURES OF VARIABILITY
- Variability indicates the extent to which values in a distribution are spread around the
central tendency. These measures describe how item values cluster or scatter in
a distribution.
Range, Mean Absolute Deviation, Interquartile Range, Standard Deviation,
Variance, Quartile Deviation
Formulas:
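\[ MAD = \frac{\sum |x - \bar{x}|}{N} \]
where: x = each score
\bar{x} = the mean
\[ S^{2} \text{ (variance)} = \frac{\sum (x - \bar{x})^{2}}{N} \]
S, or standard deviation, is the square root of the variance:
\[ S = \sqrt{\frac{\sum (x - \bar{x})^{2}}{N}} \]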
The greater the value of a measure of variability, the more varied the group is.
Consider the following data sets:
Set A: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
Set B: 4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8
Which of the two sets of data is more variable or more spread out?
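One way to answer the question is to compute the measures for both sets. The sketch below uses the formulas given above, dividing by N; the function names are illustrative.

```python
import math


def data_range(data):
    return max(data) - min(data)


def mad(data):
    """Mean absolute deviation: sum(|x - mean|) / N."""
    m = sum(data) / len(data)
    return sum(abs(x - m) for x in data) / len(data)


def variance(data):
    """Variance with N in the denominator: sum((x - mean)^2) / N."""
    m = sum(data) / len(data)
    return sum((x - m) ** 2 for x in data) / len(data)


def std_dev(data):
    """Standard deviation: square root of the variance."""
    return math.sqrt(variance(data))


set_a = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
set_b = [4, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8]

for name, data in (("A", set_a), ("B", set_b)):
    print(name, data_range(data), round(mad(data), 3),
          round(variance(data), 3), round(std_dev(data), 3))
# A 10 2.727 10.0 3.162
# B 4 0.909 1.273 1.128
```

Both sets have a mean of 6, but every measure is larger for Set A, so Set A is the more spread out of the two.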
2. Given the data below, solve for the sample size and solve for the no. of
members to be taken from each subset. 10 pts.
3. Given the scores below in the even- and odd-numbered items, determine the
reliability of the given test using the appropriate method. 10 pts.
Odd Numbers Even Numbers
8 6
9 7
10 6
6 8
7 7
5 6
6 8
7 3
5. Given the data below, solve for the mean, median, mode, range, and standard
deviation
82 85 90 94 92 88 80 78 86 85 68 85
6. The following table shows the final grades of ten students in Basic Math and Algebra.
Basic Math (X) 82 78 86 72 91 80 95 72 89 74
Algebra (Y) 75 80 93 65 87 71 98 68 84 77
a. Find the coefficient of correlation using Pearson r and Spearman rho rank
correlation and interpret the result.
b. Solve for the coefficient of correlation and coefficient of determination and
interpret.
c. Test for the significance of r
d. Draw the scatter diagram
e. Predict grade in Algebra if grade in Basic Math is 93
7. An English spelling test was given to a random sample of urban and rural
children to infer possible effects of media on children’s spelling abilities. Test
which group spells better given their scores below.
URBAN 48 36 23 32 35 60
RURAL 21 36 40 18 29 16
9. MULTIPLE CHOICE: Choose the letter of the correct answer. 1 pt. each
1. It defines where and when the study is conducted and who the subjects are.
a. title b. rationale c. scope and delimitation
2. It is an expected answer to a problem.
a. title b. hypotheses c. rationale
3. It is a definition of a term indicating the meaning of the term in the study.
a. operational b. conceptual c. descriptive
4. It discusses the contributions of the study to the field of knowledge.
a. rationale b. importance of the study c. assumptions
5. It reflects the general problem.
a. rationale b. title c. scope and delimitation
6. It describes the existing and prevailing situation.
a. rationale b. title c. scope and delimitation
7. It is a part in which key terms of research are clearly defined.
a. scope and delimitation b. definition of terms c. rationale
8. It is a definition of a term found in the dictionary.
a. operational b. conceptual c. descriptive
9. It is a type of variable which is considered presumed cause.
a. independent b. dependent c. moderator
10. It is a type of variable which is considered presumed effect.
a. independent b. dependent c. moderator
11. TRUE OR FALSE: Read each statement below. If it is correct, write true.
Otherwise, write false. RIGHT MINUS WRONG. 1 pt. each
1. The questionnaire method can be used with those who are illiterate.
2. In constructing a questionnaire, make directions unequivocal.
3. There is no anonymity in interview method.
4. It is alright to argue in an interview.
5. Interview method is the superior technique of collecting information.
6. Validity means a test measures what it intends to measure.
7. Reliability of a test means accuracy, stability, repeatability or
consistency of test results.
8. Standard tests are valid and reliable.
12. Given the research paradigm below, construct the questions which will
complete the Statement of the Problem of the Study. 15 pts
Given the same research paradigm, construct an appropriate title for the study. 5
pts.