Stat File

MEANING OF STATISTICS/RESEARCH
What is Statistics?
The term statistics has several meanings. It can be used in a singular or a plural sense. In its singular
sense, it refers to the branch of mathematics which deals with the systematic collection,
presentation, analysis and interpretation of quantitative data.
1. Collection of Data - refers to data gathering using one or a combination of the following
methods: interview, questionnaire, registration or observation.
2. Presentation of Data - refers to the organization of data into tables, graphs or charts so
that the reader will be able to get a clear picture of the various relationships.
3. Analysis of Data - refers to the process of extracting relevant information from the given
data.
4. Interpretation of Data - refers to the task of drawing conclusions from analyzed data.
In its plural sense, statistics refers to a set of related quantitative data, or to numerical
values computed from a set of data. The average height of a group of freshmen of school X is a
statistic, for it is derived from the data on the heights of those freshmen. From the same group
of freshmen, the average weight can also be obtained from their data on weight; this is also a
statistic. Taken together, the average height and the average weight are called statistics.
What is Research?
Research is defined as a “careful, critical and exhaustive investigation to discover new facts
which will test a hypothesis, revise accepted conclusions or contribute positive values to society
in general.” (Hildreth Hoke McAshan)
The research process includes collecting and processing data to arrive at answers to questions
identified in the investigation. Research is linked to statistics because statistics is a tool of
research.
USES OF STATISTICS
• The use of statistics is spread through all fields, namely, fisheries, agriculture, commerce,
trade and industry, health, education, nursing, medicine, biology, economics, psychology,
sociology, engineering, chemistry, physics and many others.
• It is said that statistics is the "tool" of all sciences. It is called the "language of research."
• In the field of fisheries, statistics is used in the analysis and interpretation of experimental
data. The weight and length relationship of fish cultured in controlled and semi-controlled
environments using different supplemental feeds is determined through statistics. The
acceptability, nutritive values and economics of processed foods using different fish
processing methods and techniques are similarly determined. The significant differences
in the quality attributes, i.e., color, odor, flavor, and texture, of these processed fishery
products become clear through statistics.
• Statistics is widely used in the field of agriculture. Statistical treatment is needed in the
analysis and interpretation of data from agricultural experiments as well as in agricultural economy.
• In commerce, trade and industry, statistical techniques are of vital importance in the
planning, production and marketing of commodities, prices, costs, and profits. The
statistical results serve as a basis for making policies on efficient management.
• In education, statistics is a vital tool in evaluating the achievements of the students and
the performance of mentors, staff and administrators. Statistical results serve as a basis for
the promotion and retention of students. Statistical treatment also determines the
effectiveness or ineffectiveness of instruction, research, extension and production.
• In health, nursing and medicine, statistics is an indispensable tool. Determination of the
effectiveness of treatment is based on a collection of records of clinical trials devised on
such a scale and in such a form that valid conclusions can be drawn. Also, nurses and
physicians have a better understanding of nursing and medical research journals,
respectively, if they have knowledge of statistical methods.
• Statistical techniques are of vital importance in evaluating, analyzing, and interpreting
results of biological experiments.
• In economics, the supply and demand of commodities need statistical analysis and
interpretation for better understanding.
• In psychology, the scores in personality tests, intelligence tests, aptitude tests, prognostic
tests, diagnostic tests and many others have to be analyzed and interpreted for better
understanding of an individual.
• In sociology, statistical tools are used to determine social problems. The problems and
needs of society can be determined and addressed by analyzing and interpreting the
observations made by the people living in a particular society.
• In chemistry and physics, statistical analysis and interpretation of experimental data
are needed to arrive at valid and reliable results.
Importance of STATISTICS in RESEARCH
Statistics plays a vital role in research. Practically, no research can be complete without statistics.
Even in anthropological studies, the use of statistics, though minimal, is unavoidable. Some of the
uses of statistics in research are the following:
1. Statistics helps the researcher in making his research design. This is especially true in
experimental research. When an investigator designs his research project, he most likely asks
himself the following questions: What statistical data have to be collected? What sampling
technique has to be used? What statistical methods have to be employed in the treatment of the
data? Unless these questions are answered satisfactorily, the design cannot be complete.
2. Statistical techniques help the researcher in determining the validity and reliability of his
research instruments. Validity and reliability are important characteristics of instruments used for
gathering data. Statistics is utilized to make research instruments valid and reliable.
3. Statistical manipulations organize raw data systematically to make them appropriate for study.
Raw data, unless organized in some way, have very little or no meaning at all. They must be
organized systematically so that they can be studied and inferences can be drawn out of them.
Setting them in tables is one example.
4. Statistical treatments give meaning and interpretation to raw data and hence, are used to test
the hypotheses. Statistical methods are usually used to determine whether the hypotheses which
are drawn up at the beginning of the study are true or not.
5. Statistical methods determine the levels of significance of the research findings. In many
studies, the levels of significance of findings are indispensable in drawing up inferences,
conclusions and other forms of generalizations.
All the facts stated above make statistics a very essential part of research.
DESCRIPTIVE AND INFERENTIAL STATISTICS
Descriptive Statistics and Inferential Statistics as major areas of Statistics

Descriptive Statistics is concerned only with summarizing values that describe the characteristics
of data gathered from a sample or population. It employs graphs, tables, and measures of average,
position and variability. It does not attempt to draw conclusions about anything beyond the data
themselves.
Inferential Statistics is concerned with making generalizations from a small group of
observations or from a sample to a bigger group of observations or to a population.
Example:
The tastes of two or three pieces of lanzones fruit from a basket are descriptive of these two or three
pieces. However, if we say that all the other pieces of lanzones fruit in the basket may have
the same taste as these two or three, then this is the concern of inferential statistics.
TYPES OF STATISTICS
What are the types of Statistics?

The field of statistics may be divided into descriptive and inferential statistics.
Descriptive statistics is only concerned with summarizing values to describe group
characteristics of the data after gathering, classifying, and presenting of data. To do this, it
employs graphs, tables and frequency distributions, percentages, measures of central tendency
and position, and measures of variability. It does not need to generalize or make conclusions.
Inferential statistics, in contrast, calls for higher-order critical thinking and
judgment, and it needs more complex mathematical procedures. Its aim is to give generalization,
conclusion or information regarding large groups of data called the population without
necessarily dealing with each and every element of these groups. It only uses a small portion of
the total set of data or only a representative portion called a sample to give conclusions or
generalizations regarding the entire population.
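The descriptive measures named above can be sketched in a few lines of Python. This is only an illustration: the height values are hypothetical, and the standard `statistics` module stands in for hand computation.

```python
import statistics

# Hypothetical heights (cm) of a small sample of freshmen; illustrative values only.
heights = [150, 152, 155, 155, 158, 160, 162]

# Descriptive statistics: summarize the data at hand without generalizing.
summary = {
    "n": len(heights),
    "mean": statistics.mean(heights),          # measure of central tendency
    "median": statistics.median(heights),      # measure of position
    "mode": statistics.mode(heights),          # most frequent value
    "range": max(heights) - min(heights),      # simple measure of variability
    "sample_stdev": statistics.stdev(heights), # sample standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Everything computed here describes only these seven values. Using the sample mean as an estimate of the average height of all freshmen would be an inferential step.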
To do this, it uses either parametric or nonparametric statistics. Parametric statistics are
inferential techniques which make the following assumptions regarding the nature of the
population from which the observations or data are drawn:
1. The observations must be independent. This means that in choosing any element from
the population to be included in the sample, it must not affect the chances of other
elements for inclusion.
2. The samples must be drawn from normally distributed populations. A crude way
of checking that a distribution is normal is to see whether the mean, median and mode are all
equal (mean = median = mode). If we draw the curve, we get a bell-shaped
curve which has an area of one and is symmetrical about the mean.
3. If we analyze two groups/populations, these populations must have the same variance;
such populations are called homoscedastic.
4. The variables must be measured in the interval or ratio scale, so that we can interpret
the results.
Non-parametric statistics, on the other hand, make fewer and weaker assumptions:
1. The observations must be independent and the variable must have underlying continuity.
2. The observations are measured in either the nominal or ordinal scales.
To have a better understanding of when to use parametric and non-parametric statistics,
please refer to the table below:
Inferential Techniques       Distribution                   Measurement
Parametric Statistics        Normal                         Interval or Ratio
Non-Parametric Statistics    Unknown or any distribution    Nominal or Ordinal
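The decision rule in the table can be written as a small helper. This is a minimal sketch; the function name `choose_technique` and its string arguments are assumptions made for illustration, not part of any library.

```python
def choose_technique(distribution: str, scale: str) -> str:
    """Return 'parametric' or 'non-parametric' following the table's rule.

    Hypothetical helper: parametric techniques are suggested only when the
    distribution is normal AND the scale is interval or ratio.
    """
    distribution = distribution.lower()
    scale = scale.lower()
    if scale not in {"nominal", "ordinal", "interval", "ratio"}:
        raise ValueError(f"unknown measurement scale: {scale}")
    if distribution == "normal" and scale in {"interval", "ratio"}:
        return "parametric"
    return "non-parametric"


print(choose_technique("normal", "ratio"))     # parametric
print(choose_technique("unknown", "ordinal"))  # non-parametric
```

The helper checks only distribution and scale. The other assumptions listed above, independence and equal variances, would still have to be verified separately.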
POPULATION AND SAMPLE as sources of data for the research/investigation
Population is an aggregate or a set of all units/cases (may be people, things, events, etc.) being
studied that have at least one common characteristic.
Example:
1. The total number of carabaos in Barangay X.
2. All students of Notre Dame of Midsayap College during the second semester of SY 2007-
2008.
Sample is a subset of units/cases drawn or taken from a population.
Example:
1. Some carabaos in Barangay X.
2. Some students of Notre Dame of Midsayap College during the second semester of SY
2007-2008.
PARAMETER AND STATISTIC as descriptive measures in a research/investigation
Parameter is a characteristic of a population.

Example:
1. The average age of all the carabaos in Barangay X is 10 years.
2. The grade-point average of all students of Notre Dame of Midsayap College during the
second semester of SY 2007 – 2008.
Statistic is a characteristic of a sample.
Example:
1. The average age of some carabaos in Barangay X is 9.5 years.
2. The grade-point average of some students of Notre Dame of Midsayap College during the
second semester of SY 2007 – 2008.
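The distinction between a parameter and a statistic can be illustrated with a short sketch. The ages below are hypothetical, chosen so that the population average is 10 years as in the carabao example, and the fixed seed only makes the draw repeatable.

```python
import random
import statistics

# Hypothetical ages (years) of ALL 12 carabaos in a barangay (the population).
population_ages = [8, 9, 9, 10, 10, 10, 11, 11, 12, 9, 10, 11]

# Parameter: computed from every unit of the population.
parameter = statistics.mean(population_ages)

# Statistic: computed from a sample (here 5 carabaos drawn at random).
rng = random.Random(0)  # fixed seed, an assumption for reproducibility
sample_ages = rng.sample(population_ages, 5)
statistic = statistics.mean(sample_ages)

print("parameter (population mean):", parameter)
print("statistic (sample mean):   ", statistic)
```

The parameter is a single fixed value, while the statistic changes from one sample to another and only approximates the parameter.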
CENSUS AND SURVEY as processes of investigation
Census is the process of gathering information from all the units of a population.
Sample survey is the process of obtaining data from only a part of the population.
CONSTANT AND VARIABLES
Constant is a quantity that takes on a single fixed numerical value; it does not change or
show differences in value.
Variable is a trait, attribute or property of things, persons or places that changes in quality,
quantity or magnitude.
Examples of variables are height, length, age, methods of teaching, efficiency.
ASSUMPTION and HYPOTHESIS
Assumption is a statement that is accepted as true without proof.

Hypothesis is a belief; a conjecture; a tentative theory or supposition provisionally adopted to
account for certain facts.
KINDS OF VARIABLES
Variables can be assigned numerical values called variates. The kinds of variables depend on the
kind of numbers they can be assigned.
1. Discrete or categorical variable - can be assigned only counting numbers as variates.
Example:
Household size
Number of times you visit your dentist per year
2. Continuous variable - can be assigned counting numbers, fractions or decimals, which are
the results of measurement, as variates.
Example: Age, height, weight
TYPES OF DATA
Data are the values that are processed or computed statistically. They are the results of counting,
measurement or observation of variables.
1. Qualitative Data
These refer to attributes or characteristics of a population or a sample. These are facts for
which no numerical measure exists. They are usually expressed as categories. (Example:
Color of the skin, sex, religion)
2. Quantitative Data
These are the results of counting or measurement.
(Example: Monthly salary, grade, number of units enrolled)
MEASUREMENT SCALES/LEVELS OF MEASUREMENTS
It is necessary to give attention to the different levels of measurement, especially when
contemplating the use of statistics. The measurement scale is an important factor in determining
the appropriate statistical methods to be used in analyzing the data of a particular research study.
Measurement is classified into nominal scale, ordinal scale, interval scale, and ratio scale.
Nominal Scale is the first and the lowest level of measurement. It is merely grouping or
classifying different objects into categories based upon some defined characteristics without
paying attention to order or arrangement. Following the identification of the various categories,
frequencies or the number of objects in each category are counted.
The properties of nominal data are as follows:
1. The data categories are mutually exclusive (an object can belong to only one category).
2. The data categories have no logical order or arrangement.
There are two ways of classifying: the one-way classification and the two-way classification.
Example:
Variable      Categories    Data
Sex           Male – 1      No. of males in class – 20
              Female – 2    No. of females in class – 25
Hair Color                  No. of adults with:
              Black – 1     Black colored hair – 6
              Brown – 2     Brown colored hair – 10
              White – 3     White colored hair – 15
Example of a one-way classification:

Example No. 1: Students may be classified according to College.
COLLEGE FREQUENCY
College of Arts and Sciences 50
College of Agriculture 100
College of Veterinary Medicine 45
College of Education 75
Example No. 2: Responses to a questionnaire.

RESPONSE FREQUENCY
Strongly Agree 50
Agree 30
Moderately Agree 20
Disagree 10
Strongly Disagree 10
In the two-way classification, an individual is classified twice. For example, Peter is
classified as male under sex and, at the same time, as Yes, Neutral or No according to
his response.
Example of two-way classification:
Example 1.
SEX YES NEUTRAL NO TOTAL
Male 20 10 30 60
Female 45 10 20 75
Total 65 20 50 135
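The two-way table above can be reproduced by counting individual records on both variables at once. The sketch below rebuilds the 135 records from the cell counts in the table and tallies them with `collections.Counter`. The record layout is an assumption for illustration.

```python
from collections import Counter

# Cell counts taken from the two-way table (sex by response).
cell_counts = {
    ("Male", "Yes"): 20, ("Male", "Neutral"): 10, ("Male", "No"): 30,
    ("Female", "Yes"): 45, ("Female", "Neutral"): 10, ("Female", "No"): 20,
}

# One (sex, response) record per respondent, e.g. Peter -> ("Male", "Yes").
records = [pair for pair, count in cell_counts.items() for _ in range(count)]

two_way = Counter(records)                            # joint classification
row_totals = Counter(sex for sex, _ in records)       # one-way: by sex
column_totals = Counter(resp for _, resp in records)  # one-way: by response

print(dict(row_totals))     # totals per sex
print(dict(column_totals))  # totals per response
print(len(records))         # grand total
```

Each margin of the two-way table is itself a one-way classification of the same respondents.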
The ordinal scale is the second level of measurement. Here, the categories have a logical order or
arrangement, aside from being mutually exclusive. Measurement is the
same as in the nominal scale, where the number of objects in each category is counted. However, we
can discern which category is the highest or the lowest. For example, for rank in the military, we know that
private < corporal < sergeant < lieutenant, etc.
Example:
RANK FREQUENCY
Private 20
Corporal 15
Sergeant 10
Lieutenant 25
Example:
Variable           Categories                       Data
Economic Status    Rich                             No. of rich people in town X – 2
                   Poor                             No. of poor people in town X – 200
Cause of death                                      No. of people who die of:
                   Heart Disease (1st)              Heart disease – 100
                   Cancer (2nd)                     Cancer – 80
                   Cerebrovascular Disease (3rd)    Cerebrovascular disease – 50
Frequency                                           No. of responses:
                   Often (3)                        Often – 3
                   Sometimes (2)                    Sometimes – 6
                   Never (1)                        Never – 8
The following are the properties of ordinal data:

1. Data categories are mutually exclusive.
2. Data categories have a logical order.
3. Data categories are scaled according to the amount of the particular characteristics they
possess.
Interval scale is the third level of measurement. It possesses all the properties of the
preceding scales, plus one additional property: the differences
between adjacent levels or categories are equal on any part of the scale.
A common variable measured on an interval scale is temperature. The difference between
temperatures of 65 and 68 is regarded as the same as the difference between temperatures of 13 and 16. Here, zero
is just another point on the scale. It does not mean that there is no temperature. In fact, on the
Celsius scale it is the freezing point of water.
Example:
• Temperature reading – the zero-temperature reading is artificial because it does not
represent the total absence of heat.
• Attitude scale
• IQ score
The properties of interval data are as follows:

1. Data categories are mutually exclusive.
2. Data categories have a logical order.
3. Data categories are scaled according to the amount of the characteristics they possess.
4. Equal difference in the characteristics is represented by equal difference in the numbers
assigned to the categories.
5. The point zero is just another point in the scale.
Ratio scale is the highest level of measurement. All properties of the interval scale are
applicable in the ratio scale plus one additional property which is known as the "true zero point"
which reflects the absence of the characteristics measured.
For example, if the teacher in statistics gives a quiz and a student gets zero, then the student
has no correct answer (score = 0).
Example:
• Scores in the test
• Height
• Weight
In Summary:
• The nominal scale categorizes without order.
• The ordinal scale categorizes with order.
• The interval scale categorizes with order and establishes an equal unit in the scale.
• The ratio scale categorizes with order, establishes an equal unit in the scale, and contains a
true zero point.
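The cumulative nature of the four scales can be sketched as a small lookup. The class `Scale` and the function `identify_scale` are hypothetical names used only for this illustration. Each scale is described by the three properties in the summary: order, equal units, and a true zero point.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scale:
    name: str
    ordered: bool      # categories have a logical order
    equal_units: bool  # equal differences mean equal amounts
    true_zero: bool    # zero means absence of the characteristic


# Each level keeps the properties of the one before it and adds one more.
SCALES = [
    Scale("nominal", ordered=False, equal_units=False, true_zero=False),
    Scale("ordinal", ordered=True, equal_units=False, true_zero=False),
    Scale("interval", ordered=True, equal_units=True, true_zero=False),
    Scale("ratio", ordered=True, equal_units=True, true_zero=True),
]


def identify_scale(ordered: bool, equal_units: bool, true_zero: bool) -> str:
    """Return the name of the scale with exactly these properties."""
    for scale in SCALES:
        if (scale.ordered, scale.equal_units, scale.true_zero) == (
            ordered, equal_units, true_zero
        ):
            return scale.name
    raise ValueError("combination does not match any of the four scales")


print(identify_scale(True, True, False))  # temperature reading -> interval
print(identify_scale(True, True, True))   # height or weight -> ratio
```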
SAMPLING DESIGNS/TECHNIQUES
WHY SAMPLE?
In many fields of investigation, the researcher may use one of the two research designs illustrated
below.
Research Design I          Research Design II
Population                 Sample
Descriptive Statistics     Descriptive Statistics
                           Inferential Statistics
Parameter                  Statistics
Research Design I requires all the units under investigation, known as the population.
The method of gathering the facts of interest on every unit of the population is called a census. It
is well known that it is not always possible to get timely, accurate and economical data by the use
of a census. The method that is widely used nowadays is sampling.
Sampling is the process of selecting a part, called a sample, from a given population with
the ultimate goal of making generalizations about unknown characteristics of the given population.
This is shown in Research Design II.
ADVANTAGES OF SAMPLING
1. Sampling enables the investigation of a large population.
When the population is too big, then it is almost impossible to collect data from all the
elements of the population.
2. Sampling reduces cost.
3. Sampling enables the completion of the study within a reasonable period of time.
4. Sampling avoids consuming all the sources of data.
SAMPLE SIZE DETERMINATION

There are many ways of determining the sample size, one of which is the
formula of Slovin (1960):
$$n = \frac{N}{1 + N e^{2}}$$
Where:
n = sample size
N = population size
e = desired margin of error (any percent value less than or equal to 10%)
Example:
If the population consists of the College of Health Sciences students enrolled this
semester with a size of 1600, what could be a good sample size for a survey involving these
students?
Solution.
Given: N = 1600
e = 5% (a number within the range of 1% to 10%)
Required: n = sample size
Equation:
$$n = \frac{N}{1 + N e^{2}} = \frac{1600}{1 + [1600 \times (0.05)^{2}]} = \frac{1600}{1 + (1600 \times 0.0025)} = \frac{1600}{1 + 4} = \frac{1600}{5} = 320$$
Conclusion:
A group of 320 students from the CHS constitutes the sample.
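The computation above can be checked with a short function. This is a minimal sketch: the name `slovin` is chosen here for illustration, exact fractions are used to avoid floating-point drift, and rounding up to the next whole unit is an assumption, since the source example comes out as a whole number.

```python
import math
from fractions import Fraction


def slovin(population_size: int, margin_of_error: float) -> int:
    """Sample size n = N / (1 + N * e**2), rounded up to a whole unit."""
    if population_size <= 0:
        raise ValueError("population size must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin of error must be a proportion, e.g. 0.05")
    e = Fraction(str(margin_of_error))  # exact value of e, e.g. 1/20
    n = Fraction(population_size) / (1 + population_size * e**2)
    return math.ceil(n)


print(slovin(1600, 0.05))  # 320, matching the worked example
```

A smaller margin of error gives a larger sample. With the same population of 1600, e = 0.01 yields n = 1600 / 1.16, which rounds up to 1380.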
Always remember, however, that the assumption of a normal distribution of the
population should be considered. When the normal approximation of the population is
poor, this sample size formula does not apply.
Gay (1976) offers some minimum acceptable sizes depending on the type of research as
follows:
a. Descriptive research – 10% of the population. For smaller populations, a minimum of
20% may be required.
b. Correlational research – 30 subjects.
c. Ex post facto or causal comparative research – 15 subjects per group.
d. Experimental research – 15 subjects per group. Some authorities believe that 30 per
group should be considered minimum.
SAMPLING TECHNIQUES
Once the sample size is determined and the list of population elements is available, the
next question to answer is "How could the sample elements be selected from the population
elements?"
The basic principle to remember in the process of selecting is that "the sample elements
should truly represent the population elements". This means that the sample
may not capture every characteristic of the population, but the sample elements
should not contain any characteristic not found in the population elements.
1. Probability Sampling Techniques

These are techniques that allow every element of the population an equal chance of being
selected as a sample element. The selection may be done using:
1. Random Sampling. This is the method of selecting a sample of size (n) from a universe (N)
such that each member of the population has an equal chance of being included in the
sample and all possible combinations of size (n) have an equal chance of being
selected as the sample. There are several ways of drawing sample units at random:
a. Lottery Sampling
b. Table of Random Numbers
c. Use of Calculators
a. Lottery Sampling. The lottery sampling method is usually carried out by assigning
numbers to each member of the population. For example, we may write down the name
of each member of the population on a piece of paper. These papers are then placed in a
box or lottery drum. The box or lottery drum must be shaken thoroughly to prevent
some pieces of paper from settling at the bottom, where they will have less chance of
being drawn. From the box or lottery drum, the required number of sample units is
picked.
b. Table of Random Numbers. The use of the Table of Random Numbers is another
example of random sampling. Under this technique, the selection of each member of the
population is left entirely to chance, and every member of the population has an equal
chance of being chosen.
c. Use of Calculators. Some calculators have a key labelled RAN that gives random
numbers. The numbers that appear when this key is pressed have three decimal places. If
the population is less than a hundred, take the first two digits after the decimal
point. The resulting numbers have either one or two digits. Consider the following example:
Ran#     Interpretation
0.185    18
0.284    28
0.678    67
0.726    72
0.410    41
0.014    1
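Both ideas above can be sketched in Python: a software "lottery" using the standard `random` module, and the two-digit reading of a calculator display. The function names and the optional seed are assumptions for illustration. The display is handled as text so that the digits are taken exactly as shown.

```python
import random


def lottery_sample(population, n, seed=None):
    """Draw n distinct units; every unit has an equal chance of selection."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)


def two_digit_number(ran_display: str) -> int:
    """Read the first two decimal digits of a display such as '0.185'."""
    decimals = ran_display.split(".")[1]
    return int(decimals[:2])


# Calculator readings from the table above.
readings = ["0.185", "0.284", "0.678", "0.726", "0.410", "0.014"]
print([two_digit_number(r) for r in readings])  # [18, 28, 67, 72, 41, 1]

# Lottery draw of 10 units from a hypothetical numbered population of 50.
print(lottery_sample(range(1, 51), 10, seed=1))
```

With the calculator method, a number that repeats or falls outside the population list would normally be skipped and another one generated. `random.sample` handles both cases automatically because it draws without replacement from the list itself.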
2. Systematic Sampling. This method uses prior knowledge of the individuals comprising a
universe with the end in view of increasing the precision and representativeness of samples.
When sample units are obtained by drawing every, say, 4th or 7th or 10th item on a list, the
process of selecting the sample is called systematic sampling.
To get the sampling interval K, we use $K = \frac{N}{n}$. We usually pick a number from 1 to K for a random
start. All other sample numbers are readily obtained by adding K to the previous number.
Example:
Suppose a sample of 75 students is to be chosen from the population of 325 students.
Identify the sample numbers by systematic sampling with a random start.
Solution:
Step 1: $K = \frac{N}{n} = \frac{325}{75} = 4.33 \approx 4$
Step 2: Pick a starting point using lottery method (random start).
Step 3: Add K repeatedly to get the succeeding elements. (Example: with a random start of 3, 3+4=7, 7+4=11, 11+4=15.)
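The three steps can be sketched as a short function. The name `systematic_sample` is hypothetical, and rounding the interval down (4.33 to 4) follows the worked example, which also keeps every selected number within the population list.

```python
import random


def systematic_sample(N: int, n: int, start=None, seed=None):
    """Return n list positions (1-based) chosen at a fixed interval K."""
    k = N // n  # Step 1: sampling interval, rounded down
    if k < 1:
        raise ValueError("sample size cannot exceed population size")
    if start is None:
        # Step 2: random start between 1 and K (a software lottery draw).
        start = random.Random(seed).randint(1, k)
    if not 1 <= start <= k:
        raise ValueError("random start must lie between 1 and K")
    # Step 3: keep adding K to the previous number.
    return [start + i * k for i in range(n)]


positions = systematic_sample(325, 75, start=3)
print(positions[:4])   # [3, 7, 11, 15]
print(len(positions))  # 75
```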
3. Stratified Sampling. In this method the population is first divided into groups (strata)
based on homogeneity, in order to avoid the possibility of drawing samples whose
members come only from one stratum.
Stratified sampling is often used in polls of public opinion in order to secure
representative proportions of opinions coming from various classes of people.
Classifications may be based on districts, socio-economic status, sex, work, etc.
depending on the problem being studied.
The sample size per stratum is obtained using the formula:

$$\text{sample size per stratum} = \frac{\text{subpopulation}}{\text{population}} \times \text{desired sample size}$$
Example: A stratified sample of size n=500 is to be taken from a population size of N=4000,
which consists of three strata of size N1=2000, N2=1,200 and N3=800. If the allocation is to be
proportional, how large a sample must be taken from each stratum?
Solution:
$$n_1 = \frac{2000}{4000} \times 500 = 250, \quad n_2 = \frac{1200}{4000} \times 500 = 150, \quad n_3 = \frac{800}{4000} \times 500 = 100$$
$$n_1 + n_2 + n_3 = 250 + 150 + 100 = 500 = \text{desired sample size}$$
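Proportional allocation can be sketched as follows. The function name `proportional_allocation` is hypothetical, and rounding each share to the nearest whole unit is an assumption, since the shares in the example are already whole numbers.

```python
def proportional_allocation(strata_sizes, desired_sample_size):
    """Sample size per stratum = subpopulation / population * desired sample size."""
    population = sum(strata_sizes)
    return [
        round(size * desired_sample_size / population)
        for size in strata_sizes
    ]


allocation = proportional_allocation([2000, 1200, 800], 500)
print(allocation)       # [250, 150, 100]
print(sum(allocation))  # 500
```

When the shares are not whole numbers, rounding can make the total differ slightly from the desired sample size, so the total should be checked and one stratum adjusted if needed.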
4. Cluster Sampling. The cluster sample is sometimes referred to as an area sample
because it is frequently applied on a geographical basis. On this basis, districts or blocks
of a municipality or city are selected. These districts or blocks constitute the clusters.
Cluster sampling is useful in selecting the sample when blocks in a community or city are
occupied by heterogeneous groups. For example, if a community in Manila has lower-,
middle-, and upper-income residents living side by side, we may use this community as a
source of a sample to study the different socio-economic groups in Manila. By
concentrating on this particular area, we can save more time, effort and money than if we
covered different communities throughout Manila.
A cluster is an intact group possessing a common characteristic. An example is a
population consisting of all the 800 nurses in 20 hospitals in a large city. Illustrate how
the desired sample size of 200 nurses could be selected given this population.
Solution:
Number of hospitals, N = 20
Average number of nurses per hospital, X = 40
Desired sample size, n = 200
Required clusters, $y = \frac{n}{X} = \frac{200}{40} = 5$ clusters
Select the needed number of clusters. By using the Table of Random Numbers, select the 5
hospitals from the population list of 20 hospitals. Include all the members in the selected
clusters. Since there is an average of 40 nurses per hospital and we shall only use 5 hospitals, our
sample size of 200 nurses is completed.
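The cluster computation and selection can be sketched as below. The function names are hypothetical, `random.sample` stands in for the Table of Random Numbers, and rounding the number of clusters up is an assumption for cases where the division is not exact.

```python
import math
import random


def clusters_needed(desired_sample_size: int, average_cluster_size: float) -> int:
    """Required clusters y = n / X, rounded up to a whole cluster."""
    return math.ceil(desired_sample_size / average_cluster_size)


def select_clusters(cluster_ids, how_many, seed=None):
    """Pick whole clusters at random; all members of each are included."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(cluster_ids), how_many))


y = clusters_needed(200, 40)
print(y)  # 5

# Hospitals numbered 1 to 20; the seed only makes the draw repeatable.
print(select_clusters(range(1, 21), y, seed=42))
```

Because whole hospitals are taken, the realized sample equals 200 nurses only when the selected hospitals actually average 40 nurses each.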
2. Non-Probability Sampling Techniques

1. Purposive Sampling. This is based on certain criteria laid down by the researcher.
People who satisfy the criteria are interviewed. A researcher might want to find out, for
example, the reaction of the banking community to a particular Central Bank circular.
Instead of interviewing the executives of all banks, he purposely can choose to interview
the key executives of only the five biggest banks in the country if he believes that it is the
reaction of these big ones that counts anyway. Of course, the answers obtained through
this procedure are not representative of the entire banking system. Or a researcher may
want to find out whether the production of "burong talangka” conforms to the minimum
standards of health and safety. There are several small and medium-scale producers of
this product. However, to get a complete listing of the producers would be rather difficult.
What the researcher can do is to study and analyze only the two major producers of this
product.
2. Quota Sampling. This is a relatively quick and inexpensive method to operate. Each
interviewer is given definite instructions about the section of the public he is to question,
but the final choice of the actual persons is left to his own convenience or preference, and
is not predetermined by some carefully operated randomizing plan. Each interviewer then
proceeds to fill the prescribed quota. As the following example will show, this method
has its pitfalls. Suppose there is a survey to estimate what percent of the population of
Quezon City consider basketball as one of their favorite sports. One interviewer might
report that 100 percent of his quota of 70 people are basketball fans. However, it may
later be found out that this interviewer reached his quota by going to the Araneta
Coliseum to enjoy watching his favorite team compete and at the same time interview
some thrilled viewers during the game.
3. Convenience Sampling. A researcher might want to find out the popularity of a radio
program. Since the researcher has a telephone, he might simply use it and "randomly"
pick his samples from the telephone directory. This method, of course, is biased against
non-telephone users. Or a researcher might want to find out whether the production of
"bola-bola" or fish balls conforms to the minimum standards of health and safety. There
are hundreds of ambulant peddlers of this product. Thus, it is impossible for the
researcher to make a complete list, much less to interview all the producers and test all
their products. So, what the researcher can just do is to get samples of the product, say,
from the fish ball peddler near his school or near his residence.