What is Statistics?
The term statistics has several meanings. It may be used in a singular or a plural sense. In its singular
sense, it refers to the branch of mathematics which deals with the systematic collection,
presentation, analysis and interpretation of quantitative data.
1. Collection of Data - refers to data gathering using one or a combination of the following
methods: interview, questionnaire, registration or observation.
2. Presentation of Data - refers to the organization of data into tables, graphs or charts so
that the reader will be able to get a clear picture of the various relationships.
3. Analysis of Data - refers to the process of extracting relevant information from the given
data.
4. Interpretation of Data - refers to the task of drawing conclusions from analyzed data.
In its plural sense, statistics refers to a set of related quantitative data, or to numerical
values computed from a set of data. The average height of a group of freshmen of school X is
a statistic, for it is derived from the data on heights of that group. From the same set of
freshmen, their average weight can also be obtained from their data on weight; this is also a
statistic. Taken together, the average height and the average weight are called statistics.
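The distinction between a statistic and statistics can be illustrated with a short sketch. The heights and weights below are hypothetical values chosen only for illustration; they do not come from any real school.

```python
from statistics import mean

# Hypothetical measurements for five freshmen (illustrative values only).
heights_cm = [150.0, 155.0, 160.0, 165.0, 170.0]
weights_kg = [45.0, 50.0, 55.0, 60.0, 65.0]

# Each average is a single value computed from the data: a statistic.
average_height = mean(heights_cm)
average_weight = mean(weights_kg)

# Taken together, the two values are statistics (plural sense).
summary = {"average_height_cm": average_height, "average_weight_kg": average_weight}
print(summary)
```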
What is Research?
Research is defined as a “careful, critical and exhaustive investigation to discover new facts
which will test a hypothesis, revise accepted conclusions or contribute positive values to society
in general” (Hildreth Hoke McAshan).
The research process includes collecting and processing data to arrive at answers to questions
identified in the investigation. Research is linked to statistics because statistics is a tool of
research.
USES OF STATISTICS
The use of statistics is spread through all fields, namely, fisheries, agriculture, commerce,
trade and industry, health, education, nursing, medicine, biology, economics, psychology,
sociology, engineering, chemistry, physics and many others.
It is said that statistics is the "tool" of all sciences. It is called the "language of research."
In the field of fisheries, statistics is used in the analysis and interpretation of experimental
data. The weight and length relationship of fish cultured in controlled and semi-controlled
environments using different supplemental feeds is determined through statistics. The
acceptability, nutritive values and economics of processed foods using different fish
processing methods and techniques are similarly determined. The significant differences
in the quality attributes, i.e., color, odor, flavor, and texture, of these processed fishery
products become clear through statistics.
Statistics is widely used in the field of agriculture. Statistical treatment is needed in the
analysis and interpretation of data in their experiments as well as in agricultural economy.
In commerce, trade and industry, statistical techniques are of vital importance in the
planning, production and marketing of commodities, prices, costs, and profits. The
statistical results serve as basis for making policies on efficient management.
In education, statistics is a vital tool in evaluating the achievements of the students and
the performance of mentors, staff and administrators. Statistical results serve as basis for
promotion and retention of students. Statistical treatment determines also the
effectiveness and ineffectiveness of instruction, research, extension and production.
In health, nursing and medicine, statistics is an indispensable tool. Determination of the
effectiveness of treatment is based on a collection of records of clinical trials devised in
such a scale and such form that valid conclusions can be drawn. Also, nurses and
physicians have a better understanding of nursing and medical research journals,
respectively, if they have knowledge of statistical methods.
Statistical techniques are of vital importance in evaluating, analyzing, and interpreting
results of biological experiments.
In economics, the supply and demand of commodities need statistical analysis and
interpretation for better understanding.
In psychology, the scores in personality tests, intelligence tests, aptitude tests, prognostic
tests, diagnostic tests and many others have to be analyzed and interpreted for better
understanding of an individual.
In sociology, statistical tools are used to identify social problems. The problems and
needs of a society can be determined and addressed by analyzing and interpreting the
observations made by the people living in that society.
In chemistry and physics, statistical analysis and interpretation of experimental data
are needed to arrive at valid and reliable results.
Importance of STATISTICS in RESEARCH
Statistics plays a vital role in research. Practically no research can be complete without statistics.
Even in anthropological studies, the use of statistics, though minimal, is unavoidable. Some of the
uses of statistics in research are the following:
1. Statistics helps the researcher in making his research design. This is especially true in
experimental research. When an investigator designs his research project, he most likely asks
himself the following questions: What statistical data have to be collected? What sampling
technique has to be used? What statistical methods have to be employed in the treatment of the
data? Unless these questions are answered satisfactorily, the design cannot be complete.
2. Statistical techniques help the researcher in determining the validity and reliability of his
research instruments. Validity and reliability are important characteristics of instruments used for
gathering data. Statistics is utilized to make research instruments valid and reliable.
3. Statistical manipulations organize raw data systematically to make them appropriate for study.
Raw data, unless organized in some way, have very little or no meaning at all. They must be
organized systematically so that they can be studied and inferences can be drawn out of them.
Setting them in tables is one example.
4. Statistical treatments give meaning and interpretation to raw data and hence, are used to test
the hypotheses. Statistical methods are usually used to determine whether the hypotheses which
are drawn up at the beginning of the study are true or not.
5. Statistical methods determine the levels of significance of the research findings. In many
studies, the levels of significance of findings are indispensable in drawing up inferences,
conclusions and other forms of generalizations.
All the facts stated above make statistics a very essential part of research.
DESCRIPTIVE AND INFERENTIAL STATISTICS
Example:
The tastes of two or three pieces of lanzones fruit from a basket are descriptive of these two or three
pieces. However, if we say that all the other pieces of lanzones fruit in the basket may have
the same taste as these two or three, then this is the concern of inferential statistics.
TYPES OF STATISTICS
Non-parametric statistics make fewer and weaker assumptions, such as:
1. The observations must be independent and the variable has underlying continuity.
2. The observations are measured in either the nominal or ordinal scale.
To have a better understanding of when to use parametric and non-parametric statistics, please refer to
the table below:
Inferential Techniques       Distribution                   Measurement
Parametric Statistics        Normal                         Interval or Ratio
Non-Parametric Statistics    Unknown or any distribution    Nominal or Ordinal
POPULATION AND SAMPLE as sources of data for the research/investigation
Population is an aggregate or set of all units/cases (may be people, things, events, etc.) being
studied that have at least one common characteristic.
Example:
1. The total number of carabaos in Barangay X.
2. All students of Notre Dame of Midsayap College during the second semester of SY 2007-
2008.
Sample is a subset of units/cases drawn or taken from a population.
Example:
1. Some carabaos in Barangay X.
2. Some students of Notre Dame of Midsayap College during the second semester of SY
2007-2008.
PARAMETER AND STATISTICS as descriptive measures in a research/investigation
Census is the process of gathering information from all the units of a population.
Sample survey is the process of obtaining data from only a part of the population.
CONSTANT AND VARIABLES
Constant is a quantity that takes on a single fixed numerical value; it does not change or
show differences in value.
Variable is a trait, attribute or property of things, persons or places that changes in quality,
quantity or magnitude.
Examples of variables are height, length, age, methods of teaching, efficiency.
ASSUMPTION and HYPOTHESIS
Variables can be assigned numerical values called variates. The kinds of variables depend on the
kind of numbers they can be assigned.
1. Discrete or Categorical Variables can be assigned counting numbers only as variates.
Example:
Household size
Number of times you visit your dentist per year
2. Continuous variable - can be assigned counting numbers, fractions or decimals, which are
results of measurement, as variates.
Example: Age, height, weight
TYPES OF DATA
Data are those that are manipulated or computed statistically. They are results of counting,
measurement or observation of variables.
1. Qualitative Data
Refer to attributes or characteristics of a population or a sample. These are facts for
which no numerical measure exists. They are usually expressed as categories. (Example:
Color of the skin, sex, religion)
2. Quantitative Data
These are the results of counting or measurement.
(Example: Monthly salary, grade, number of units enrolled)
MEASUREMENT SCALES/LEVELS OF MEASUREMENTS
In the two-way classification, an individual may be classified twice. For example, Peter is
classified as male under sex and, at the same time, he is classified under Yes, Neutral or No,
whichever his response is.
Example of two-way classification:
Example 1.
SEX       YES    NEUTRAL    NO    TOTAL
Male       20       10      30      60
Female     45       10      20      75
Total      65       20      50     135
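The row and column totals of the two-way table above can be checked with a short sketch. The counts are copied from the table; the variable names are only illustrative.

```python
# Counts copied from the two-way table: rows are sex, columns are responses.
table = {
    "Male": {"Yes": 20, "Neutral": 10, "No": 30},
    "Female": {"Yes": 45, "Neutral": 10, "No": 20},
}

# Row totals: how many respondents fall under each sex.
row_totals = {sex: sum(responses.values()) for sex, responses in table.items()}

# Column totals: how many respondents gave each response.
column_totals = {}
for responses in table.values():
    for answer, count in responses.items():
        column_totals[answer] = column_totals.get(answer, 0) + count

# Every individual is counted once per classification, so both sets of
# totals add up to the same grand total.
grand_total = sum(row_totals.values())
print(row_totals, column_totals, grand_total)
```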
The ordinal scale is the second level of measurement. Here, there is a logical ordering or
arrangement of the categories, aside from the categories being mutually exclusive. Measurement is the
same as in the nominal scale, where the number of objects in each category is counted. However, we
can discern which is the highest or lowest. For example, for rank in the military, we know that
private < corporal < sergeant < lieutenant, etc.
Example:
RANK          FREQUENCY
Private           20
Corporal          15
Sergeant          10
Lieutenant        25
Example:
Variable           Categories                       Data
Economic Status    Rich                             No. of rich people in town X – 2
                   Poor                             No. of poor people in town X – 200
Cause of death                                      No. of people who die of:
                   Heart Disease (1st)              Heart disease – 100
                   Cancer (2nd)                     Cancer – 80
                   Cerebrovascular Disease (3rd)    Cerebrovascular Disease – 50
Frequency                                           No. of responses:
                   Often (3)                        Often – 3
                   Sometimes (2)                    Sometimes – 6
                   Never (1)                        Never – 8
Interval scale is the third level of measurement. It possesses all the properties of the
preceding scales, with one additional property: the differences between the various levels of
categories on any part of the scale are equal.
A common variable measured on an interval scale is temperature. The difference between
temperatures of 65 and 68 is regarded as the same as the difference between temperatures of 13 and 16.
Here, zero is just another point on the scale. It does not mean that there is no temperature. In fact,
on the Celsius scale, zero is the freezing point of water.
Example:
Temperature reading – the zero-temperature reading is artificial because it does not
represent the total absence of heat.
Attitude scale
IQ score
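The equal-interval property and the artificial zero of temperature can be seen in a small sketch. The readings used here are arbitrary illustrative values, and the Celsius-to-Fahrenheit conversion is used only to show that the position of zero depends on the scale chosen.

```python
def celsius_to_fahrenheit(celsius):
    """Convert a Celsius reading to Fahrenheit."""
    return celsius * 9 / 5 + 32

# Equal differences stay equal on either scale (interval property).
diff_low_c = 20 - 10
diff_high_c = 30 - 20
diff_low_f = celsius_to_fahrenheit(20) - celsius_to_fahrenheit(10)
diff_high_f = celsius_to_fahrenheit(30) - celsius_to_fahrenheit(20)

# Ratios are not preserved, because zero is not a true zero point:
# 20 is "twice" 10 in Celsius, but the same two readings in Fahrenheit
# (68 and 50) are not in a 2-to-1 ratio.
ratio_c = 20 / 10
ratio_f = celsius_to_fahrenheit(20) / celsius_to_fahrenheit(10)

# Zero Celsius is just another point on the Fahrenheit scale.
zero_in_fahrenheit = celsius_to_fahrenheit(0)
print(diff_low_f, diff_high_f, ratio_c, ratio_f, zero_in_fahrenheit)
```

This is why ratios such as "twice as hot" are not meaningful for interval data, while differences are.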
Ratio scale is the highest level of measurement. All properties of the interval scale are
applicable in the ratio scale plus one additional property which is known as the "true zero point",
which reflects the absence of the characteristic measured.
For example, if the teacher in statistics gives a quiz and a student gets zero, then the student got
no correct answer (score = 0).
Example:
Scores in the test
Height
Weight
In Summary:
The nominal scale categorizes without order.
The ordinal scale categorizes with order.
The interval scale categorizes with order and establishes an equal unit in the scale.
The ratio scale categorizes with order, establishes an equal unit in the scale, and contains a
true zero point.
SAMPLING DESIGNS/TECHNIQUES
WHY SAMPLE?
In many fields of investigation, the researcher may use one of the two research designs illustrated
below.
Research Design I          Research Design II
Population                 Sample
Descriptive Statistics     Descriptive Statistics
                           Inferential Statistics
Parameter                  Statistics
Research design I requires the total units under investigation, known as the population.
The method of gathering the facts of interest on every unit of the population is called a census. It
is well known that it is not always possible to get timely, accurate and economical data by the use
of a census. The method that is widely used nowadays is sampling.
Sampling is the process of selecting a part, called a sample, from a given population with
the ultimate goal of making generalizations about unknown characteristics of the given population.
This is shown in research design II.
ADVANTAGES OF SAMPLING
1. Sampling enables the investigation of a large population.
When the population is too big, then it is almost impossible to collect data from all the
elements of the population.
2. Sampling reduces cost.
3. Sampling enables the completion of the study within a reasonable period of time.
4. Sampling avoids consuming all the sources of data.
Example:
If the population consists of the College of Health Sciences students enrolled this
semester with a size of 1600, what could be a good sample size for a survey involving these
students?
Solution.
Given: N = 1600
e = 5% (a number within the range of 1% to 10%)
Required: n = sample size
Equation:
$$
n = \frac{N}{1 + Ne^{2}} = \frac{1600}{1 + 1600(0.05)^{2}} = \frac{1600}{1 + (1600)(0.0025)} = \frac{1600}{1 + 4} = \frac{1600}{5} = 320
$$
Conclusion:
A group of 320 students from the CHS constitutes the sample.
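The computation above can be sketched in code. The formula n = N / (1 + N e^2) is the one used in the solution (it is often referred to as Slovin's formula). The function name and the choice to round up to the next whole unit are assumptions made for this sketch; exact fractions are used so that the result is not affected by floating-point rounding.

```python
import math
from fractions import Fraction

def sample_size(population_size, margin_of_error):
    """Return n = N / (1 + N * e**2), rounded up to a whole unit.

    Rounding up is an assumption of this sketch, not stated in the text.
    """
    e = Fraction(str(margin_of_error))  # e.g. 0.05 becomes exactly 1/20
    n = population_size / (1 + population_size * e ** 2)
    return math.ceil(n)

print(sample_size(1600, 0.05))
```

With N = 1600 and e = 5%, the function reproduces the sample size of 320 obtained in the solution.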
SAMPLING TECHNIQUES
Once the sample size is determined and the list of population elements is available, the
next question to answer is "How could the sample elements be selected from the population
elements?"
The basic principle to remember in the process of selecting is "the sample elements
should truly represent the population elements". This means that the sample
may or may not carry all the characteristics of the population, but the sample elements
should not contain any characteristic not found in the population elements.
1. Random Sampling is the method of selecting a sample of size n from a universe of size N
such that each member of the population has an equal chance of being included in the
sample and all possible combinations of size n have an equal chance of being
selected as the sample. There are several ways of drawing sample units at random. It
can be done by:
a. Lottery Sampling
b. Table of Random Numbers
c. Use of Calculators
a. Lottery Sampling. The lottery sampling method is usually carried out by assigning
numbers to each member of the population. For example, we may write down the names
of each member of the population on pieces of paper. These papers are then placed in a
box or lottery drum. The box or lottery drum must be shaken thoroughly to prevent
some pieces of paper from sinking to the bottom, where they would have less chance of
being drawn. From the box or lottery drum, the required number of sample units is
picked.
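The lottery method can be imitated in code as a rough sketch. Here Python's `random.sample` stands in for the shaken box: it draws without replacement and gives every member the same chance of being picked. The list of names, the sample size and the function name are hypothetical.

```python
import random

def lottery_sample(population, n, seed=None):
    """Draw n distinct members at random, like slips of paper from a box."""
    rng = random.Random(seed)  # a seed is optional and only makes the draw repeatable
    return rng.sample(population, n)

# Hypothetical population of 20 names written on slips of paper.
names = [f"Student {i}" for i in range(1, 21)]
drawn = lottery_sample(names, 5, seed=1)
print(drawn)
```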
b. Table of Random Numbers. The use of the Table of Random Numbers is another
example of random sampling. Under this technique, the selection of each member of the
population is left adequately to chance, and every member of the population has an equal
chance of being chosen.
c. Use of Calculators. Some calculators have a key labelled RAN that gives random
numbers. The numbers that appear when this key is pressed have three decimal places. If
the population is less than a hundred, select the first two digits after the decimal
point. The resulting numbers have either one or two digits. Consider the following example:
Ran#      Interpretation
0.185          18
0.284          28
0.678          67
0.726          72
0.410          41
0.014           1
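The interpretation rule in the table can be written as a short sketch: format the reading to three decimal places and keep the first two digits after the decimal point. The function name is hypothetical, and the readings are the ones listed in the table.

```python
def ran_to_number(ran_value):
    """Keep the first two digits after the decimal point of a Ran# reading."""
    digits = f"{ran_value:.3f}".split(".")[1]  # e.g. 0.185 gives "185"
    return int(digits[:2])                     # "18" gives 18, "01" gives 1

readings = [0.185, 0.284, 0.678, 0.726, 0.410, 0.014]
numbers = [ran_to_number(r) for r in readings]
print(numbers)
```

Working on the formatted digits rather than multiplying by 100 avoids small floating-point errors in the conversion.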
2. Systematic Sampling. This method uses prior knowledge of the individuals comprising a
universe with the aim of increasing the precision and representativeness of the sample.
When sample units are obtained by drawing every, say, 4th or 7th or 10th item on a list, the
process of selecting the sample is called systematic sampling.
To get the interval K, we use $K = \frac{N}{n}$. We usually pick a number from 1 to K as a random
start. All other sample numbers are readily obtained by adding K to the previous number.
Example:
Suppose a sample of 75 students is to be chosen from the population of 325 students.
Identify the sample numbers by systematic sampling with a random start.
Solution:
Step 1: $K = \frac{N}{n} = \frac{325}{75} = 4.33 \approx 4$
Step 2: Pick a starting point using lottery method (random start).
Step 3: Add K to get the next element. (Example 3+4=7, 7+4=11, 11+4=15).
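The three steps can be put together in a minimal sketch. The function name and parameters are hypothetical. The interval is rounded down, which matches 4.33 ≈ 4 in the example and keeps every selected number within the population list; this rounding rule is an assumption of the sketch.

```python
import random

def systematic_sample(population_size, sample_size, start=None, seed=None):
    """Return the sample numbers chosen by systematic sampling."""
    k = population_size // sample_size             # Step 1: interval K, rounded down
    if start is None:
        start = random.Random(seed).randint(1, k)  # Step 2: random start from 1 to K
    # Step 3: keep adding K to the previous number.
    return [start + i * k for i in range(sample_size)]

# Reproduces the example: N = 325, n = 75, random start assumed to be 3.
selected = systematic_sample(325, 75, start=3)
print(selected[:4])
```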
3. Stratified Sampling. In this method the population is first divided into groups,
based on homogeneity, in order to avoid the possibility of drawing samples whose
members come only from one stratum.
Stratified sampling is often used in polls of public opinion in order to secure
representative proportions of opinions coming from various classes of people.
Classifications may be based on districts, socio-economic status, sex, work, etc.
depending on the problem being studied.
Example: A stratified sample of size n=500 is to be taken from a population size of N=4000,
which consists of three strata of size N1=2000, N2=1,200 and N3=800. If the allocation is to be
proportional, how large a sample must be taken from each stratum?
Solution:
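$$
n_1 = \frac{2000}{4000} \times 500 = 250, \qquad n_2 = \frac{1200}{4000} \times 500 = 150, \qquad n_3 = \frac{800}{4000} \times 500 = 100
$$
$$
n_1 + n_2 + n_3 = 250 + 150 + 100 = 500 = \text{desired sample size}
$$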
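Proportional allocation can be sketched as follows. The function name is hypothetical. Each stratum receives the share (stratum size / population size) × n, as in the solution; rounding to the nearest whole unit is an assumption of this sketch, and with other inputs the rounded values may not add up exactly to n.

```python
def proportional_allocation(strata_sizes, sample_size):
    """Allocate the sample to the strata in proportion to their sizes."""
    population_size = sum(strata_sizes)
    return [round(size * sample_size / population_size) for size in strata_sizes]

# Reproduces the example: N1 = 2000, N2 = 1200, N3 = 800, n = 500.
allocation = proportional_allocation([2000, 1200, 800], 500)
print(allocation, sum(allocation))
```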
Select the needed number of clusters. Using the Table of Random Numbers, select 5
hospitals from the population list of 20 hospitals. Include all the members of the selected
clusters. Since there is an average of 40 nurses per hospital and we shall use only 5 hospitals, our
sample size of 200 nurses is completed.
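The selection of whole clusters can be sketched as below. The hospital and nurse labels are hypothetical, and each hospital is given exactly 40 nurses for simplicity, whereas the text only states an average of 40. Python's random module is used here in place of the Table of Random Numbers.

```python
import random

def cluster_sample(clusters, number_of_clusters, seed=None):
    """Pick whole clusters at random and include every member of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), number_of_clusters)
    members = [member for name in chosen for member in clusters[name]]
    return chosen, members

# Hypothetical population list: 20 hospitals with 40 nurses each.
hospitals = {
    f"Hospital {h}": [f"Nurse {h}-{i}" for i in range(1, 41)]
    for h in range(1, 21)
}
chosen_hospitals, sampled_nurses = cluster_sample(hospitals, 5, seed=0)
print(chosen_hospitals, len(sampled_nurses))
```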
2. Quota Sampling. This is a relatively quick and inexpensive method to operate. Each
interviewer is given definite instructions about the section of the public he is to question,
but the final choice of the actual persons is left to his own convenience or preference, and
is not predetermined by some carefully operated randomizing plan. Each interviewer then
proceeds to fill the prescribed quota. As the following example will show, this method
has its pitfalls. Suppose there is a survey to estimate what percent of the population of
Quezon City consider basketball as one of their favorite sports. One interviewer might
report that 100 percent of his quota of 70 people are basketball fans. However, it may
later be found out that this interviewer reached his quota by going to the Araneta
Coliseum to enjoy watching his favorite team compete and at the same time interview
some thrilled viewers during the game.
3. Convenience Sampling. A researcher might want to find out the popularity of a radio
program. Since the researcher has a telephone, he might simply use it and "randomly"
pick his samples from the telephone directory. This method, of course, is biased against
non-telephone users. Or a researcher might want to find out whether the production of
"bola-bola" or fish balls conforms to the minimum standards of health and safety. There
are hundreds of ambulant peddlers of this product. Thus, it is impossible for the
researcher to make a complete list, much less to interview all the producers and test all
their products. So, what the researcher can do is simply get samples of the product, say,
from the fish ball peddler near his school or near his residence.