The term statistics comes from the Latin word “status,” which means state (the Italian
“statista” means statesman). In its early days, statistics was widely used for the purposes of
governing the state, such as keeping figures on the geographical areas conquered and the
number of soldiers killed on the battlefield, and for purposes of taxation.
Achenwall
The first to introduce the word “statistiks” in a preface to a statistical
work.
Zimmerman and Sinclair
Introduced and popularized the name “statistics” in their books.
Girolamo Cardano
An Italian mathematician who wrote “Liber de Ludo Aleae,” where the
first study of the principles of probability appeared.
Blaise Pascal
Worked on the “Game of Points” that marked the beginning of the
mathematics of probability
De Moivre
Discovered the equation for normal distribution
Adolphe Quetelet
A Belgian astronomer who applied the theory of probability to
anthropology, psychology, and education.
Francis Galton
Developed the use of percentiles and worked with Charles Darwin in the
application of statistics to heredity and correlation theory.
Karl Pearson
Worked with Galton to develop regression and correlation theory and
sampling theory
Ronald Fisher
Introduced Fisher’s test (the F-test) used in the analysis of variance.
1. Descriptive Statistics seeks only to describe and analyze a given group without
drawing conclusions or inferences about a larger group.
2. Inferential Statistics seeks to draw conclusions or inferences about the larger
group based on a sample, that is, a subset of the larger group.
Definition of Terms:
It is important to know some of the terms that will be used in the study of statistics.
4. Measurement
It is a process of assigning values or score to persons or objects.
4.1 Nominal scale assigns numbers or other symbols to persons or objects, to be used
mainly for identification and classification purposes.
4.2 Ordinal scale places measurements into categories, each category indicating a different
level of the attribute being measured. Categories can be ordered, but the
distance between categories is undetermined.
4.3 Interval scale is a scale in which the distance between any two different numbers is of
known size. It does not have a meaningful zero point. A zero point is a
point that indicates the absence of what we are measuring.
Sampling
refers to the method of selecting a portion from the population under study.
1. Probability sampling gives every unit of the population a chance of being included
in the study.
1.1 Simple random sampling is the process of selecting a sample giving each
sampling unit an equal chance of being included in
the sample. This is the most commonly used method
and is basic to all sampling designs. This is the most
suitable method for homogeneous groups.
Procedure:
i. Number the units of the population consecutively from 1 to N.
ii. Determine the sampling interval (k) by the formula:
k = N/n where N = population size
n = sample size
iii. Use the table of random numbers to choose r, the first unit of
the sample.
The formula for obtaining the sample size (n) is Slovin’s Formula:
n = N/(1 + Ne^2) where n = sample size
N = population size
e = margin of error
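The sampling interval k = N/n and Slovin's formula n = N/(1 + Ne^2) can be sketched in Python. This is only an illustration under stated assumptions: the function names, the population of 1,000 units, the 5% margin of error, rounding the sample size up, taking the whole part of N/n as the interval, and using Python's random module in place of a table of random numbers are all choices made here, not part of the notes.

```python
import math
import random


def slovin_sample_size(N, e):
    """Slovin's formula n = N / (1 + N * e^2), rounded up (rounding is an assumption)."""
    return math.ceil(N / (1 + N * e ** 2))


def interval_sample(N, n, seed=None):
    """Select n units from a population numbered 1..N using a sampling interval k."""
    k = N // n                   # sampling interval k = N / n (whole part only)
    rng = random.Random(seed)    # stands in for the table of random numbers
    r = rng.randint(1, k)        # random start r, the first unit of the sample
    return [r + i * k for i in range(n)]


n = slovin_sample_size(1000, 0.05)       # hypothetical N = 1000, e = 0.05
units = interval_sample(1000, n, seed=1)
print(n)            # 286
print(units[:5])    # first five selected unit numbers
```

Every later unit is k positions after the previous one, so only the starting unit r is chosen at random.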
2. Non-probability sampling selects the sample in such a way that not all the units of the
population are given the chance of being selected; some have no chance at all.
2.2 Quota sampling chooses the sample based on the required number or
percentage of the population, the selection of which is not based
on randomization.
2.3 Convenience sampling selects the sample that can be easily picked and made
part of the group since the population is infinite.
COLLECTION OF DATA
Facts and data gathered from published or unpublished sources should be
accurate, timely, complete, and relevant to the problem.
Sources of Data
1. Survey Method
Data are obtained by asking people either directly (interview) or indirectly
(questionnaire) through the use of a schedule, that is, a set of questions.
1.1 Interview is a person-to-person exchange of data between the one supplying the data
(interviewee) and the one soliciting the data (interviewer). It is most
appropriate for revealing data on complex, emotionally laden topics or
sentiments underlying an expressed opinion.
1.2 Questionnaire elicits responses by way of a set of questions that are usually
mailed (snail mail or electronic mail).
1.2.1 Confidential data are usually collected by questionnaire
1.2.2 Respondent can accomplish the questionnaire at his most convenient time.
1.2.3 Covers wide geographical area.
Types of Questions:
2. Open-ended questions permit free response by merely raising the issue without
providing any structure for the respondent's reply.
1. Questions must be simple and clear in order to obtain accurate information. Good
questions result in a greater degree of precision. A question like “How much do you
drink?” is not clear to respondents; it may have several meanings.
2. Questions must be objective. A question like “Why do you like to study in UST?”
must be rephrased so that it does not put the answer into the subject’s
response.
3. Questions must always state the precise units in order to facilitate the presentation
of data.
2. Observation Method
Data pertaining to the behavior of an individual or a group of individuals during the
occurrence of a particular event or situation are best obtained through observation. This method
is limited to the time of occurrence of the event.
Types of Observations
2.1 Participant observation – the observer joins the group as a participating member,
actively or passively.
2.2 Non-participant observation – the observer stays outside of the group, whether his
presence is known or unknown.
3. Experimental Method
A method designed for collecting data under controlled conditions that usually
establishes a causal relationship.
ORGANIZATION OF DATA
FREQUENCY DISTRIBUTION
Frequency distribution is the method of organizing and summarizing statistical data in
tabular form.
Class size is also called the size of the class interval. It is obtained by getting the difference
between successive upper or lower class limits or boundaries.
Class mark is the midpoint of the class interval.
CM = (ucl + lcl) / 2
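As a small worked illustration of class size and class mark, the sketch below uses three made-up class intervals; the interval limits are hypothetical and serve only to show the arithmetic.

```python
# Hypothetical class intervals as (lower class limit, upper class limit).
classes = [(10, 14), (15, 19), (20, 24)]

# Class size: difference between successive lower class limits.
class_size = classes[1][0] - classes[0][0]

# Class mark: midpoint of each interval, CM = (ucl + lcl) / 2.
class_marks = [(ucl + lcl) / 2 for lcl, ucl in classes]

print(class_size)    # 5
print(class_marks)   # [12.0, 17.0, 22.0]
```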
1. Determine the range, which is the difference between the highest and lowest values.
a. The number of classes should be no fewer than 6 and no more than 16 (6 ≤ K ≤ 16):
not so many that many classes are empty, and not so few that observations are lumped
together and too much information is lost.
b. Each observation should fall into one and only one class interval. Sturges' approximation is
only a guideline, not an inflexible rule.
K = no. of classes (approximate)
K = 1 + 3.322 log n where n = number of observations
4. Determine a number less than or equal to the lowest score that is divisible by the size of the class
interval.
7. Get the sum of the frequency column and check it against the total number of observations.
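The steps listed above can be sketched in Python as follows. This is a rough illustration, not the procedure of the notes in full: the 20 scores are invented, the scores are assumed to be whole numbers, and the class size is assumed to be the range divided by K, rounded up, since the steps shown here do not state how it is chosen.

```python
import math


def frequency_distribution(scores):
    """Group whole-number scores into class intervals and count each class."""
    n = len(scores)
    value_range = max(scores) - min(scores)        # highest minus lowest value
    k = round(1 + 3.322 * math.log10(n))           # Sturges' approximation (guideline only)
    size = max(1, math.ceil(value_range / k))      # class size: an assumed rule
    lower = (min(scores) // size) * size           # <= lowest score, divisible by size
    table = []
    while lower <= max(scores):
        upper = lower + size - 1
        freq = sum(lower <= x <= upper for x in scores)   # each score lands in one class
        table.append((lower, upper, freq))
        lower += size
    return table


# Hypothetical scores, for illustration only.
scores = [52, 55, 58, 61, 63, 64, 67, 68, 70, 71,
          72, 74, 75, 77, 79, 81, 84, 86, 90, 95]

table = frequency_distribution(scores)
for lower, upper, freq in table:
    print(f"{lower}-{upper}: {freq}")

# Final check: the frequency column must add up to the number of observations.
total = sum(freq for _, _, freq in table)
print(total == len(scores))   # True
```

Because the first lower limit is moved down to a multiple of the class size, the table can end up with one class more than Sturges' approximation suggests.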
PRESENTATION OF DATA
Data must be presented in the most understandable form that shows significant
characteristics.
1. Textual Form is summarizing the data in paragraph form. It is the simplest and the
most appropriate approach when there are only a few numbers to
be presented. When a large amount of quantitative data is included in the
text or paragraph, the presentation becomes almost
incomprehensible.
2. Tabular Form is arranging and presenting data in rows and columns so that the
reader may easily compare and analyze. This method facilitates
the comparison of various figures under the different categories.
3.1 Bar Graph. This consists of bars or heavy lines of equal width, either all
vertical or all horizontal. The length of each bar represents the
magnitude of the quantity being compared.
3.2 Line Graph. This graph shows the relationship between two or more sets
of quantities and is usually used to highlight the effect of time in
given data.
3.3 Pie Chart. This is used to represent quantities that make up a whole. The
diagram is a circle cut into sectors, with the size
of each sector proportional to the component it represents.
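To illustrate how the sectors of a pie chart relate to the parts of a whole, the sketch below converts each component into its share of the total and the matching central angle out of 360 degrees. The budget figures are made up for the example.

```python
# Hypothetical monthly budget; each component becomes one sector of the pie.
budget = {"Food": 4000, "Rent": 5000, "Transport": 1000}

total = sum(budget.values())

# Central angle of each sector: its share of the whole times 360 degrees.
angles = {item: amount / total * 360 for item, amount in budget.items()}

for item, angle in angles.items():
    share = budget[item] / total
    print(f"{item}: {share:.0%} of the whole, {angle:.0f} degrees")
```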
Definition of Terms:
1. For every event A, 0 ≤ P(A) ≤ 1; that is, the probability of any event is a real number
between 0 and 1, inclusive.
2. P(S) = 1 and P(Ø) = 0
3. If A1, A2, A3, ..., An are mutually exclusive events (mutually disjoint sets), then:
P(A1 ∪ A2 ∪ ... ∪ An) = Σ (i = 1 to n) P(Ai) = P(A1) + P(A2) + P(A3) + … + P(An)
This method is easy to employ when the sample space S… that are equally likely
or equiprobable.
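For a sample space of equally likely outcomes, the probability of an event can be taken as the number of outcomes in the event divided by the number of outcomes in S. The sketch below checks the three properties of probability on that basis; the fair die and the events A1 and A2 are assumed examples, not taken from the notes.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}     # sample space: one roll of a fair die (assumed example)


def prob(event):
    """P(A) = (outcomes in A) / (outcomes in S), for equally likely outcomes."""
    return Fraction(len(event & S), len(S))


A1 = {1, 2}                # rolling a 1 or a 2
A2 = {6}                   # rolling a 6; A1 and A2 are mutually exclusive

print(0 <= prob(A1) <= 1)                      # True
print(prob(S), prob(set()))                    # 1 0
print(prob(A1 | A2) == prob(A1) + prob(A2))    # True
```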
Example:
A researcher studied the relationship between the salary of a working woman with
school-aged children and the number of children she had.
Definition:
Permutation: nPr = n! / (n – r)!
Combination: nCr = n! / [r!(n – r)!]
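The standard permutation and combination formulas, nPr = n!/(n – r)! and nCr = n!/[r!(n – r)!], can be checked numerically. The values n = 5 and r = 2 below are hypothetical, and the cross-check relies on math.perm and math.comb, which are available in Python 3.8 and later.

```python
import math

n, r = 5, 2   # hypothetical values: arrange or choose 2 objects out of 5

# nPr = n! / (n - r)!  counts ordered arrangements.
permutations = math.factorial(n) // math.factorial(n - r)

# nCr = n! / (r! (n - r)!)  counts selections where order does not matter.
combinations = math.factorial(n) // (math.factorial(r) * math.factorial(n - r))

print(permutations)   # 20
print(combinations)   # 10

# Cross-check against the standard library (Python 3.8+).
print(permutations == math.perm(n, r), combinations == math.comb(n, r))   # True True
```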
_____________________________________________________________________________________________________________________
Prepared By: Doxa Dave Rotap
B.S. Microbiology 2013