
STATISTICS

Statistics is a branch of science that deals with the collection, tabulation or presentation,
analysis, and interpretation of numerical or quantitative data.

2 Fields of Statistics

1. Descriptive Statistics is concerned with the gathering, classification, and presentation of
data and the collection of values to describe group characteristics of the given data.
Examples of this are measures of central tendency, variability, skewness, kurtosis, etc.

2. Inferential Statistics aims to give information about large groups of data (population)
without dealing with each and every element of these groups. It uses only a small but
representative portion (sample) of the total set of data in order to draw conclusions or
judgments regarding the entire set of data. Examples of this are sampling/sampling
distribution, estimation, and testing of hypotheses using the z-test, t-test, chi-square test,
F-test, and ANOVA, among others.

Historically, the modern science of statistics traces its origins to two diverse interests of
man: politics (in political states) and entertainment (in games of chance). Statistics was
described as the study of the political arrangements of the modern states of the known world.
It was derived from the Italian word statista meaning statesman (one who is well versed with
public affairs). Achenwall (1749) first used the word statistics, defining it as "the political
science of several countries."

In the early 16th century, games of chance gave rise to the development of the principles of
probability. Problems on how to increase their chances of winning were posed by gamblers, who
called upon mathematicians to provide them with optimum strategies for playing various
games of chance. The answers given by mathematicians such as Pascal, Fermat, Leibniz,
Cardano, Bernoulli, and others became the basis of modern statistical theory. These
represented the beginnings of the mathematics of probability. During the 19th century, Adolphe
Quetelet applied statistical methods in the fields of education and sociology, and demonstrated
that statistical techniques derived in one area of research are also applicable in other areas.
Thus, Quetelet is known as the "Father of Modern Statistics".

Methods of Data Presentation

Data may be presented in three main forms, namely, textual, tabular, and graphical. A
graph or chart may be a bar graph, line graph, pie chart, pictograph or statistical map.

1. Textual form is used in presenting data in paragraph or narrative form. It is simple and
appropriate only when there are few numbers to be presented.

Example:

Educational Attainment of PASUC and UP Teachers (from the dissertation of Dr. Maria Aracena
B. Lubrica entitled Promotion and Merit Schemes of SUC in CAR, 1996)

The PASUC and the UP systems are comparable in terms of the educational level
reached by the respondents. Most of the teachers are master's degree holders, that is, 41 out
of 75 or 54.67% in PASUC and 18 out of 32 or 56.25% in UP.

2. Tabular form is a systematic way of arranging data in columns and rows according to
classifications or categories. Creating a statistical table is a very effective and efficient means of
organizing and summarizing data because a lot of information can be seen from a single table.
It can show a comparison of figures under each category.

A statistical table must consist of the following parts:

a. the table heading, which includes the table number and the title;

b. the body, which is the main part of the table containing the figures being presented;

c. the stubs or classes, which are the categories describing the data, usually found at the
left-hand side of the table; and

d. the captions, which are designations of the information contained in a column. Captions
are usually found at the top of the column.


Example:

Table 1: Educational Attainment of PASUC and UP Teachers

Educational Attainment    PASUC Teachers    UP Teachers
College                   23 (30.67%)       6 (18.75%)
Master’s                  41 (54.67%)       18 (56.25%)
Doctorate                 11 (14.67%)       8 (25%)
TOTAL                     75 (100%)         32 (100%)
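The percentages in Table 1 can be checked with a short script. This is only a sketch: the counts are copied from the table, while the dictionary layout and the function name percentages are assumptions made for illustration.

```python
# Counts copied from Table 1; the dictionary layout is an assumption of this sketch.
counts = {
    "PASUC": {"College": 23, "Master's": 41, "Doctorate": 11},
    "UP": {"College": 6, "Master's": 18, "Doctorate": 8},
}


def percentages(group):
    """Return each category's share of the group total, rounded to 2 decimals."""
    total = sum(group.values())
    return {level: round(100 * n / total, 2) for level, n in group.items()}


for name, group in counts.items():
    # Print the group name, its total, and the percentage for each category.
    print(name, sum(group.values()), percentages(group))
```

Each percentage is the category count divided by the column total, which is why every column of the table sums to about 100%.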

3. Graphical form is a third way of presenting data, by means of a graph. A graph or chart is a
pictorial presentation of a set of data. It shows a general situation at a glance. Each graph or chart
must have a figure number and a title. If the data are taken from another source, a source note
should be included.

The following are some of the most commonly used graphs:

a. Bar graph represents the frequencies or magnitudes of quantities in a set of categories.
In a bar graph, bars rise vertically from the horizontal axis and the height of each bar is
proportional to the frequency or magnitude of its corresponding category. It may be
simple or compound and can be vertically or horizontally arranged. It is used for both
qualitative and quantitative data.

b. Line graph is obtained by plotting the points representing the frequencies of every
category on the horizontal axis and then joining the points with straight lines (or broken
lines). A line graph is used to show fluctuations and trends in the components of the total
quantity over a period of time, or patterns of change in the data. It is also used for both
qualitative and quantitative data.

c. Pie diagram is simply a circle divided into slices which represent various categories. It
should be drawn in such a way that the size of each slice is proportional to the
percentage corresponding to its category. Pie charts are useful whenever the objective
is to display the components of a whole entity in a manner that indicates their relative
sizes.

d. Pictograph makes use of symbols and is used to compare a few discrete data usually of
one kind.

e. Statistical maps show geographical locations and may contain different symbols. A
legend which tells what the symbols represent must be included.

f. Histogram is a bar graph associated with a Frequency Distribution Table (refer to the
definition of an FDT below). It is constructed by marking off the true class boundaries
along the horizontal axis and erecting over each class interval a rectangle whose height
is equal to the frequency of that class. It makes the information in a frequency
distribution easier to understand and more visually appealing. A histogram quickly
reveals the general pattern or distribution of values.

g. Frequency polygon is a simple line graph joining the midpoints of the tops of the bars of a
histogram. The midpoint of each class is plotted against the frequency for that class.

h. Ogive is a graphical presentation of the cumulative frequency of an FDT. The less-than
ogive is constructed by plotting the less-than cumulative frequency of each class against
the upper true class boundary of the corresponding class. The points representing the
cumulative frequencies are then joined by straight lines. The greater-than ogive makes
use of the greater-than cumulative frequency of each class, which is plotted against the
lower true class boundary of the corresponding class.

FREQUENCY DISTRIBUTION TABLE (FDT)
A frequency distribution is a tabular presentation of qualitative or quantitative data
grouped into categorical or non-overlapping numerical intervals called classes, together with
the number of observations in each class. When the frequency distribution is grouped
according to some categorical or non-numerical value, it is called a qualitative frequency
distribution. When the frequency distribution is grouped according to numerical intervals, it is
called a quantitative frequency distribution. It is a simple, yet effective method of organizing
and presenting numerical data so that one can grasp an overall picture of the distribution,
whether measurements are concentrated or spread out.

Steps in Constructing an FDT (Square Root Method)

(1) Determine the range R for the set of observations.

Range (R) = Highest Value - Lowest Value

(2) Determine the approximate number of class intervals, k.

k = √n, where n is the number of observations

(3) Obtain the class width, c.

c = R / k

(4) Each class interval (CI) includes a lower limit and an upper limit. The first lower limit is the
lowest value in the data. The next lower limits can be determined by adding the class width c
successively.

The upper limits can be determined using the formula

upper limit = lower limit + c - 1 unit measure.

A unit measure refers to the indicated place value at which the raw data are rounded off.
[Example: 3.4 (0.1 unit measure), 24 (1 unit measure), 8.279 (0.001 unit measure), 3.45 (0.01
unit measure)]
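The four steps above can be sketched as a small function. It is an illustration only: it assumes whole-number data (a unit measure of 1), rounds k and c to the nearest whole number, and keeps adding classes until the highest value is covered. The function name class_intervals and the sample values are hypothetical.

```python
import math


def class_intervals(values):
    """Return (lower limit, upper limit) pairs built by the square root method.

    Assumes whole-number data, so the unit measure is 1.
    """
    low, high = min(values), max(values)
    data_range = high - low                   # step 1: range
    k = round(math.sqrt(len(values)))         # step 2: approximate number of classes
    c = max(1, round(data_range / k))         # step 3: class width, at least 1

    intervals = []
    lower = low                               # step 4: first lower limit is the lowest value
    while lower <= high:                      # keep adding classes until the data are covered
        intervals.append((lower, lower + c - 1))
        lower += c
    return intervals


# Hypothetical sample of nine whole-number observations.
sample = [2, 4, 4, 5, 7, 9, 10, 12, 13]
print(class_intervals(sample))  # [(2, 5), (6, 9), (10, 13)]
```

Because k and c are rounded, the number of classes actually produced may differ from k; this is why k is described as only an approximate number of class intervals.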

Example: Given the midterm exam scores of 50 students, construct the FDT.

9, 14, 18, 24, 29, 23, 18, 13, 9, 5, 10, 16, 20, 25, 26, 21, 16, 10, 6, 29, 29, 18, 8, 13, 18, 28, 18, 13,
17, 12, 17, 28, 17, 12, 7, 10, 16, 26, 11, 16, 27, 16, 11, 6, 12, 17, 27, 17, 12, 22

(1) Range = 29 - 5 = 24

(2) k = √50 = 7.07 or 7

(3) c = 24 / 7 = 3.4 or 3

(4) 1st lower limit = 5; 1st upper limit = 5 + 3 – 1 = 7
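
The computations above can be checked against the 50 scores with a short script. This is a sketch, not part of the original example: the list name scores and the tallying approach are assumptions, and the stray periods in the printed data are read as separators between scores.

```python
import math
from collections import Counter

# The 50 midterm scores from the example.
scores = [
    9, 14, 18, 24, 29, 23, 18, 13, 9, 5, 10, 16, 20, 25, 26, 21, 16, 10, 6, 29,
    29, 18, 8, 13, 18, 28, 18, 13, 17, 12, 17, 28, 17, 12, 7, 10, 16, 26, 11, 16,
    27, 16, 11, 6, 12, 17, 27, 17, 12, 22,
]

lowest = min(scores)
data_range = max(scores) - lowest          # 29 - 5 = 24
k = round(math.sqrt(len(scores)))          # sqrt(50) = 7.07, taken as 7
c = round(data_range / k)                  # 24 / 7 = 3.43, taken as 3

# Tally each score into its class by its distance from the first lower limit.
tally = Counter((x - lowest) // c for x in scores)

fdt = []
for i in range(max(tally) + 1):            # include any empty class in between
    lower = lowest + i * c
    fdt.append((lower, lower + c - 1, tally[i]))

for lower, upper, f in fdt:
    print(f"{lower}-{upper}: {f}")
```

With the width rounded down to 3, nine classes are needed to reach the highest score of 29, rather than the approximate value k = 7.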

Below is the FDT of the given example.

Class Interval (CI)    Frequency (F)
5-7                    4
8-10                   6
11-13                  9
14-16                  6
17-19                  10
20-22                  3
23-25                  3
26-28                  6
29-31                  3
TOTAL (N)              50

Other Information Related to the FDT

(a) True Class Boundaries (TCB) - "real" limits of each CI, where LL and UL are the lower and
upper limits of the CI

Lower TCB = LL - 1/2 unit measure        Example: LTCB = 5 - 1/2 (1) = 4.5

Upper TCB = UL + 1/2 unit measure        Example: UTCB = 7 + 1/2 (1) = 7.5

(b) Class Mark (CM) - midpoint of each CI

CM = 1/2 (LL + UL)        Example: CM = 1/2 (5 + 7) = 6

(c) Relative Frequency (RF) - ratio of the frequency f of each CI to the total number of
observations N

RF = f / N or (f / N) x 100%        Example: RF = 4 / 50 = 0.08 or 8%

(d) Cumulative Frequency (CF)

<CF (less than) - obtained by summing up the frequencies starting with the frequency of the
lowest-valued CI.

>CF (greater than) - obtained by summing up the frequencies starting with the frequency of the
highest-valued CI.

FDT with the other columnar information

CI           F     TCB          CM    RF (%)   <CF   >CF
5-7          4     4.5-7.5      6     8        4     50
8-10         6     7.5-10.5     9     12       10    46
11-13        9     10.5-13.5    12    18       19    40
14-16        6     13.5-16.5    15    12       25    31
17-19        10    16.5-19.5    18    20       35    25
20-22        3     19.5-22.5    21    6        38    15
23-25        3     22.5-25.5    24    6        41    12
26-28        6     25.5-28.5    27    12       47    9
29-31        3     28.5-31.5    30    6        50    3
TOTAL (N)    50                       100
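
The extra columns of this table can be reproduced from the class limits and frequencies alone. The script below is a sketch: the class limits and frequencies are copied from the FDT, while the names classes, rows and UNIT are assumptions made for illustration. It also lists the points that would be plotted for a frequency polygon and for the two ogives.

```python
# Class limits and frequencies copied from the FDT of the example.
classes = [(5, 7, 4), (8, 10, 6), (11, 13, 9), (14, 16, 6), (17, 19, 10),
           (20, 22, 3), (23, 25, 3), (26, 28, 6), (29, 31, 3)]
UNIT = 1  # the scores are whole numbers, so the unit measure is 1

n = sum(f for _, _, f in classes)          # total number of observations, N
rows = []
less_cf = 0
for lower, upper, f in classes:
    less_cf += f                           # running total from the lowest class
    rows.append({
        "CI": (lower, upper),
        "F": f,
        "TCB": (lower - UNIT / 2, upper + UNIT / 2),
        "CM": (lower + upper) / 2,
        "RF": 100 * f / n,                 # relative frequency in percent
        "<CF": less_cf,
        ">CF": n - less_cf + f,            # observations in this class and above
    })

# Points to plot: class mark vs. frequency, and true class boundary vs. CF.
polygon_points = [(r["CM"], r["F"]) for r in rows]
less_than_ogive = [(r["TCB"][1], r["<CF"]) for r in rows]
greater_than_ogive = [(r["TCB"][0], r[">CF"]) for r in rows]

for r in rows:
    print(r)
```

The greater-than cumulative frequency is computed here as N minus the less-than cumulative frequency of the classes below, which gives the same values as summing the frequencies from the highest-valued class downward.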
