
Statistics

Statistics – a scientific body of knowledge that deals with the collection, organization or
presentation, analysis, and interpretation of data.

CATEGORIES OF STATISTICS

1. Descriptive Statistics – a statistical procedure concerned with describing the characteristics
and properties of a group of persons, places, or things.

Example: We may describe a collection of persons by stating how many are poor and how many
are rich, how many fall into various categories of age, height, civil status, IQ, and many more.

2. Inferential Statistics – a statistical procedure that is used to draw inferences or information
about the properties or characteristics of a large group of people, places, or things on the basis of
the information obtained from a small portion of that large group. It is concerned with reaching
conclusions when the information available is incomplete, so generalizations are reached based
on the data available.

Example: As a result of the increase in the number of patients in a hospital this week because of
a certain disease, it is expected that the number of patients will double next week.

1. Population – as used in statistics, refers to a group or aggregate of people, objects, or
events.

Biological population - A group of individuals belonging to the same species.

Example: The population of WMSU students.

2. Sample – a collection of some elements in a population.

Ex. BSED students.

3. Data – the information we gather about the sample or the population.

4. Parameter – data obtained about the population. Ex. If the researcher uses the whole
population (N = 1500), then the average income obtained is called a parameter.

5. Statistic – data obtained about a sample. Ex. If the researcher makes use of a sample (n = 200),
then the average income obtained is called a statistic.

6. Constant – a property or characteristic of a population or sample which makes the members of
the group similar to each other. Ex. If a class is composed of all boys, then gender is a constant.

7. Variable – a characteristic or property of a population or sample which makes the
members different from each other. Ex. If a class is composed of boys and girls, then gender is a
variable.

8. Qualitative data – data that express attributes. These are sometimes called
categorical data. Data falling in this category cannot be subjected to meaningful arithmetic
operations.

9. Quantitative data – data which are numerical in nature. These data are obtained from counting
and measuring. In addition, meaningful arithmetic operations can be performed with this type of data.

Example

a) Qualitative data – sex (male or female), attitude (favorable or not favorable), emotional condition
(happy or sad)

b) Quantitative data

- involve numbers and the result of counting or measuring

Discrete numbers are those obtained through counting.

Continuous numbers are the result of measurement.

Example(Discrete numbers)

Number of barangays

Number of teachers in a school

Number of students in a class

Note. For discrete numbers, decimals have no meaning; for example, we say 100 families, not 100.47 families.

Example (Continuous numbers)

Heights and weights of students in a class, temperature in a locality

Note. For continuous numbers, decimals have meaning.

Classification of variables

A.) According to functional relationship

1.) A dependent variable is a variable which is affected or influenced by another variable.

2.) An independent variable is one which affects or influences another variable.

B.) According to continuity of values

1.) A discrete variable is one that can assume a finite number of values. In other words, it can
assume specific values only. These values are obtained through the process of counting.

2.) A continuous variable is one that can assume infinitely many values within a specified interval.
The values of a continuous variable are obtained through measuring.

C.) According to scale of measurement

1.) Nominal scale – this is the most primitive level of measurement. It is used when we want to
distinguish one object from another for identification purposes. Ex. zip codes, credit card
numbers, gender.

2.) Ordinal scale – data are arranged in some specific order or rank. When objects are measured
at this level, we can say that one is better or greater than the other, but we cannot tell how much
more or how much less one object is than the other. Ex. the rank of a contestant in a beauty contest.

3.) Interval scale – if data are measured at this level, we can say not only that one object is greater
than or less than the other, but we can also specify the amount of difference. To illustrate, suppose
Maria got 50 in a math quiz while Martha got 40; we can say that Maria scored higher than Martha by
10 points.

4.) Ratio scale – this level of measurement is like the interval level. The only difference is that the
ratio level always starts from an absolute zero point. In addition, data at this level usually carry a
unit of measure. If data are measured at this level, we can say that one object is so many times as
large or as small as the other. Ex. Suppose Mrs. Reyes weighs 50 kg, while her daughter weighs 25 kg.
We can say that Mrs. Reyes is twice as heavy as her daughter. Thus, weight is an example of data
measured on the ratio scale.

Exercise 1.1

A. Indicate whether the data represented in each of the following are part of a population or a
sample.

1. Twenty-five cases of TB have been reported in the past year, and a study is to be carried out
using data from all 25 cases.

2. A total of 338 chest x-rays were performed during the past months. A quality control review is to
be carried out on 10% of the group.

B. Tell whether the following situations will make use of descriptive or inferential statistics.

1. A teacher computes the average grades of her students and determines the top 10 students.

2. A school administrator forecasts the future expansion of a school.

3. A researcher investigates the effectiveness of a beauty product.

C. Indicate whether the following represent qualitative or quantitative data.

1. Place of birth

2. Type of insurance

3. Number of students admitted

D. Determine whether the numbers obtained in the following variables are discrete or continuous.

1. Spots on a die

2. Passing score in the bar examination

3. Distance of the towns in a province from the capital

4. Number of domestic animals in a barangay

5. Weights of infants at birth in a particular area

6. Average temperature of a place in one year

7. Grade point average of students in a certain class


8. Books in a library

9. Height of basketball players

10. Holidays in a school year

E. Determine the scale of measurement of the following.

1. Weight 2. Educational level 3. License 4. Placement in a 100-m dash 5. Civil status

Data Gathering Techniques

I. Collecting Data

2 Sources of data

1. Primary sources of data are government institutions, business agencies, and other
organizations.

Examples: 1. Data gathered from the Philippine Statistics Authority (PSA).

2. Information derived from personal interviews.

Various Ways of Collecting Data

1. The Direct or Interview Method – the researcher or interviewer has direct contact with the
interviewee. The researcher obtains the information needed by asking the interviewee questions.
In this method, the researcher can get more accurate answers, since the interviewer can clarify a
question if the respondent does not understand it. However, this method is costly and
time-consuming.

Ex. A business firm would interview residents of a certain barangay regarding their favourite
brand of toothpaste.

2. The Indirect or Questionnaire Method – this method makes use of a questionnaire. The researcher
distributes the questionnaire to the respondents either by personal delivery or by mail. In this
method, the researcher can save a lot of time and money because questionnaires can be given
to a large number of respondents at the same time. However, the researcher cannot expect that all
questionnaires will be answered, because some of the respondents simply ignore them. In addition,
no clarification can be given to a respondent who does not understand a question.

3. The Registration Method – this method of gathering data is enforced by certain laws.

Ex. Registration of births, deaths, or vehicles.

4. The Experimental Method – this method is usually used to find out the cause-and-effect
relationship of certain phenomena under controlled conditions. Scientific researchers often use this
method.

SAMPLING TECHNIQUES

This is the procedure used to determine the individuals or members of a sample.

2 Types of Sampling Techniques

1. Probability sampling – a sampling technique wherein each member or element of the population
has an equal chance of being selected as a member of the sample.

Several Probability Sampling Techniques

a.) Random Sampling – the procedure by which all the members of the population have an
equal chance of being selected.

It can be performed by the “fish bowl” or lottery method, wherein each individual
in a population is assigned a number and lots are drawn to determine which individuals will
be included in the sample; by using a table of random numbers; or by using a calculator with a
function key labeled RAN that gives random numbers.
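The lottery method can be imitated on a computer. The sketch below is only an illustration, not part of the original notes: the function name `lottery_sample` and the class size of 40 are assumptions, and Python's standard `random` module plays the role of the fish bowl.

```python
import random

def lottery_sample(population_size, sample_size, seed=None):
    """Draw a simple random sample of member numbers 1..population_size."""
    rng = random.Random(seed)  # a fixed seed makes the draw repeatable
    members = range(1, population_size + 1)
    # random.sample draws without replacement, so no member is picked twice
    # and every member has the same chance of being drawn.
    return sorted(rng.sample(members, sample_size))

# Hypothetical use: pick 5 students out of a class of 40.
print(lottery_sample(40, 5, seed=1))
```

Each run with a different seed (or no seed) gives a different sample, which is the computer counterpart of drawing new lots.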

b.) Systematic Sampling – involves taking every kth element in a population as part of the
sample. The starting point is determined by the nature of the population or is selected at
random.

Ex. Mrs. Cruz wants to select 5 students out of her 40 students. First, we find the sampling
interval k by dividing the number of members in the population by the number of
members in the sample. Hence, in our case, k = 40/5 = 8. The next step is to select a random
starting point: write the numbers 1, 2, 3, 4, 5, 6, 7, and 8 on pieces of paper and draw one number
by lottery. If we draw 5, this means that we will select the 5th student in every group of 8, that is,
the 5th, 13th, 21st, 29th, and 37th. If, for instance, we pick 7, then the members of the sample will
be the 7th, 15th, 23rd, 31st, and 39th.
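The selection in Mrs. Cruz's example can be sketched in a few lines of Python. The function name `systematic_sample` is a hypothetical choice, and the sketch assumes the population size is an exact multiple of the sample size, as it is here (40 and 5).

```python
def systematic_sample(population_size, sample_size, start):
    """Return the positions chosen by systematic sampling.

    The interval is population_size // sample_size; `start` is the
    randomly drawn starting point, which must lie between 1 and the interval.
    """
    interval = population_size // sample_size
    if not 1 <= start <= interval:
        raise ValueError("start must be between 1 and the sampling interval")
    # Take the start-th member of every successive group of `interval` members.
    return [start + j * interval for j in range(sample_size)]

print(systematic_sample(40, 5, start=5))  # [5, 13, 21, 29, 37]
print(systematic_sample(40, 5, start=7))  # [7, 15, 23, 31, 39]
```

The two printed lists match the two draws worked out in the example.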

c.) Stratified Random Sampling – the word stratified comes from the root word strata, which
means groups or categories (singular form: stratum). There are instances when the members of
the population do not belong to the same category or group. When we use this method, we
are actually dividing the elements of a population into different categories or subpopulations,
and the members of the sample are then drawn at random from each subpopulation.

Ex. Suppose a community consists of 5000 families belonging to different income
brackets. We will draw 200 families as our sample using stratified random sampling. Below
are the subpopulations and the corresponding number of families belonging to each stratum.
Strata            Number of Families
High-income       1000
Average-income    2500
Low-income        1500

First, find the percentage of each stratum by dividing the number of families in each stratum by the
total number of families. Then multiply each percentage by the desired number of families in the
sample. The table below shows how it is done.

Strata     No. of Families   Percentage                 No. of Families in the Sample
High       1000              1000/5000 = 0.2 or 20%     0.2 x 200 = 40
Average    2500              2500/5000 = 0.5 or 50%     0.5 x 200 = 100
Low        1500              1500/5000 = 0.3 or 30%     0.3 x 200 = 60
Total      N = 5000                                     n = 200
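The allocation in the table can be reproduced with a short Python sketch. The function name `proportional_allocation` is hypothetical, and the sketch simply rounds each share to the nearest whole family; with other figures the rounded shares might not add up exactly to n, which this sketch does not try to correct.

```python
def proportional_allocation(strata, sample_size):
    """Split sample_size among the strata in proportion to their sizes."""
    total = sum(strata.values())
    # Share of each stratum = (stratum size / total) * sample size,
    # computed as size * sample_size / total and rounded to a whole number.
    return {name: round(size * sample_size / total)
            for name, size in strata.items()}

families = {"High": 1000, "Average": 2500, "Low": 1500}
print(proportional_allocation(families, 200))
# {'High': 40, 'Average': 100, 'Low': 60}
```

Once the number of families per stratum is known, the families themselves would still be drawn at random within each stratum.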

d.) Cluster Sampling – a sampling technique wherein groups or clusters, instead of individuals, are
randomly chosen. This is sometimes called area sampling because it is usually applied when the
population is large.

Ex. Suppose we want to find the average income of the families in Zamboanga City. Assume that
there are 98 barangays in Zamboanga City. We can draw a random sample of 10 barangays using
simple random sampling, and then a certain number of families from each of the 10 barangays may be
chosen.
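A two-stage draw like the one in the barangay example might look as follows in Python. Everything concrete here is an assumption made for illustration: the function name `cluster_sample`, the labels "Barangay 1" to "Barangay 98", the 50 numbered families per barangay, and the choice of 5 families per selected barangay.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Randomly choose clusters, then randomly choose members within each."""
    rng = random.Random(seed)
    # Stage 1: simple random sample of whole clusters (e.g. barangays).
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Stage 2: simple random sample of members inside each chosen cluster.
    return {name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
            for name in chosen}

# Hypothetical data: 98 barangays, each with families numbered 1 to 50.
barangays = {f"Barangay {b}": list(range(1, 51)) for b in range(1, 99)}
sample = cluster_sample(barangays, n_clusters=10, n_per_cluster=5, seed=0)
for name, families in sample.items():
    print(name, families)
```

Only the 10 selected barangays need to be visited, which is what makes the technique practical for a large population.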

2. Non-Probability Sampling

This is a sampling technique wherein the members of the sample are drawn from the population based
on the judgment of the researchers. The results of a study using this technique are relatively biased,
and the technique lacks objectivity of selection; hence, it is sometimes called subjective sampling.
However, it is convenient and economical.

a.) Convenience Sampling – as the name implies, convenience sampling is used because of the
convenience it offers to the researcher.

For example, a researcher who wishes to investigate the most popular noontime show may just
interview respondents by telephone; as a result, the opinions of those without a telephone will not
be included.

b.) Quota Sampling – in this type of sampling, the proportions of the various subgroups in the
population are determined, and the sample is drawn to have the same percentages in it. This is very
similar to stratified random sampling; the only difference is that the selection of the members is not
done randomly.

To illustrate, let us suppose that we want to determine teenagers' favourite brand of T-shirt.
If there are 1000 female and 500 male teenagers in the population and we want 150 members
in the sample, we can select 100 females and 50 males from the population without randomization,
so that the sample keeps the same 2:1 proportion as the population.

c.) Purposive Sampling – this is another type of non-probability sampling.

Ex. Suppose that the target is to find the effectiveness of a certain kind of shampoo. Of course, bald
fellows will not be included in the sample.

Determining Sample Size

To determine the sample size from a given population, Slovin's formula is used.

Slovin's formula:

$$n = \frac{N}{1 + N e^{2}}$$

where n = sample size

N = population size

e = margin of error

To illustrate, suppose we want to find the average age of the students in Zamboanga. However, due
to insufficient time, only the students in three particular schools were used to estimate the average
age. Obviously, the result is not the actual average but just an estimate; thus, there is usually an
error when we use a sample instead of the whole population.

Example:

A group of researchers will conduct a survey to find out the opinion of residents of a particular
community regarding the oil price hike. If there are 10000 residents in the community and the
researchers plan to use a sample using 10% margin of error, what should the sample size be?

Solution:

Here, N = 10000 and e = 10% or 0.1. Substituting the given values in the formula, we have:

$$n = \frac{N}{1 + N e^{2}} = \frac{10000}{1 + (10000)(0.1)^{2}} = \frac{10000}{1 + (10000)(0.01)} = \frac{10000}{101} \approx 99.01 \text{ or } 99$$

Hence, the researchers will conduct the survey using just 99 residents. Here, a 10% margin of error
is taken to mean that the researchers are 90% confident that the result obtained using the sample will
closely approximate the result using the entire population.
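The computation can be checked with a small Python sketch of Slovin's formula. The function name `slovin` is a hypothetical choice; the margin of error is passed as a decimal (0.10 for 10%).

```python
def slovin(population_size, margin_of_error):
    """Sample size from Slovin's formula: n = N / (1 + N * e**2)."""
    return population_size / (1 + population_size * margin_of_error ** 2)

n = slovin(10000, 0.10)
print(round(n, 2))  # 99.01
print(round(n))     # 99
```

Note that squaring e is the step most easily missed: with e = 0.1 the denominator is 1 + 10000(0.01) = 101, not 1 + 10000(0.1).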

Summation Notation

In the study of statistics, we shall be using mathematical symbols; one of the most common is the
summation notation, or simply summation (Σ).

Recall that variables are represented using capital letters. If our variable is age, then we can
represent this by X. Hence, if there are 40 students in a class, we can represent the age of the first
student by X1, the second student by X2, the third by X3, and so on. If we want to find the sum of
these ages, then we can write the sum in this way:

X1 + X2 + X3 + … + X40

To write the sum of n values or measurements in a simpler way, we use the summation notation:

$$\sum_{i=1}^{40} X_i$$

(read as “the summation of X sub i, where i varies from 1 to 40”)

Here i is the index of summation, and its value ranges from 1, the lower limit, to 40, the upper limit.
Observe also that when we write a sum of values in summation notation, we replace the subscripts
of the summed variable by an arbitrary subscript i and indicate the range of the summation in the
index. Thus,

$$\sum_{i=1}^{40} X_i = X_1 + X_2 + X_3 + \cdots + X_{40}$$
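The summation sign corresponds directly to a loop that adds one term per value of the index. The Python sketch below illustrates this with made-up ages (the list `ages` is hypothetical and not data from the notes); note that Python lists are indexed from 0, so X1 is stored at position 0.

```python
# Hypothetical ages X1, X2, ..., X40 of the 40 students (illustrative values).
ages = [15 + (i % 5) for i in range(40)]

# Explicit loop: i runs from the lower limit 1 to the upper limit 40.
explicit_total = 0
for i in range(1, 41):
    explicit_total += ages[i - 1]  # X_i is stored at list position i - 1

# The built-in sum() performs the same summation in one step.
total = sum(ages)

print(explicit_total, total)  # 680 680
```

Both results agree, since the loop and `sum()` are two ways of writing the same summation.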

Exercise 2

1.)Determine the sample size n.

a. N=5000, 5% margin of error

b. N= 8000, e = 10%

c. N = 10000, margin of error of 2.5%


2.) Expand the following expressions.

a. $\sum_{i=1}^{10} X_i^{3}$

b. $\sum_{i=2}^{6} X_i^{2}$

c. $\sum_{i=3}^{7} (\ldots + B_i)$

3.) Write the following in summation notation.

a. 2X3 + 2X4 + 2X5

b. (5 + X1) + (5 + X2) + (5 + X3) + … + (5 + X50)

4.) Given the following: X1 = 2, X2 = 4, X3 = 5, Y1 = 1, Y2 = 3, and Y3 = 7, find the sum.

a. $\sum_{i=1}^{3} X_i$

b. $\sum_{i=1}^{3} X_i Y_i$

Presenting and Describing Data

After data have been gathered and checked for possible errors, the next logical step is to present the
data in a manner that is easy to understand. The presentation should also convey the relevant
information and the important results at a glance.

Ungrouped data – data that are not organized, or, if arranged, are only ordered from lowest to
highest or highest to lowest.

Grouped data – data that are organized and arranged into different classes or categories.

3 Methods in Presenting Data

1.) Textual – the presentation is in narrative or paragraph form. The data are within the text
of the paragraph. This involves enumerating the important characteristics, giving emphasis to
significant figures, and identifying important features of the data. This may not get the immediate
interest of the reader. However, it can present a more comprehensive picture of the data because of
the further written explanation of its nature.

Example:

Nominally, the peso improved by 1.4% as of April 14, 2003 compared to its level in 2002, followed by
Thai baht, which gained 0.86%; Indonesian rupiah, 0.68%; and Taiwan dollar, 0.2%. Other currencies,
on the other hand, depreciated during the same period. The Singapore dollar fell 2.33%. The South
Korean won slid 2.14% while the Japanese yen dropped 0.61% (Phil. Daily Inquirer, April 17, 2003,
p. B2).

2.) Tabular – sometimes we can hardly grasp information from a textual presentation of data. Thus,
we may present data by using tables. By organizing data in tables, important features of the data
can be readily understood and comparisons can be easily made. Thus, a table shows complete
information regarding the data.

Parts of a Table

1. Heading: It includes the following:

a. Table number: this is for easy reference to the table.

b. Table title: it briefly explains the content of the table.

2. Box head/Column header: it describes the data in each column.

3. Stubs/Row classifier: it shows the classes or categories.

4. Body: this is the main part of the table.

5. Footnote/Source note: this is placed below the table only when the data written are not original;
that is, it indicates the source of the data.

A. Cross Tabulation Table

Table 1.1 Distribution of Religious Affiliation by Sex for Barangay Tibanga

                             Sex
Religion            Male     Female    Total
Roman Catholic      2,758    2,693     5,451
Islam                 113      126       239
Iglesia ni Cristo      82       79       161
Others                231      275       506
Total               3,184    3,173     6,357

Source: 1994 Iligan Census Summary Report

B. Frequency Distribution

A frequency distribution is a grouping of all observations into intervals or classes,
together with a count of the number of observations that fall in each interval or class.

Steps in constructing a frequency distribution

1. Find the range (R): R = highest value - lowest value

2. Estimate the number of classes, k, using either

a. k = √n

b. k = 1 + 3.322 log n

3. Estimate the width c of the interval (c = R/k). Round this off to the same number of
decimal places as the original set of data.

4. List the lower and the upper class limits of the first class. This interval should contain the lowest
observation in the data set.

5. List all the classes by adding the class width to the limits of the previous interval. The highest class
should contain the largest observation in the data set.

6. Tally the frequency for each class.

7. Compute the class marks and the class boundaries.

Class mark or class midpoint:

$$x_i = \frac{L_i + U_i}{2}$$

where L_i and U_i are the lower and upper class limits of the ith class.

Class Boundaries

Lower class boundary = lower class limit - (1/2 unit)

Upper class boundary = upper class limit + (1/2 unit)
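The seven steps can be traced in a short Python sketch. It is only an illustration under stated assumptions: the function name `frequency_distribution` and the 20 quiz scores are hypothetical, the data are whole numbers (so the unit is 1 and the half unit is 0.5), and the number of classes is estimated with k = 1 + 3.322 log n. Because the width is rounded, the table may end up with one class more or fewer than the estimated k.

```python
import math

def frequency_distribution(data):
    """Build a frequency table for whole-number data (unit of measurement = 1)."""
    n = len(data)
    lowest, highest = min(data), max(data)
    data_range = highest - lowest                  # Step 1: R = highest - lowest
    k = round(1 + 3.322 * math.log10(n))           # Step 2: estimated number of classes
    width = max(1, round(data_range / k))          # Step 3: c = R/k, rounded
    table = []
    lower = lowest                                 # Step 4: first class contains the lowest value
    while lower <= highest:                        # Step 5: add classes until the highest value is covered
        upper = lower + width - 1
        table.append({
            "limits": (lower, upper),
            "frequency": sum(lower <= x <= upper for x in data),  # Step 6: tally
            "class_mark": (lower + upper) / 2,                    # Step 7: midpoint
            "boundaries": (lower - 0.5, upper + 0.5),             # Step 7: half a unit beyond the limits
        })
        lower += width
    return table

# Hypothetical quiz scores of 20 students.
scores = [52, 55, 58, 60, 61, 63, 65, 66, 68, 70,
          71, 72, 74, 75, 77, 80, 82, 85, 88, 91]
for row in frequency_distribution(scores):
    print(row["limits"], row["frequency"], row["class_mark"], row["boundaries"])
```

For these scores, R = 39, k = 5, and c = 8, giving the classes 52-59, 60-67, 68-75, 76-83, and 84-91 with frequencies 3, 5, 6, 3, and 3.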

C. Stem and Leaf Plot

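For a small data set, the data may still be grouped into intervals without loss of information. A stem
and leaf plot is a table consisting of stems and leaves: each observation is split into a stem (its
leading digit or digits) and a leaf (its last digit).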
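A stem and leaf plot can be built with a few lines of Python. This is a sketch under assumptions not stated in the notes: the function name `stem_and_leaf` and the scores are hypothetical, the data are non-negative whole numbers, the stem is the tens part, and the leaf is the units digit.

```python
from collections import defaultdict

def stem_and_leaf(data):
    """Group whole numbers by stem (tens part) with their leaves (units digits)."""
    plot = defaultdict(list)
    for value in sorted(data):           # sorting keeps stems and leaves in order
        plot[value // 10].append(value % 10)
    return dict(plot)

# Hypothetical scores of 10 students.
scores = [52, 55, 58, 60, 61, 63, 71, 75, 75, 88]
for stem, leaves in stem_and_leaf(scores).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Every original value can be read back from the plot (stem 7 with leaves 1, 5, 5 stands for 71, 75, 75), which is why no information is lost.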

3.) Graphical

A. Bar chart and Histogram

A bar chart is a graph where the different classes are represented by rectangles or bars. The
width of each rectangle is the width of the interval represented by the class limits on the horizontal
axis, or the categories for nominal data. The length of the rectangle, which represents the frequency,
is drawn along the vertical axis.

A histogram is a graph which closely resembles the bar chart. A histogram employs the class
boundaries for the horizontal axis.

B. Frequency polygon

A frequency polygon is constructed by plotting the class marks against the frequency. To
complete the polygon, which is mathematically defined as a closed figure, an additional class mark is
added at the beginning and at the end of the distribution.
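The points of a frequency polygon, including the two extra class marks that close the figure, can be listed with a short Python sketch. The function name `polygon_points` and the class marks and frequencies used below are hypothetical, the classes are assumed to have equal width, and the two added class marks are given a frequency of zero so that the polygon returns to the horizontal axis.

```python
def polygon_points(class_marks, frequencies):
    """Return (class mark, frequency) points of a closed frequency polygon."""
    width = class_marks[1] - class_marks[0]   # equal class widths assumed
    # One extra class mark before the first class and one after the last,
    # each with zero frequency, so the figure is closed.
    marks = [class_marks[0] - width] + list(class_marks) + [class_marks[-1] + width]
    freqs = [0] + list(frequencies) + [0]
    return list(zip(marks, freqs))

# Hypothetical class marks and frequencies of five classes.
points = polygon_points([55.5, 63.5, 71.5, 79.5, 87.5], [3, 5, 6, 3, 3])
print(points)
```

Joining these points with straight lines in the order given produces the polygon.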

C. Frequency ogive

A cumulative frequency distribution can be represented graphically by a frequency ogive. An ogive
is obtained by plotting the upper class boundaries on the horizontal scale and the cumulative
frequency less than the upper class boundaries on the vertical scale.
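The points of a less-than ogive can be computed by accumulating the class frequencies. In the Python sketch below, the function name `ogive_points` and the boundaries and frequencies are hypothetical values chosen for illustration.

```python
from itertools import accumulate

def ogive_points(upper_boundaries, frequencies):
    """Pair each upper class boundary with the cumulative frequency below it."""
    # accumulate() yields the running totals of the class frequencies.
    return list(zip(upper_boundaries, accumulate(frequencies)))

# Hypothetical upper class boundaries and frequencies of five classes.
points = ogive_points([59.5, 67.5, 75.5, 83.5, 91.5], [3, 5, 6, 3, 3])
print(points)
# [(59.5, 3), (67.5, 8), (75.5, 14), (83.5, 17), (91.5, 20)]
```

The last cumulative frequency always equals the total number of observations, here 20.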

D. Pie chart and Pictograph
