statistics


Concepts

Definition of Terms

Statistics refers to the science that deals with the collection,
tabulation or presentation, analysis, and interpretation of numerical
or quantitative data. Collection of data refers to the process of
obtaining numerical measurements. Tabulation or presentation of data
refers to the organization of data into tables, graphs, or charts so
that logical and statistical conclusions can be derived from the
collected measurements. Analysis of data pertains to the process of
extracting from the given data relevant information from which a
numerical description can be formulated. Interpretation of data refers
to the task of drawing conclusions from the analyzed data. It also
normally involves the formulation of forecasts or predictions about
larger groups based on the data collected from a small group.

Population, or the universe, refers to the collection of all elements
under study or under consideration. A small part of this big group is
called a sample.

Example of population and sample: the graduate students of NEUST are an
example of a population, while the students in the MPA program are a
sample.

Using the language of mathematics, the universal set is the population
while a subset is a sample. Hence, if the universal set is the set of
counting numbers, the set of even numbers is a subset, and so is the
set of odd numbers.

A population can be finite or infinite. The population of a certain
school in a particular term is finite, while the population consisting
of all possible outcomes (heads, tails) in successive tosses of a coin
is infinite.

A parameter refers to a numerical characteristic of the population,
such as the population mean, population standard deviation, population
variance, and many more. It is usually unknown and is estimated by a
corresponding statistic computed from the sample data. Thus, the
population mean is estimated by the sample mean, the population
standard deviation by the sample standard deviation, the population
variance by the sample variance, etc. The mean weight of a sample of
100 sophomore students selected from the entire population of
sophomore students in a certain high school is a statistic. The mean
weight of all students comprising the population is a parameter, which
is estimated by the sample mean weight of the sophomore students.

Generally, the characteristics of a population are called

parameters, while the characteristics of a sample are

called statistics.
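The distinction can be sketched in a few lines of Python. The snippet below is only an illustration: the population of weights is simulated, and the population size, the seed, and the variable names are assumptions, not data from this lesson.

```python
import random
import statistics

# Hypothetical population: weights (kg) of 1,000 sophomore students.
# The values are simulated for illustration only; they are not real data.
rng = random.Random(42)
population = [round(rng.gauss(50, 6), 1) for _ in range(1000)]

# Parameters: characteristics of the whole population
# (usually unknown in practice).
mu = statistics.mean(population)
sigma = statistics.pstdev(population)   # population standard deviation

# Statistics: the same characteristics computed from a sample of n = 100.
sample = rng.sample(population, 100)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)            # sample standard deviation (n - 1 divisor)

print(f"N = {len(population)}, mu = {mu:.2f}, sigma = {sigma:.2f}")
print(f"n = {len(sample)}, x_bar = {x_bar:.2f}, s = {s:.2f}")
```

The sample mean will generally be close to, but not exactly equal to, the population mean; it is an estimate of the parameter.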

The following symbols are used for parameters and statistics in most
statistical writing:

Characteristic               Parameter     Statistic
Mean                         μ (mu)        x̄ (x-bar)
Standard Deviation           σ (sigma)     s
Variance                     σ²            s²
Proportion                   P             p
Pearson Correlation Coef.    R             r
Number of Cases              N             n

Variables

A variable is one of the basic concepts in statistics. It refers to an
observable characteristic or phenomenon of a person or object whereby
the members of the group or set vary or differ from one another. A
variable is a symbol such as X, Y, Z, a, b, c, etc. which can assume
any of a prescribed set of values, called the domain of the variable.
If the variable can assume only one value, it is called a
constant. (e.g. - )


A variable that can theoretically assume any value between two given
values is called a continuous variable; otherwise, it is called a
discrete variable.

Example: the number of houses in a community is a discrete variable. It
can take any of the values 0, 1, 2, 3, etc. but cannot be 1.5, 3.34,
4.624, etc.

The weight of an individual, which can be 45.3 kg, 50.50 kg, 70.345 kg,
etc., depending on the accuracy of measurement, is a continuous
variable.

In general, measurement gives rise to

continuous data while enumeration or counting

gives rise to discrete data.

Dependent and Independent Variables

Variables can be grouped into dependent and independent variables with
respect to their use. An independent variable is used as a predictor if
the objective is to predict the value of one variable on the basis of
the other. In contrast, the dependent variable is the variable whose
value is predicted. To illustrate, if we want to predict the students'
academic achievement in mathematics, we may analyze different factors
such as gender, study habits, intelligence quotient, interest,
attitudes, socioeconomic status, and many more. Hence, the independent
variables are gender, study habits, intelligence quotient, interest,
attitudes, and socioeconomic status. On the other hand, the dependent
variable is the students' academic achievement in mathematics.
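As a rough sketch of prediction with one independent variable, the Python snippet below fits a least-squares line. Simple linear regression is used here only as one possible technique; the lesson does not prescribe it. The study hours and scores are made-up numbers, and weekly study hours stand in for "study habits" purely as an assumption.

```python
# Hypothetical data: weekly study hours (independent variable) and
# mathematics scores (dependent variable). Illustrative values only.
hours = [2, 4, 6, 8, 10]
scores = [55, 62, 70, 77, 86]

n = len(hours)
mean_x = sum(hours) / n
mean_y = sum(scores) / n

# Least-squares slope b and intercept a for the line: score = a + b * hours
s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(hours, scores))
s_xx = sum((x - mean_x) ** 2 for x in hours)
b = s_xy / s_xx
a = mean_y - b * mean_x

# Predict the dependent variable for a student who studies 7 hours a week.
predicted = a + b * 7
print(f"score = {a:.2f} + {b:.2f} * hours; predicted at 7 hours: {predicted:.2f}")
```

The independent variable is the input to the fitted line, and the dependent variable is the value the line predicts.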

Uses of Statistics

According to Ary and Jacobs (1976), statistics is a body of scientific
methods for analyzing quantitative data. Statistics serves two
functions: (1) it aids the scientist in organizing, summarizing,
interpreting, and communicating quantitative information obtained from
observations, and (2) it allows the scientist to extrapolate from the
data to reach tentative conclusions about the larger group from which
the smaller group was derived. The statistical procedures dealing with
the first function are generally called descriptive statistics
(gathering, classification, presentation of data, and collection of
summarizing values), while the procedures dealing with the second
function are called inferential statistics (critical judgement and
mathematical methods).

Types of Data

The choice of statistical tools depends on the types of data that are
collected. The different types are as follows:

Primary and Secondary Data

Primary data refer to information that is gathered directly from the
original source or that is based on direct or first-hand experience
(e.g., autobiographies, diaries, etc.). Secondary data refer to
information taken from published or unpublished data that were
previously gathered by other individuals or agencies (e.g., books,
magazines, newspapers, etc.).

Qualitative and Quantitative Data

Qualitative data are categorical data, which take the form of
categories or attributes (e.g., sex, year level, religion, etc.). On
the other hand, quantitative data or numerical data are obtained from
measurements (e.g., height, weight, age, scores, etc.).

Measurement Scales

Qualitative data can be converted to quantitative data through the
process called measurement. In measurement, numbers are used to code
objects so that they can be treated statistically. There are four types
of measurements. They are as follows:

Nominal Measurements. Nominal measurements are used only for
identification or classification purposes. Example: student numbers,
names of books, vehicle plate numbers, etc.

Ordinal Measurements. Ordinal measurements do not only classify items.

They also give the order of classes, items or objects. Example: first runner-up,

second runner-up, third runner-up, etc.

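Interval Measurements. In interval measurements, numbers are assigned
to the items or objects so that equal differences between numbers
represent equal differences in the property measured, although the zero
point is arbitrary. They measure the degree of difference between any
two classes. Example: temperature in degrees Celsius, IQ, test scores,
etc. (Weight and height have a true zero point and are therefore also
measured on the ratio scale.)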

Ratio Measurements. For ratio measurements, the ratio of the numbers
assigned in the measurements shows the ratio in the amount of the
property being measured. Multiplication and division have meaning in
ratio measurements. Example: if Boris is 40 years old and Morgana is 20
years old, then their ages may be expressed in the ratio 2:1 (two is to
one).

Sampling Techniques

It is not necessary for the researcher to examine every member of the
population to get data or information about the population. Cost and
time constraints often prohibit one from undertaking a study of the
entire population. Sampling techniques are used so that valid
conclusions or inferences about the population can be drawn from the
sample.

Random Sampling. What is random sampling? Random sampling is a method
of selecting a sample of size n from a population or universe such that
each member of the population has an equal chance of being selected in
the sample and all possible combinations of size n have an equal chance
of being selected as the sample.
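A minimal Python sketch of such a draw is shown below. The population of ID numbers, the sample size, and the seed are assumptions made for the example.

```python
import random

# Hypothetical sampling frame: ID numbers of N = 20 population members.
population = list(range(1, 21))

rng = random.Random(7)   # fixed seed so the draw is repeatable
n = 5                    # desired sample size

# random.sample draws n distinct members without replacement; every
# member, and every combination of n members, is equally likely.
sample = rng.sample(population, n)
print(sample)
```

Because the draw is made without replacement, no member can appear in the sample twice.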

Stratified Random Sampling. In this method the population is first
divided into groups (strata) based on homogeneity, and a random sample
is then drawn from each stratum. This avoids the possibility of drawing
samples whose members come from only one stratum.
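The idea can be sketched in Python as follows. The helper name `stratified_sample`, the student list, and the choice of drawing the same fraction from every stratum (proportional allocation) are all assumptions made for illustration; other allocation rules are possible.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, rng):
    """Draw the same fraction at random from every stratum."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))  # at least one per stratum
        sample.extend(rng.sample(members, k))
    return sample

def year_level(i):
    # Hypothetical strata: 30 first-year, 20 second-year, 10 third-year students.
    return "1st year" if i < 30 else "2nd year" if i < 50 else "3rd year"

students = [(f"S{i:02d}", year_level(i)) for i in range(60)]
picked = stratified_sample(students, lambda s: s[1], 0.10, random.Random(1))
print(picked)
```

With a 10% fraction, the sketch draws 3, 2, and 1 students from the three year levels, so every stratum is represented.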

Cluster Sampling. It is an advantageous procedure when the population
is spread out over a wide geographical area. It is also a practical
sampling technique when a complete list of the members of the
population is not available. A cluster refers to an intact group whose
members share common characteristics.
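A small Python sketch of one-stage cluster sampling follows. The section names and member labels are hypothetical, and including every member of each chosen cluster is one common variant assumed here.

```python
import random

# Hypothetical clusters: intact groups (e.g., class sections) and their members.
clusters = {
    "Section A": ["A1", "A2", "A3"],
    "Section B": ["B1", "B2"],
    "Section C": ["C1", "C2", "C3", "C4"],
    "Section D": ["D1", "D2", "D3"],
}

rng = random.Random(3)
chosen = rng.sample(sorted(clusters), 2)   # randomly select whole clusters

# One-stage cluster sampling: every member of a chosen cluster is included.
sample = [member for name in chosen for member in clusters[name]]
print(chosen, sample)
```

Only a list of clusters is needed to make the draw, not a list of every individual member of the population.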

Methods Used in the Collection of Data

1. Direct or Interview Method

This is a method of person-to-person exchange between the interviewer
and the interviewee.

The following are the advantages of the direct or interview method:

1.

2.

influence the respondent's answer through his facial

expressions, tone of voice, or wording of the questions.

3.

responses if their expected or desired responses are not

obtained.

2. Questionnaire Method

The questionnaire method is one of the easiest methods of data
gathering. In this method, written responses are given to prepared
questions. A questionnaire is a list of questions intended to elicit
answers to the problems of a study. It should be attractive and may
include illustrations, pictures, and sketches. Its contents, especially
the directions, must be precise, clear, and self-explanatory.

3. Registration Method

This method of gathering information is enforced by certain laws.
Examples are the registration of births, deaths, motor vehicles,
marriages, and licenses. The advantage of this method is that
information is kept systematized and made available to all because of
the requirement of the law.

4. Observation Method

Observation method is utilized to gather

data regarding attitudes, behavior, values,

and cultural patterns of the sample under

study. It is usually used when the subjects

cannot talk or write.

5. Experiment Method

An experiment is applied to collect data if

the investigator wants to control the factors

affecting the variable being studied.

Methods of Presenting Data

Collected data are of little use if they are not presented effectively
for analysis and interpretation. Data are presented in four general
methods: [1] textual method, [2] tabular method, [3] semi-tabular
method, and [4] graphical method or presentation.

Frequency Distribution

When the researcher gathers all

the needed data, the next task is to

organize and present them with the

use of appropriate tables and graphs.

Frequency distribution is one system

used to facilitate the description of

important features of the data.

Class Interval - is a grouping of values that defines a class; it has
a lower limit and an upper limit.

Class Boundaries - if heights are recorded to the nearest inch, the
class interval 60-62 theoretically includes all measurements from
59.5000 to 62.5000 in. These numbers, indicated briefly by the exact
numbers 59.5 and 62.5, are the class boundaries, or the true class
limits; the smaller number [59.5] is the lower class boundary, and the
larger number [62.5] is the upper class boundary.

Class Mark - is the midpoint or middle of a class interval. It is
obtained by finding the average of the lower class limit and the upper
class limit. Example: the class mark of the class interval 5-9 is
[5 + 9]/2 or 7.

Class Size - refers to the difference between the upper class boundary
and the lower class boundary of a class interval.

Class Frequency - means the number of observations belonging to a class
interval.
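The terms above can be tied together with a short Python sketch. The list of heights is made up for illustration, and the half-unit adjustment for the boundaries assumes the data were recorded to the nearest whole inch, as in the 60-62 example.

```python
# Hypothetical heights recorded to the nearest inch (illustrative data only).
heights = [60, 61, 61, 62, 63, 64, 64, 65, 66, 66, 67, 68, 68, 69, 70, 71]

# Class intervals given by their class limits (lower, upper).
class_limits = [(60, 62), (63, 65), (66, 68), (69, 71)]

rows = []
for lower, upper in class_limits:
    lower_b, upper_b = lower - 0.5, upper + 0.5        # class boundaries
    mark = (lower + upper) / 2                          # class mark (midpoint)
    size = upper_b - lower_b                            # class size
    freq = sum(lower <= h <= upper for h in heights)    # class frequency
    rows.append((lower, upper, lower_b, upper_b, mark, size, freq))

for lower, upper, lower_b, upper_b, mark, size, freq in rows:
    print(f"{lower}-{upper}  {lower_b}-{upper_b}  mark={mark}  size={size}  f={freq}")
```

The class frequencies add up to the total number of observations, which is a quick check that every value fell into exactly one class.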

Graphical Presentation of Data

Histogram - is made up of vertical bars that are joined together,
making it an appropriate graph for continuous data. The base of each
bar or rectangle spans the class boundaries of its class interval, and
its height corresponds to the class frequency.

Frequency Polygon - is commonly called a linear graph. It is a very
useful device for showing changes in values over successive periods of
time. An advantage of the frequency polygon is that it can be used to
compare two or more distributions graphically on one pair of axes.

Bar Graph - is used to represent discrete data, where the bars are
separated. The width of the bars is arbitrary; however, all bars must
be of the same width. Thus, the bar graph is much like the histogram;
the only difference is that the bars of the histogram are joined.

Pie Diagram or Pie Chart - is used to show percentage distribution. It
is made up of a circle subdivided into sectors proportional in size to
the quantities or percentages they represent.
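As a small illustration of how sector sizes follow from the data, the Python sketch below converts counts into percentages and central angles. The category names and counts are hypothetical.

```python
# Hypothetical category counts (illustrative only).
counts = {"Primary": 30, "Secondary": 50, "Tertiary": 20}
total = sum(counts.values())

# Each sector's share of the circle equals its share of the total.
percents = {name: 100 * count / total for name, count in counts.items()}
angles = {name: 360 * count / total for name, count in counts.items()}

for name in counts:
    print(f"{name:<10} {percents[name]:5.1f}%  {angles[name]:6.1f} degrees")
```

The central angles always add up to 360 degrees, just as the percentages add up to 100.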

Types of Frequency Curves

1. The symmetrical or bell-shaped frequency curves are characterized by
the fact that observations equidistant from the central maximum have
the same frequency. An important example is the normal curve.

2. In J-shaped and reversed J-shaped frequency curves, a maximum occurs
at one end.

3. In the moderately asymmetrical or skewed frequency curves, the tail

of the curve to one side of the central maximum is longer than that to

the other. If the longer tail occurs to the right, the curve is said to be

skewed to the right or have positive skewness, while if the reverse is

true, the curve is said to be skewed to the left or have negative

skewness.

4. A U-shaped frequency curve has maxima at both ends.

5. A bimodal frequency curve has two maxima.

6. A multi-modal frequency curve has more than two maxima.
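The direction of skewness described in item 3 can be checked numerically. The Python sketch below uses the moment coefficient of skewness, which is one common measure and is an assumption here, since the lesson does not name a formula. The data sets are made up for illustration.

```python
def skewness(data):
    """Moment coefficient of skewness: m3 / m2**1.5 (population form)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in data) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical data sets (illustrative only).
right_tail = [1, 2, 2, 3, 3, 3, 4, 10]   # longer tail to the right
left_tail = [-x for x in right_tail]     # mirror image: longer tail to the left
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_tail), ("left", left_tail), ("symmetric", symmetric)]:
    print(name, round(skewness(data), 3))
```

A positive value indicates a longer right tail, a negative value a longer left tail, and a value of zero is consistent with a symmetrical distribution.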

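Illustration: sketches of the frequency curves described above
(symmetrical or bell-shaped, skewed to the right with positive
skewness, skewed to the left with negative skewness, J-shaped, reversed
J-shaped, U-shaped, bimodal, and multi-modal).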

