
STATISTICS REFRESHER

THE IMPORTANCE OF STATISTICS


Statistics allow us to make sense of and interpret a great deal of
information. Consider the sheer volume of data you encounter in a
given day. How many hours did you sleep? How many students in
your class ate breakfast this morning? How many people live
within a one-mile radius of your home? By using statistics, we can
organize and interpret all of this information in a meaningful way.
In psychology, we are also confronted with enormous amounts of
data. How do changes in one variable impact other variables? Is
there a way we can measure that relationship? What is the overall
strength of that relationship and what does that mean? Statistics
allow us to answer these kinds of questions.
STATISTICS ALLOW US TO:
 Organize data: When dealing with an enormous amount of
information, it is all too easy to become overwhelmed. Statistics
allow psychologists to present data in ways that are easier to
comprehend. Visual displays such as graphs, pie charts, 
frequency distributions, and scatterplots allow researchers to get
a better overview of data and look for patterns they might
otherwise miss.
 Describe data: Think about what happens when researchers
collect a great deal of information about a group of people (for
example, the U.S. Census). Descriptive statistics provide a
way to summarize facts, such as how many men and women
there are, how many children there are, or how many people are
currently employed.
 Make inferences based on data: By using what's known as
inferential statistics, researchers can draw conclusions about a
population from the sample they have measured. Psychologists use
the data they have collected to test a hypothesis. Using statistical
analysis, researchers can judge whether the data provide enough
evidence to reject that hypothesis or not.
TWO TYPES OF STATISTICS

Descriptive Statistics: Descriptive statistics are used to
quantitatively summarize and describe the salient features of a
collection of data and information. For example, a study that uses
descriptive statistics will present information on the demographics
of the sample population.

Inferential Statistics: Inferential statistics provide ways of testing
the reliability of the findings of a study and "inferring"
characteristics from a small group of participants or people (your
sample) onto much larger groups of people (the population).
Descriptive statistics just describe the data, but inferential
statistics let you say what the data mean. An example of inferential
statistics is the analysis of variance (ANOVA).
PROPERTIES OF SCALES

Magnitude: Magnitude is the property of “moreness.” A scale has the
property of magnitude if we can say that a particular instance of
the attribute represents more, less, or equal amounts of the given
quantity than does another instance.

Equal Intervals: A scale has the property of equal intervals if the
difference between two points at any place on the scale has the same
meaning as the difference between two other points that differ by
the same number of scale units.

Absolute 0: An absolute 0 is obtained when nothing of the property
being measured exists.
SCALES OF MEASUREMENT

1. NOMINAL – IS THE SIMPLEST MEASUREMENT SCALE, AS IT IS ONLY
CONCERNED WITH CLASSIFYING DATA WITHOUT RESPECT FOR ORDER OR
EQUAL INTERVALS.

2. ORDINAL – CLASSIFIES AND ASSIGNS RANK-ORDER TO DATA. LIKERT-TYPE
SCALES, WHICH OFTEN RANK DEGREES OF SATISFACTION TOWARD A
PARTICULAR ISSUE, ARE AN EXAMPLE.

3. INTERVAL – INCLUDES ALL ORDINAL SCALE QUALITIES AND HAS
EQUIVALENT INTERVALS; THAT IS, INTERVAL SCALE MEASURES HAVE AN
EQUAL DISTANCE BETWEEN EACH POINT ON THE SCALE.

4. RATIO – THE MOST ADVANCED SCALE OF MEASUREMENT, AS IT PRESERVES
THE QUALITIES OF NOMINAL, ORDINAL AND INTERVAL SCALES AND HAS AN
ABSOLUTE ZERO.
Measurement Scale   Description                                 Examples
Nominal             Labels, categories; identifies groups of    Male, Female
                    people who share common attributes
Ordinal             Measure of magnitude                        1st Place, 3rd Place
Interval            Units are in equal intervals                IQ scores
Ratio               Units are in equal intervals, but with a    Height, Weight
                    meaningful zero value
PERMISSIBLE OPERATIONS
Level of measurement is important because it defines which mathematical operations we
can apply to numerical data. For nominal data, each observation can be placed in only
one mutually exclusive category. For example, you are a member of only one gender.
One can use nominal data to create frequency distributions (see the next section), but no
mathematical manipulations of the data are permissible. Ordinal measurements can be
manipulated using arithmetic; however, the result is often difficult to interpret because it
reflects neither the magnitudes of the manipulated observations nor the true amounts of
the property that have been measured. For example, if the heights of 15 children are rank
ordered, knowing a given child’s rank does not reveal how tall he or she stands. Averages
of these ranks are equally uninformative about height. With interval data, one can apply
any arithmetic operation to the differences between scores. The results can be interpreted
in relation to the magnitudes of the underlying property. However, interval data cannot
be used to make statements about ratios. For example, if IQ is measured on an interval
scale, one cannot say that an IQ of 160 is twice as high as an IQ of 80. This
mathematical operation is reserved for ratio scales, for which any mathematical
operation is permissible.
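
The two cautions above can be sketched in a few lines of Python. This is only an illustration: the heights, the variable names, and the 20-point shift are invented for the example and are not part of the original material. The first part shows that ranks keep order but hide how far apart the heights are; the second shows that moving the arbitrary zero point of an interval scale leaves differences unchanged but changes ratios.

```python
# Hypothetical heights (cm) for five children; values invented for illustration.
heights_cm = [120.0, 121.0, 122.0, 150.0, 151.0]

# Ordinal view: rank 1 = shortest. Ranks keep order but discard distances.
ranks = [sorted(heights_cm).index(h) + 1 for h in heights_cm]

# Adjacent ranks always differ by 1, whatever the real gap in height is.
rank_gap = ranks[3] - ranks[2]
height_gap_3_4 = heights_cm[3] - heights_cm[2]   # a large gap in cm
height_gap_1_2 = heights_cm[1] - heights_cm[0]   # a small gap in cm

# Interval view: two IQ scores, then the same scores on a scale whose
# zero point has been moved by an arbitrary (hypothetical) 20 points.
iq_a, iq_b = 160, 80
shift = 20
difference_before = iq_a - iq_b
difference_after = (iq_a + shift) - (iq_b + shift)
ratio_before = iq_a / iq_b
ratio_after = (iq_a + shift) / (iq_b + shift)

print(ranks, rank_gap, height_gap_3_4, height_gap_1_2)
print(difference_before, difference_after, ratio_before, ratio_after)
```

Because the difference survives the shift and the ratio does not, "twice as high" only has a stable meaning on a scale with an absolute zero.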
FREQUENCY DISTRIBUTIONS
A single test score means more if one relates it to other test scores. A distribution of
scores summarizes the scores for a group of individuals. In testing, there are many ways
to record a distribution of scores.
The frequency distribution displays scores on a variable or a measure to reflect how
frequently each value was obtained. With a frequency distribution, one defines all the
possible scores and determines how many people obtained each of those scores.
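
As a minimal sketch of that procedure, the Python below defines the possible scores and counts how many people obtained each one. The quiz scores and the 1-to-5 score range are assumptions made up for the example.

```python
from collections import Counter

# Hypothetical quiz scores for eight people on a 1-to-5 scale.
scores = [3, 5, 4, 5, 2, 5, 4, 3]
possible_scores = range(1, 6)

# Count how often each observed score occurs.
counts = Counter(scores)

# Include every possible score, even those nobody obtained (frequency 0).
frequency_distribution = {s: counts.get(s, 0) for s in possible_scores}

for score, frequency in frequency_distribution.items():
    print(score, frequency)
```

Listing the possible scores separately from the observed ones keeps a zero-frequency score (here, a score of 1) visible in the distribution.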

PERCENTILES
PERCENTILE RANKS
MEASURE OF CENTRAL TENDENCY
Describing distributions.

1. MEAN – IS THE ARITHMETIC AVERAGE OF A SET OF SCORES. IT IS
COMPUTED BY SUMMING THE VALUES OF ALL DATA POINTS AND DIVIDING BY
THE TOTAL NUMBER OF DATA POINTS.
• OUTLIER – IS AN EXTREME DATA POINT THAT DISTORTS THE MEAN.
2. Median – is the middle most score when the scores
are ordered from smallest to largest, or largest to
smallest. When the number of scores is even, one
takes the average of the two middle most scores.

3. MODE – IS THE MOST FREQUENTLY OCCURRING SCORE. IF TWO SCORES
ARE TIED AS THE MOST FREQUENTLY OCCURRING, THE DATA SET IS SAID TO
BE BIMODAL. IF MORE THAN TWO SCORES ARE TIED AS THE MOST FREQUENTLY
OCCURRING, IT IS MULTIMODAL.
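
The three measures can be checked with Python's standard `statistics` module. The scores below are invented for illustration; the value 100 is included as a deliberate outlier to show how it pulls the mean away from the median.

```python
from statistics import mean, median, multimode

# Hypothetical scores; 100 is a deliberate outlier.
scores = [2, 4, 4, 5, 7, 9, 100]

scores_mean = mean(scores)        # sum of scores / number of scores
scores_median = median(scores)    # middle score once ordered
scores_modes = multimode(scores)  # every score tied for most frequent

# The same data without the outlier, for comparison.
trimmed = [s for s in scores if s != 100]
trimmed_mean = mean(trimmed)
trimmed_median = median(trimmed)  # even count: average of two middle scores

# A data set with two scores tied for most frequent is bimodal.
bimodal_modes = multimode([1, 1, 2, 2, 3])

print(scores_mean, scores_median, scores_modes)
print(trimmed_mean, trimmed_median, bimodal_modes)
```

In this sketch the outlier moves the mean far more than the median, which is why the median is often preferred when extreme scores are present.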
MEASURE OF VARIABILITY

• ANSWERS THE QUESTION “HOW DISPERSED ARE SCORES FROM A MEASURE OF
CENTRAL TENDENCY?” IT IS THE AMOUNT OF SPREAD IN A DISTRIBUTION OF
SCORES OR DATA POINTS.

1. RANGE (R) – IS THE MOST BASIC INDICATOR OF VARIABILITY. IT IS
COMPUTED BY SUBTRACTING THE SMALLEST VALUE FROM THE LARGEST VALUE
AND ADDING 1 PLACE VALUE (FOR WHOLE-NUMBER SCORES, ADDING 1).
2. INTERQUARTILE RANGE – THE DIFFERENCE BETWEEN Q3 AND Q1
(Q3 - Q1).

INTERQUARTILE RANGE

Step 1. Put the numbers in order.
2, 4, 5, 6, 7, 9, 10

Step 2. Find the median.
(2, 4, 5) 6 (7, 9, 10)

Step 3. Find Q1 (median of the lower half of the data) and Q3 (median
of the upper half of the data).
(2, 4, 5) 6 (7, 9, 10)   Q1 = 4   Q3 = 9

Step 4. Subtract Q1 from Q3.
9 - 4 = 5
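
The range and the four interquartile steps can be sketched in Python as follows. The function names are hypothetical, and the quartile rule is the one used in the worked example (the median is left out of both halves when the number of scores is odd); other textbooks and software use slightly different quartile rules.

```python
from statistics import median

def inclusive_range(values, unit=1):
    """Largest value minus smallest value, plus one unit of place value."""
    return max(values) - min(values) + unit

def interquartile_range(values):
    """Return (Q1, Q3, Q3 - Q1) using the median-of-halves rule."""
    ordered = sorted(values)                 # Step 1: put the numbers in order
    half = len(ordered) // 2                 # Step 2: locate the median
    lower = ordered[:half]                   # lower half, median excluded
    if len(ordered) % 2:
        upper = ordered[half + 1:]           # odd count: skip the median itself
    else:
        upper = ordered[half:]               # even count: plain upper half
    q1, q3 = median(lower), median(upper)    # Step 3: medians of the halves
    return q1, q3, q3 - q1                   # Step 4: subtract Q1 from Q3

data = [2, 4, 5, 6, 7, 9, 10]
q1, q3, iqr = interquartile_range(data)
data_range = inclusive_range(data)
print(q1, q3, iqr, data_range)
```

With the seven numbers from the worked example, this reproduces Q1 = 4, Q3 = 9 and an interquartile range of 5.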

3. STANDARD DEVIATION – THE MOST FREQUENTLY REPORTED INDICATOR OF
VARIABILITY FOR INTERVAL OR RATIO DATA. THE LARGER THE STANDARD
DEVIATION, THE GREATER THE VARIABILITY IN SCORES.
4. VARIANCE – THE FINAL MEASURE OF VARIABILITY; IT IS THE STANDARD
DEVIATION SQUARED.
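
A short Python sketch of the relationship between the two measures, using the standard `statistics` module. The scores are invented, and the population formulas (`pstdev`, `pvariance`) are an assumption made for the example; the sample versions (`stdev`, `variance`) divide by n - 1 instead of n.

```python
from statistics import pstdev, pvariance

# Hypothetical scores with a mean of 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Population variance: average squared deviation from the mean.
variance = pvariance(scores)

# Population standard deviation: square root of the variance.
standard_deviation = pstdev(scores)

# A more spread-out set of scores gives a larger standard deviation.
wider_scores = [0, 2, 3, 4, 6, 7, 8, 10]
wider_standard_deviation = pstdev(wider_scores)

print(variance, standard_deviation, wider_standard_deviation)
```

Squaring the standard deviation recovers the variance, which is the relationship stated in item 4.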

Skewness
• Refers to an asymmetrical distribution with data points that
do not cluster symmetrically around a mean; some
distributions may have scores or data points that cluster
toward the lower end or the higher end of the distribution.
• Skewness is directly related to the measures of central tendency.
Positively Skewed: Mode < Median < Mean

Negatively Skewed: Mean < Median < Mode
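
The orderings above can be checked on a small made-up data set. In this Python sketch the scores are hypothetical: the first list has a long tail toward high values (positive skew), and the second is its mirror image (negative skew). Real data do not always follow these orderings exactly, so treat this as an illustration of the typical pattern.

```python
from statistics import mean, median, mode

# Hypothetical scores with a long tail toward the high end.
positive = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 8, 12]

# Mirror image of the same scores: long tail toward the low end.
negative = [13 - x for x in positive]

pos_mode, pos_median, pos_mean = mode(positive), median(positive), mean(positive)
neg_mode, neg_median, neg_mean = mode(negative), median(negative), mean(negative)

print(pos_mode, pos_median, pos_mean)   # mode < median < mean
print(neg_mean, neg_median, neg_mode)   # mean < median < mode
```

The mean is pulled furthest toward the tail, the mode stays at the peak, and the median falls between them.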
KURTOSIS

• INDICATOR OF THE SHAPE OF A DATA DISTRIBUTION. IT IS DERIVED FROM
THE GREEK WORD REFERRING TO “PEAKEDNESS.” THE HEIGHT OF THE
DISTRIBUTION PROVIDES ONE WITH A LOT OF INFORMATION ABOUT HOW DATA
POINTS ARE CLUSTERED. THE MORE DATA POINTS OR SCORES ARE CLUSTERED
AROUND A MEAN, THE MORE PEAKED THE DISTRIBUTION. THE FURTHER SCORES
ARE DISPERSED FROM THE MEAN, THE FLATTER THE DISTRIBUTION.

• THREE GENERAL SHAPES OF DISTRIBUTIONS ARE MESOKURTIC (NORMAL
CURVE), LEPTOKURTIC (TALL AND THIN), AND PLATYKURTIC (FLAT AND
WIDE).
Numeric Values:

Mesokurtic or normal distribution (kurtosis value of 0, or technically from -1.00 to +1.00)
Leptokurtic distribution (kurtosis value of > 1.00)
Platykurtic distribution (kurtosis value of < -1.00)
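
A rough Python sketch of how these cut-offs could be applied. It assumes the values above refer to excess kurtosis (the fourth standardized moment minus 3, so that a normal curve scores 0), computed here with the simple population formula; statistical packages often apply small-sample corrections, so their output may differ. The function names and both data sets are invented for the example.

```python
def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (population formula)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n   # second central moment
    m4 = sum((x - mean) ** 4 for x in values) / n   # fourth central moment
    return m4 / m2 ** 2 - 3.0

def classify_kurtosis(k):
    """Apply the -1.00 / +1.00 cut-offs listed above."""
    if k > 1.0:
        return "leptokurtic"
    if k < -1.0:
        return "platykurtic"
    return "mesokurtic"

# Hypothetical flat data: every score occurs equally often.
flat = [1, 2, 3, 4, 5]
# Hypothetical peaked data: most scores sit at the mean, two sit far away.
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]

flat_k = excess_kurtosis(flat)
peaked_k = excess_kurtosis(peaked)
print(flat_k, classify_kurtosis(flat_k))
print(peaked_k, classify_kurtosis(peaked_k))
```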
SCATTER DIAGRAM

• A SCATTER DIAGRAM IS A PICTURE OF THE RELATIONSHIP BETWEEN TWO
VARIABLES.
REGRESSION

• Regression is used to make predictions about scores on one
variable from knowledge of scores on another variable. These
predictions are obtained from the regression line, which is
defined as the best-fitting straight line through a set of points
in a scatter diagram. It is found by using the principle of least
squares, which minimizes the sum of squared deviations around the
regression line.
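
The least-squares idea can be sketched in plain Python. The function name, the study-hours data, and the prediction at 6 hours are all invented for illustration; the slope and intercept use the standard closed-form least-squares solution for a single predictor.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x   # line passes through the two means
    return slope, intercept

# Hypothetical data: hours studied and test score for five students.
hours = [1, 2, 3, 4, 5]
test_scores = [52, 55, 61, 64, 68]

slope, intercept = fit_line(hours, test_scores)

# Predict the score of a student who studies 6 hours.
predicted = intercept + slope * 6
print(slope, intercept, predicted)
```

Any other straight line through these points would give a larger sum of squared vertical distances between the observed and predicted scores.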
STATISTICAL TOOLS

• CORRELATION COEFFICIENT – PROVIDES INFORMATION ABOUT THE
RELATIONSHIP BETWEEN TWO VARIABLES. CORRELATIONS INDICATE THREE
THINGS: WHETHER THERE IS A RELATIONSHIP AT ALL, THE DIRECTION OF THAT
RELATIONSHIP AND THE STRENGTH OF THE RELATIONSHIP.
• INDEPENDENT T-TEST – INVOLVES COMPARING TWO INDEPENDENT GROUPS ON
ONE DEPENDENT VARIABLE.
• DEPENDENT T-TEST – INVOLVES SIMILAR GROUPS PAIRED OR MATCHED IN
SOME MEANINGFUL WAY OR THE SAME GROUP TESTED TWICE.
• ANALYSIS OF VARIANCE (ANOVA) – INVOLVES HAVING AT LEAST ONE
INDEPENDENT VARIABLE IN A STUDY WITH THREE OR MORE GROUPS OR LEVELS.
• CHI-SQUARE – USED WITH TWO OR MORE CATEGORICAL (NOMINAL)
VARIABLES, WHERE EACH VARIABLE CONTAINS AT LEAST TWO CATEGORIES.
• FACTOR ANALYSIS – IS USED TO REDUCE A LARGER NUMBER OF VARIABLES
TO A SMALLER NUMBER OF GROUPS, OR FACTORS.
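
As an illustration of the first tool in the list, the sketch below computes a Pearson correlation coefficient by hand in Python. Pearson's r is assumed here as the specific coefficient, and the function name and data are invented for the example. The sign of the result gives the direction of the relationship and its distance from 0 gives the strength.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied and test score for five students.
hours = [1, 2, 3, 4, 5]
test_scores = [52, 55, 61, 64, 68]

r = pearson_r(hours, test_scores)

# Reversing one variable flips the direction but not the strength.
r_reversed = pearson_r(hours, test_scores[::-1])
print(r, r_reversed)
```

A value near +1 or -1 indicates a strong relationship, and a value near 0 indicates little or no linear relationship.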
OTHER

• SPEARMAN’S RHO
• BISERIAL CORRELATION
• RESIDUAL
• STANDARD ERROR OF MEASUREMENT

• COEFFICIENT OF DETERMINATION
