
RIZAL TECHNOLOGICAL UNIVERSITY

Cities of Mandaluyong and Pasig

Statistics Refresher

Student Name
Section
Schedule
Program
College
Professor JOAN B. MARASIGAN
Module Duration
SESSION NO. / WEEK NO. 4

MODULE NO.: STATISTICS REFRESHER (PART 1)

TOPICS:
1. Scales of measurement
2. Descriptive Statistics
3. The normal curve
4. Standard scores

Overview

This module is designed to guide you in recalling and reorganizing what you learned in previous
subjects that covered statistical concepts. Knowledge of at least some basic statistical concepts
is necessary to facilitate your understanding of the science of measuring psychological constructs.
Here, the scales of measurement (nominal, ordinal, interval, ratio) will be reintroduced
with emphasis on measuring psychology-related data and variables. You will also be reoriented on
describing psychology-related data, including measures of central tendency. The concepts of the
normal curve and standard scores will be introduced to equip you with the skills to give meaning
to test scores and to interpret them appropriately.
This module dwells more on descriptive statistical concepts in psychological
measurement. It is expected that the knowledge from this module will help you further understand
how we quantify human behavior.
Study Guide

• You can complete this module at your own pace. This means you can study each part
as your capacity allows.
• This module starts with a brief overview of the topic. It also states the learning
outcomes we want you to accomplish after completing this module. The topic presentation
lists resources you can check for further reading and clarification.
• Requirements involve learning activities and assessments, which are available in the last
part of the module. Take time to study each lesson carefully so that you can apply these
new learnings appropriately.

Learning Outcomes

1. Identify and differentiate types of data.
2. Differentiate descriptive and inferential statistics.
3. Become familiar with the concept of the normal curve and its relevance to psychological
testing and assessment.
4. Become familiar with the meaning of a score.
5. Convert scores into different forms.
Topic Presentation

Importance of Statistics in Psychological Testing and Assessment


You have encountered countless tests throughout your entire stay in school, from
kindergarten to college. Chances are, at some point in your life, you have answered
tests that measured your academic ability (school, scholarship, NAT, NCAE), personality
tests, interest tests, employment exams, or a random online pop-psychology test.
Your role for most of your life has been that of a “test-taker”. Things change once you shift
your role from “test-taker” to “test administrator”. One day, you may be a student-researcher who
will create and administer a test to selected respondents; an HR practitioner who will administer
tests to applicants; or a teacher giving a midterm exam to your students.
Test scores, and how they are obtained, analyzed, and interpreted, often remain a mystery to
people without a background in testing and assessment. Contrary to the popular notion that it is
simply a matter of counting the number of correct answers to arrive at a decision to pass or fail,
the process is far more complex, yet systematic.
Test scores are frequently expressed as numbers, and statistical tools are used to
describe, make inferences from, and draw conclusions about numbers. Knowledge of
psychological statistics also helps us in clinical decision making when we interpret quantitative
assessment tools.
LEVELS OF MEASUREMENT
Humans are obsessed with measurements. We love measuring everything: the size
of our shoes, our body parts, the dimensions of our house, the distance from our home to school,
the depth of a swimming pool, and so on.
You have probably also stepped on a weighing scale to measure your
weight. While physical properties such as time, distance, weight, length, and temperature
can be measured using standardized tools such as a weighing scale or a thermometer, it is difficult
to measure psychological constructs such as happiness, anxiety, magnitude of negative
experiences, and interest. Obviously, you can’t use a weighing scale or a thermometer to measure
your happiness. The table below compares the measurement of physical and psychological
concepts:
Table 1. Analogy/comparison between measuring physical and psychological variables. It is
difficult to measure psychological variables because there is no universal unit of measure or tool that
can be used.

Occupation | Variable being Measured | Unit of Measurement | Tools used to Measure
Engineer | Distance | Meters | Meter stick
Nurse | Temperature | Celsius | Thermometer
Dietician | Weight | Kilograms | Weighing scale
Psychometrician | Happiness | High/Low (Level) | Test/Questionnaire

Measurement (from Greek “metron”; Filipino “Pagsukat”) is the act of assigning numbers or
symbols to characteristics of things (people, events, whatever) according to rules. The rules
used in assigning numbers are guidelines for representing the magnitude (or some other
characteristic) of the object being measured (Cohen & Swerdlik, 2018). Below are examples of
measurement rules:

Fig 1: It is assigned that the distance between 0-1 inch constitutes
one unit of measurement (inch), same as from 1-2 inches.

Fig 2: It is assigned that the distance between each Likert response anchor is
one unit of measurement. So, we can arbitrarily assign “Strongly Agree” with
a value of 4, “Agree” with 3, “Disagree” with 2, and “Strongly Disagree” with 1.
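
The rule in Fig 2 can be written down as a simple lookup. The following is only a minimal sketch in Python: the name `LIKERT_VALUES`, the function `score_responses`, and the sample answers are hypothetical and are not part of the module.

```python
# Coding rule from the Fig 2 example: each response anchor gets a number.
LIKERT_VALUES = {
    "Strongly Agree": 4,
    "Agree": 3,
    "Disagree": 2,
    "Strongly Disagree": 1,
}


def score_responses(responses):
    """Convert a list of Likert response anchors into their assigned numbers."""
    return [LIKERT_VALUES[answer] for answer in responses]


# Made-up answers of one respondent to a three-item questionnaire.
answers = ["Agree", "Strongly Agree", "Disagree"]
item_scores = score_responses(answers)
total_score = sum(item_scores)
print(item_scores, total_score)
```

Whether adding these numbers into a total is meaningful depends on the level of measurement assumed for the Likert scale, which is discussed in the next section.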

Variables. Before we can begin to describe data, we need to decide what sort of data we have.
This seems like a very obvious thing to say, but it is easy to make mistakes. Different sorts of data
need to be summarized in different ways. When we measure something, we are assigning
numbers to individuals (where an individual is usually, but not always, a person). A measurement
is usually called a variable. A variable is anything that can vary (or change) between individuals
(Miles & Banyard, 2007).

Scale (Filipino “Panukat”) is a set of numbers (or other symbols) whose properties model
empirical properties of the objects to which the numbers are assigned. A Scale can be classified
as Nominal, Ordinal, Interval, and Ratio.

Table 3. Summary of the Types of Variables and Scales

TYPES OF VARIABLE | TYPES OF DATA/SCALE | EXAMPLE
Categorical measures: Categorical measures are qualitative or classification variables. | NOMINAL: where there are two or more possible categories, but there is no natural order to the categories. Words are used instead of numbers. | Sex: Male, Female. Test Result: Passed, Failed. Jungian Personality Type: ESTJ, ISTJ, ENTJ, and INTJ. Student: Regular student, working student, student athlete, irregular student.
Categorical measures (continued) | ORDINAL: when the categories have an order. However, the distances between ranks are not always equal. Numbers are used to designate an orderly series. | Birth order, place in a race, class ranking, grade level, percentile scores, Likert scale (though this is controversial).
Continuous variables: give you a score for each individual person, and measures may (theoretically) take any value. | INTERVAL: scales that have the same interval between each score. Also known as equal-unit scales. In these scales, the difference between any two consecutive numbers reflects an equal empirical or demonstrable difference between the objects or events that the numbers represent. | Fahrenheit and Celsius temperature scales; calendar
Continuous variables (continued) | RATIO: numbers achieve the property of additivity, which means they can be added (as well as subtracted, multiplied, and divided) and the result expressed as a ratio, all with meaningful results. Ratio scales have a true or absolute zero point that stands for “none” of whatever is being measured. | Measures of length; periods of time

The four scales differ in three properties: magnitude, equal interval, and absolute
zero. Each scale has a distinct combination of these properties:

1. Magnitude: the property of “moreness.”
2. Equal Interval: the difference between two points at any place on the scale has
the same meaning as the difference between two other points.
3. Absolute Zero: a zero point at which nothing of the property being measured exists.

Table 4. Scales and their Properties in a Nutshell

Scale | Magnitude | Equal Interval | Absolute Zero
Nominal | No | No | No
Ordinal | Yes | No | No
Interval | Yes | Yes | No
Ratio | Yes | Yes | Yes
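
Table 4 can also be expressed as a small lookup structure, which makes it easy to check what a given scale allows. This is a minimal sketch in Python; the names `SCALE_PROPERTIES`, `has_property`, and `ratios_are_meaningful` are hypothetical helpers introduced only for illustration.

```python
# Table 4 encoded as a dictionary: scale -> which properties it has.
SCALE_PROPERTIES = {
    "nominal": {"magnitude": False, "equal_interval": False, "absolute_zero": False},
    "ordinal": {"magnitude": True, "equal_interval": False, "absolute_zero": False},
    "interval": {"magnitude": True, "equal_interval": True, "absolute_zero": False},
    "ratio": {"magnitude": True, "equal_interval": True, "absolute_zero": True},
}


def has_property(scale, prop):
    """Return True if the given scale has the given property."""
    return SCALE_PROPERTIES[scale.lower()][prop]


def ratios_are_meaningful(scale):
    """Statements such as 'twice as much' need an absolute zero point."""
    return has_property(scale, "absolute_zero")


for name in SCALE_PROPERTIES:
    print(name, "-> ratios meaningful:", ratios_are_meaningful(name))
```

For example, the lookup reports that ratios are not meaningful on an interval scale, which is why 40 degrees Celsius cannot be described as "twice as hot" as 20 degrees Celsius.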

TYPES OF STATISTICS

Descriptive statistics
Analysis of data that helps describe, show or summarize data in a meaningful way such
that patterns might emerge from the data. Descriptive statistics do not, however, allow us to make
conclusions beyond the data we have analyzed or reach conclusions regarding any hypotheses
we might have made. They are simply a way to describe our data
(https://statistics.laerd.com/statistical-guides/descriptive-inferential-statistics.php). Typically, the
data can be described based on:

a. Frequency Distribution
b. Measures of central tendency: mean, median, and mode
c. Measures of spread: includes range, quartiles, absolute deviation, variance and
standard deviation.

Inferential statistics
Methods used to make inferences from observations of a small group of people, called a
sample. These inferences are then used to estimate the characteristics of a larger group of
individuals, known as a population (Kaplan & Saccuzzo, 2017). The methods involved are:
a. the estimation of parameter(s)
b. testing of statistical hypotheses.

Table 5
Comparison between Descriptive and Inferential Statistics
Descriptive | Inferential
Present data | Assess relationships among variables
Organize data | Draw conclusions and generalize findings about the population based on the sample
Summarize data | Hypothesis testing
Small data set | Large data set
Simple | Complex
Represents the entire data set | Results obtained represent a portion of the population, but can be used to deduce information about the entire population
Less error involved | Error involved is usually more

Frequency Distributions
A single test score means more if one relates it to other test scores. A distribution of
scores summarizes the scores for a group of individuals. In testing, there are many ways to record
a distribution of scores. The frequency distribution displays scores on a variable or a measure
to reflect how frequently each value was obtained (Kaplan & Saccuzzo, 2017).

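[Figure 3.1: sample frequency distribution in tabular format]
[Figure 3.2: sample frequency distribution in graphical format]

Above is a sample frequency distribution in tabular (Figure 3.1) and graphical (Figure 3.2)
formats.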
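
A frequency distribution like the one described above can be tallied with a few lines of code. The sketch below uses Python's standard `collections.Counter`; the scores are made up for illustration, and `frequency_distribution` is a hypothetical helper name.

```python
from collections import Counter

# Made-up test scores of ten examinees (illustration only).
scores = [85, 90, 85, 70, 90, 85, 60, 70, 85, 90]


def frequency_distribution(values):
    """Return (score, frequency) pairs sorted from highest to lowest score."""
    counts = Counter(values)
    return sorted(counts.items(), reverse=True)


# Tabular format with a simple text "graph" beside each row.
print("Score  f")
for score, freq in frequency_distribution(scores):
    print(f"{score:>5}  {freq}  {'*' * freq}")
```

Each row shows a score, how often it was obtained, and a row of asterisks that works as a rough sideways bar graph.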
Percentile Rank
Percentile ranks replace simple ranks when we want to adjust for the number of scores in
a group. A percentile rank answers the question, “What percent of the scores fall below a
particular score (Xi)?”
To calculate a percentile rank: (1) determine how many cases fall below the score of interest,
(2) determine how many cases are in the group, (3) divide the number of cases below the score
of interest (Step 1) by the total number of cases in the group (Step 2), and (4) multiply the
result of Step 3 by 100. The formula is:

$$P_r = \frac{B}{N} \times 100 = \text{percentile rank of } X_i$$

Where:
Pr = Percentile Rank
Xi = the score of interest
B = the number of scores below Xi
N = the total number of scores

# | COUNTRY | CASES
1 | United States | 7,382,944
2 | India | 6,549,373
3 | Brazil | 4,906,833
4 | Russia | 1,215,001
5 | Colombia | 848,147
6 | Peru | 824,985
7 | Argentina | 790,805
8 | Spain | 789,932
9 | Mexico | 757,953
10 | South Africa | 679,716
11 | France | 606,625
12 | United Kingdom | 480,017
13 | Chile | 470,179
14 | Iran | 468,119
15 | Iraq | 375,931
16 | Bangladesh | 367,565
17 | Saudi Arabia | 335,997
18 | Italy | 325,329
19 | Turkey | 324,443
20 | Philippines | 322,497
*N = 220 countries; countries no. 21-220 are not shown here.

The table above shows the recent data of COVID cases worldwide (as of October 5, 2020). Here,
N = 220 countries were ranked based on the number of cases and recovery. If you are interested
in the relative percentile ranking of the Philippines in terms of cases in this global index,
you may apply the formula above:

$$P_r = \frac{B}{N} \times 100 = \text{percentile rank of the Philippines}$$

Where:
Pr = Percentile Rank
Xi = Philippines
B = 200
N = 220

$$P_r = \frac{200}{220} \times 100 = 90.91\%$$
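
The same computation can be checked with a short script. This is a minimal sketch in Python; the function names `percentile_rank` and `scores_below_from_rank` are hypothetical, and the second helper assumes that rank 1 is the country with the most cases, as in the table.

```python
def percentile_rank(below, total):
    """Percentile rank: (B / N) x 100, where B scores fall below the score of interest."""
    if total <= 0:
        raise ValueError("total must be a positive number of cases")
    if not 0 <= below <= total:
        raise ValueError("below must be between 0 and total")
    return below / total * 100


def scores_below_from_rank(rank, total):
    """Assumes rank 1 is the highest score, so (total - rank) scores fall below it."""
    return total - rank


# The Philippines is ranked 20th out of N = 220 countries.
b = scores_below_from_rank(rank=20, total=220)
print(b, round(percentile_rank(b, 220), 2))
```

The script reproduces the values used in the worked example: B = 200 and a percentile rank of 90.91.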

Based on the computation above, when we rank the COVID-19 cases worldwide, the
Philippines has a percentile rank of 90.91%. It means that we have more COVID cases than
90.91% of the countries in the entire world! Now it’s your turn: give the percentile ranks of the
following countries based on the data above:

Country | Ranks Below Them (B) | N | Pr
1. USA | | |
2. France | | |
3. Russia | | |

Frequency distribution can also be described by looking at its appearance and symmetry. Below
is an example of frequency distribution of test scores:

Figure 3.3
A symmetrical distribution approximates a normal curve. The majority of the scores are
gathered in the center and spread out evenly to the left and right.

Skewness
Distributions can be characterized by their skewness (Filipino: baling, pagkakiwal), or
the nature and extent to which symmetry is absent. Skewness is an indication of how the
measurements in a distribution are distributed (Cohen & Swerdlik, 2018).
The asymmetry of a frequency distribution can give us some hints about the characteristics
of the variable measured. When data are positively skewed (see Figure 4.1), the scores are
gathered on the left side of the X-axis; in a negatively skewed distribution (see Figure 4.2), the
scores accumulate on the right side of the X-axis. To avoid being confused, look at the thinner tail
of the distribution: a tail that stretches to the right indicates positive skew, and a tail that
stretches to the left indicates negative skew. In the context of testing, we can say that a test is
difficult if the distribution is positively skewed, and easy if negatively skewed. The relative
placement of the mean, median, and mode also varies depending on the shape of the distribution.

Figure 4.1 Positively Skewed
Figure 4.2 Negatively Skewed
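
The direction of skew described above can also be checked numerically. The following is a minimal sketch in Python using only the standard library. The two score sets are made up, and the `skewness` function uses the moment-based formula (the mean of the cubed z-scores), which is one of several skewness formulas and is an assumption of this sketch, not a method prescribed by the module.

```python
from statistics import mean, median, pstdev


def skewness(values):
    """Moment-based skewness: the average of the cubed z-scores."""
    m = mean(values)
    sd = pstdev(values)
    return sum(((x - m) / sd) ** 3 for x in values) / len(values)


# Made-up scores on a difficult test: most scores are low, a few are high.
hard_test = [10, 12, 12, 15, 15, 15, 18, 20, 25, 40, 48]
# Mirror image of the same scores, standing in for an easy test.
easy_test = [60 - x for x in hard_test]

for label, data in (("hard test", hard_test), ("easy test", easy_test)):
    print(label, "skewness:", round(skewness(data), 2),
          "mean:", round(mean(data), 2), "median:", median(data))
```

In these samples, the difficult test has a positive skewness value and a mean above its median, while the easy test shows the reverse pattern.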
Skew often happens because of a floor effect or a ceiling effect.
A floor effect happens when the scores are gathered in the left portion of the distribution,
signifying that a test is difficult. In the case of a psychological measure, for example, if we
measure the levels of depression in a ‘normal’ population, we will find that most people are not
very depressed, some are a little depressed, and a small number are very depressed.
Ceiling effects are much less common in psychology, although they sometimes occur,
most commonly when we are trying to ask questions to measure the range of some variable, and
the questions are all too easy, or too low down the scale.
Kurtosis
Kurtosis is much trickier than skew, and luckily for us, it’s usually less of a problem. It
occurs when there are either too many people at the extremes of the scale, or not enough people
at the extremes, and this makes the distribution non-normal (Miles & Banyard, 2007). In the
conventional usage of the term, a distribution is said to be positively kurtosed (leptokurtic)
when there are too many people, too far away, in the tails (ends) of the distribution, and
negatively kurtosed (platykurtic) when there are insufficient people in the tails to make the
distribution normal.

Figure 5
Kurtosis

Although your distribution is approximately normal, you may find that there are a small
number of data points that lie outside the distribution. For example, you may have one classmate
who gets an almost perfect score on a test when almost all of you barely got a score of 60/100, or
a student who scored very low on a test when the majority of you almost got a perfect mark. Such
scores are called “outliers”.

MEASURES OF CENTRAL TENDENCY


Saying central tendency is just a posh way of saying ‘average’. Average is a tricky word, because
it has so many different meanings, so it is usually best to avoid it, and use a more specific word
instead.
The mean is what we all think of as the average. Strictly speaking, it is called the arithmetic
mean because there are other types of mean.

$$\bar{x} = \frac{\sum x}{N}$$

Where:
x̄ = the mean
Σx = add up all of the values in x.
N = total number of cases
The median is the middle score in a set of scores. The median is used when the mean is
not valid, which might be because the data are not symmetrically or normally distributed, or
because the data are measured at an ordinal level.
The mode is rarely reported in research. It is the most frequent score in the distribution
or the most common observation among a group of scores. The mode is the best measure of
central tendency for categorical data (although it’s not even very useful for that).
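
The three measures can be computed with Python's standard `statistics` module. This is a minimal sketch; the scores are the array used in the standard deviation sample problem later in this module.

```python
from statistics import mean, median, multimode

# Scores from the standard deviation sample problem in this module.
scores = [9, 8, 7, 1, 11, 10, 4, 13, 4, 3, 7]

average = mean(scores)      # sum of the scores divided by N (77 / 11)
middle = median(scores)     # middle score once the scores are sorted
modes = multimode(scores)   # every score that ties for most frequent

print("mean:", average)
print("median:", middle)
print("mode(s):", sorted(modes))
```

In this array the scores 4 and 7 each occur twice, so the distribution has two modes. This is one reason the mode is rarely reported for data of this kind.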

MEASURES OF DISPERSION AND SPREAD


When describing a variable it is necessary to describe the central tendency (the mean,
median or mode). However, the central tendency doesn’t mean a lot without a measure of
dispersion or spread.
Range
The range is the simplest measure of dispersion. It is simply the distance between the
highest score and the lowest score.
Range = Highest Value – Lowest Value
Standard Deviation
Measure of variability equal to the square root of the average squared deviations about
the mean. More succinctly, it is equal to the square root of the variance. Formula:

$$\sigma = \sqrt{\frac{\sum (x - \bar{x})^2}{N - 1}}$$

Where:
x is each individual value;
x̄ (x-bar) is the mean;
Σ means ‘add them all up’;
σ is the standard deviation;
N is the number of cases.

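Sample Problem: Compute the SD of the following array of scores:

x: 9 8 7 1 11 10 4 13 4 3 7

Score (x) | Mean (x̄) | x − x̄ | (x − x̄)²
9 | 7 | 2 | 4
8 | 7 | 1 | 1
7 | 7 | 0 | 0
1 | 7 | -6 | 36
11 | 7 | 4 | 16
10 | 7 | 3 | 9
4 | 7 | -3 | 9
13 | 7 | 6 | 36
4 | 7 | -3 | 9
3 | 7 | -4 | 16
7 | 7 | 0 | 0
Total | | | 136
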
1. Write down the equation. The x refers to each value, x̄ is the mean, the superscript 2
means ‘square’, and the ∑ is the Greek letter sigma, which means ‘take the sum of’.
2. The first thing to do is draw a table, such as that shown above. The first column (score)
contains the individual scores. The second column contains the mean. We looked at the
calculation of the mean earlier, so we will just go through the workings:
(9 + 8 + 7 + 1 + 11 + 10 + 4 + 13 + 4 + 3 + 7) ÷ 11 = 77 ÷ 11 = 7. The mean is 7, so we
write the number 7 in the second column.
3. The next stage is to calculate x − x̄ for each person. For the first two individuals,
9 − 7 = 2 and 8 − 7 = 1. The rest are filled in the table above.
4. Next we need to calculate (x − x̄)². To do this, we square each of the values that we
calculated at stage 3. For the first two cases, 2² = 4 and 1² = 1. The rest are filled in
the table above.
5. We can add each of these values together to find ∑(x − x̄)², which is 136.
6. Now we have all of the information to put into the equation. By doing this one small
stage at a time, you will be less likely to make a mistake.
7. Calculate N − 1. N = 11 (that’s how many rows we have in the table above), so
N − 1 = 10. This gives the bottom half of the fraction.
8. Now divide the top half of the fraction by the bottom half: 136 ÷ 10 = 13.6.
9. Find the square root of the result in step 8. (You will almost certainly need a
calculator to do this.) √13.6 = 3.69. This is the standard deviation.
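
The steps above can be retraced in code. The sketch below is a minimal Python version that follows the same stages (deviations, squared deviations, their sum, division by N − 1, square root) and then cross-checks the result against `statistics.stdev`, which also divides by N − 1. The variable names are hypothetical.

```python
from math import sqrt
from statistics import stdev

scores = [9, 8, 7, 1, 11, 10, 4, 13, 4, 3, 7]

n = len(scores)                                   # N = 11
mean_score = sum(scores) / n                      # step 2: the mean
deviations = [x - mean_score for x in scores]     # step 3: x minus the mean
squared = [d ** 2 for d in deviations]            # step 4: squared deviations
sum_of_squares = sum(squared)                     # step 5: sum of squared deviations
variance = sum_of_squares / (n - 1)               # steps 7-8: divide by N - 1
sd = sqrt(variance)                               # step 9: square root

# The range, the simplest measure of dispersion, for comparison.
score_range = max(scores) - min(scores)

print("sum of squares:", sum_of_squares)
print("variance:", variance)
print("SD:", round(sd, 2), "| check with statistics.stdev:", round(stdev(scores), 2))
print("range:", score_range)
```

Doing the calculation in named stages mirrors the table-based approach and makes it easy to spot which stage went wrong if a hand computation does not match.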

The Normal Curve


Development of the concept of a normal curve began in the middle of the eighteenth
century with the work of Abraham DeMoivre and, later, the Marquis de Laplace. At the beginning
of the nineteenth century, Karl Friedrich Gauss made some substantial contributions. Through the
early nineteenth century, scientists referred to it as the “Laplace-Gaussian curve.” Karl Pearson
is credited with being the first to refer to the curve as the normal curve, perhaps in an effort to be
diplomatic to all of the people who helped develop it. Somehow the term normal curve stuck—but
don’t be surprised if you’re sitting at some scientific meeting one day and you hear this distribution
or curve referred to as Gaussian.
Theoretically, the normal curve is a bell-shaped, smooth, mathematically defined curve
that is highest at its center. From the center it tapers on both sides approaching the X-axis
asymptotically (meaning that it approaches, but never touches, the axis). In theory, the distribution
of the normal curve ranges from negative infinity to positive infinity. The curve is perfectly
symmetrical, with no skewness. If you folded it in half at the mean, one side would lie exactly on
top of the other. Because it is symmetrical, the mean, the median, and the mode all have the
same exact value (Cohen & Swerdlik, 2018).
The Area Under the Normal Curve
The normal curve can be conveniently divided into areas defined in units of standard
deviation. A hypothetical distribution of National Spelling Test scores with a mean of 50 and a
standard deviation of 15 is illustrated below. In this example, a score equal to 1 standard deviation
above the mean would be equal to 65 (X̄ + 1s = 50 + 15 = 65).

Figure 6. The area under the Normal Curve
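
The areas under the curve can be computed directly for the National Spelling Test example. This is a minimal sketch using `NormalDist` from Python's standard `statistics` module; the helper name `proportion_within` is hypothetical, and the sketch assumes the scores are exactly normally distributed.

```python
from statistics import NormalDist

# Hypothetical National Spelling Test: mean of 50, standard deviation of 15.
spelling = NormalDist(mu=50, sigma=15)


def proportion_within(dist, k):
    """Proportion of scores within k standard deviations of the mean."""
    low = dist.mean - k * dist.stdev
    high = dist.mean + k * dist.stdev
    return dist.cdf(high) - dist.cdf(low)


for k in (1, 2, 3):
    low = spelling.mean - k * spelling.stdev
    high = spelling.mean + k * spelling.stdev
    share = proportion_within(spelling, k) * 100
    print(f"within {k} SD ({low:.0f} to {high:.0f}): {share:.2f}% of scores")
```

For a normal distribution, roughly 68% of scores fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.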

Standard Scores
A standard score is a raw score that has been converted from one scale to another scale, where the latter scale
has some arbitrarily set mean and standard deviation. Why convert raw scores to standard scores? Raw scores may be
converted to standard scores because standard scores are more easily interpretable than raw scores. With a
standard score, the position of a testtaker’s performance relative to other testtakers is readily apparent.
z Scores
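A z score results from the conversion of a raw score into a number indicating how many
standard deviation units the raw score is below or above the mean of the distribution. Let’s use
an example from the normally distributed “National Spelling Test” to demonstrate how a raw score
is converted to a z score. We’ll convert a raw score of 65 to a z score by using the formula:

$$z = \frac{X - \bar{X}}{s} = \frac{65 - 50}{15} = \frac{15}{15} = 1$$

where X is the raw score, X̄ is the mean, and s is the standard deviation. In other words, a raw
score of 65 is equal to a z score of +1.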
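
The conversion can be written as a one-line function. This is a minimal sketch in Python; the function name `z_score` is hypothetical, and the mean of 50 and standard deviation of 15 come from the National Spelling Test example.

```python
def z_score(raw, mean, sd):
    """Number of standard deviation units a raw score lies above (+) or below (-) the mean."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - mean) / sd


# National Spelling Test example: mean of 50, standard deviation of 15.
for raw in (35, 50, 65):
    print(raw, "->", z_score(raw, mean=50, sd=15))
```

A raw score equal to the mean always converts to a z score of 0, scores above the mean give positive values, and scores below the mean give negative values.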

T-Score, Sten, and Stanine

There is only one formula to compute all of these scores:

Standard score = (Z-score x SD) + Mean

But, please take note of the following constants! These will be consistent for any
group regardless of the raw score mean and SD:

Standard Score | Mean | SD
T-Score | 50 | 10
Sten | 5.5 | 2
Stanine | 5 | 2

As you can see from the following formulae, each of the new standard score
systems is based on the Z-score:

T-score = (Z-score x 10) + 50
Sten = (Z-score x 2) + 5.5
Stanine = (Z-score x 2) + 5

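If we take some simple Z-scores (such as -2, -1, 0, +1, and +2) we can use the
formulae to calculate other standard score equivalents as follows:

Z-score | T-score | Sten | Stanine
-2 | 30 | 1.5 | 1
-1 | 40 | 3.5 | 3
0 | 50 | 5.5 | 5
+1 | 60 | 7.5 | 7
+2 | 70 | 9.5 | 9
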
Although some decimals are shown in the tables above, T-scores, stens and
stanines are all usually rounded to the nearest whole number. A sten calculated to be
7.78 would therefore be rounded up to 8.
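
The three conversion formulae differ only in their constants, so a single function can handle all of them. This is a minimal sketch in Python; the names `SCALES` and `from_z` are hypothetical, and the constants are the ones listed in the table of means and SDs above.

```python
# (mean, SD) constants for each standard score system.
SCALES = {
    "T-score": (50, 10),
    "Sten": (5.5, 2),
    "Stanine": (5, 2),
}


def from_z(z, scale):
    """Apply standard score = (Z-score x SD) + Mean for the chosen system."""
    mean, sd = SCALES[scale]
    return z * sd + mean


for z in (-2, -1, 0, 1, 2):
    row = {name: from_z(z, name) for name in SCALES}
    print(z, row)

# Rounding the example from the text: a sten calculated to be 7.78.
print(round(7.78))
```

The values are left unrounded here so that the decimals stay visible. Note that Python's built-in `round` sends exact halves such as 7.5 to the nearest even number, so a different rounding rule may be needed if you want halves to always round up.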

Figure 7. Sample standard scores for the RTU-CAT 2019

Figure 8. Stanine and Normal Distribution

For test developers intent on creating tests that yield normally distributed measurements, it is generally
preferable to fine-tune the test according to difficulty or other relevant variables so that the resulting distribution
will approximate the normal curve. That usually is a better bet than attempting to normalize skewed distributions,
because there are technical cautions to be observed before attempting normalization. For example,
transformations should be made only when there is good reason to believe that the test sample was large enough
and representative enough and that the failure to obtain normally distributed scores was due to the measuring
instrument.
References

Chapman, K. (2009). FACTSHEET 21: T-Scores, Stens and Stanines. Knight Chapman
Psychological Ltd.
Cohen, R. & Swerdlik, M. (2018). Psychological Testing and Assessment: An Introduction to
Tests and Measurement, 9th ed. McGraw-Hill Education.
Kaplan, R. & Saccuzzo, D. (2018). Psychological Testing: Principles, Applications, and
Issues, 9th ed. Cengage Learning.
Miles, J. & Banyard, P. (2007). Understanding and Using Statistics in Psychology. Sage
Publications Ltd.
Urbina, S. (2014). Essentials of Psychological Testing. Wiley.

Website:
https://statistics.laerd.com/statistical-guides/descriptive-inferential-statistics.php
