LU 3 Descriptive Statistics in SPSS

Nueva Ecija University of Science and Technology
College of Arts and Sciences

MATHEMATICS AND SCIENCES DEPARTMENT
Psy 102: Psychological Statistics
Lecture prepared by: JAYNELLE G. DOMINGO, MSc. MathEd

LESSON OBJECTIVES:
At the end of the lesson, the students are expected to:
✓ Determine frequency counts and percentages using SPSS
✓ Compute the measures of central tendency of a set of scores using SPSS
✓ Compute the measures of variability of a set of scores using SPSS
Introduction
Computing the measures of central tendency (mean, median, and mode)
and the measures of variability (range, quartiles, deciles, percentiles,
standard deviation, variance, and coefficient of variation) by hand is
tedious and prone to error, especially with large data sets (i.e.,
hundreds or even thousands of observations). These long calculations may
be one of the reasons why statistics as a course seems complicated and
difficult, and why students tend to dislike the subject.
With technological advancement and the invention of computers,
programs have been developed to help people overcome the difficulty of
long calculations and, more importantly, to arrive at more accurate
results. In research and related endeavors, the Statistical Package for
the Social Sciences (SPSS) is one of many statistical programs, now
developed by International Business Machines Corporation (IBM), that
help researchers analyze large data sets easily. This module uses SPSS
version 21.
Review of Two Basic Descriptive Statistics
1. Measures of Central Tendency (MCT)
Statisticians often collect data from a small portion (sample) of a
large group (population) in order to obtain information about the group.
These data are usually represented by a single value referred to as a
measure of central tendency or central location. A measure of central
tendency (MCT) indicates the center of a set of data arranged in order
of magnitude. Three MCTs are commonly used: the mean, the median, and
the mode.
a. Mean – simply the average. It is the most commonly used MCT.
The mean is denoted by 𝜇 for the population mean and 𝑥̅ for the
sample mean.
Properties of Mean
• The mean reflects the magnitude of every observation, since
every observation contributes to the value of the mean.
• The mean is easily affected by the presence of an extreme value,
hence it is not a good MCT when extreme values occur.
b. Median – the middle score for a set of data arranged in order
(array data). It is denoted by 𝑀𝑑 or 𝑥̃.
Properties of Median
• The median is a positional value and hence, unlike the mean, is not
affected by the presence of an extreme value.
• The median is not amenable to further computation, hence medians
of subgroups cannot be combined in the same manner as means.
c. Mode – the most frequent score or value in the data set. It is
sometimes considered the most popular option and is denoted by 𝑀𝑜 or
𝑥̂. A particular data set can have no mode, one mode (unimodal), two
modes (bimodal), and so on.
Properties of Mode
• Since mode is the most frequently occurring value, it may not be the
center of the data.
• Mode does not make use of all observations.
• Mode is difficult to manipulate algebraically.
• Mode is ideal for qualitative type of data.
Illustration of MCT
Koko recorded the duration of his stay in the library for 10 school
days. His data are as follows:
Day   Duration (in minutes)
1     44
2     20
3     35
4     33
5     40
6     33
7     33
8     15
9     42
10    34
Mean:
$$\bar{x} = \frac{\sum x}{n} = \frac{44 + 20 + 35 + 33 + 40 + 33 + 33 + 15 + 42 + 34}{10} = \frac{329}{10} = \mathbf{32.9} \text{ mins}$$
Median:
Arrange the data first from lowest to highest.
15 20 33 33 33 34 35 40 42 44
Since the number of data values is even, there are two middle scores. Add the two middle scores and divide the sum by 2.
$$\tilde{x} = \frac{33 + 34}{2} = \frac{67}{2} = \mathbf{33.5} \text{ mins}$$
Mode:
In the data set, 33 appears three times. Thus, 33 is the mode and the data set is unimodal.
$$\hat{x} = \mathbf{33} \text{ mins}$$
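The three hand computations above can be cross-checked outside SPSS. The following is a minimal sketch, assuming Python's standard `statistics` module; the variable names are illustrative and not part of the lecture.

```python
import statistics

# Koko's library durations in minutes, copied from the illustration above
durations = [44, 20, 35, 33, 40, 33, 33, 15, 42, 34]

mean_value = statistics.mean(durations)      # sum of the values divided by n
median_value = statistics.median(durations)  # averages the two middle values when n is even
modes = statistics.multimode(durations)      # list of every most-frequent value

print(f"Mean:   {mean_value} mins")
print(f"Median: {median_value} mins")
print(f"Mode:   {modes}")
```

`multimode` returns a list, so a bimodal or polymodal data set would show all of its modes instead of only one.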
2. Measures of Dispersion
Measures of dispersion (variation) identify how a set of values
spreads or fluctuates. They let you know how varied the observations
are, whether there are extreme values in the distribution, or whether the
values are very close to each other. The common measures of variation are
the range, mean absolute deviation, variance, standard deviation,
coefficient of variation, quartile deviation, and percentile range.
However, only the range, variance, and standard deviation of ungrouped
data are discussed in this section, as these three are the most commonly
used and the most practical when it comes to inferential statistics.
a. Range – the difference between the greatest data value and the
lowest data value. It is the simplest measure of dispersion but the
least reliable. It does not reflect variations in the data set that lie in
between the highest and lowest data value.
Example:
In Koko’s data in his duration of stay in the library, the highest data value
is 44 and the lowest data value is 15. Thus,
𝑅 = 44 − 15
𝑹 = 𝟐𝟗
b. Variance – unlike the range, considers the deviation of every single
data value in the data set. It is the average of the squared deviations
of the data values from the mean of the data set. It is denoted by 𝜎²
for the population variance and 𝑠² for the sample variance. Like the
range, the higher the computed value of the variance, the more dispersed
the data set is. It is always nonnegative.
Formula:
Population Variance:
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
Sample Variance:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \quad \text{or} \quad s^2 = \frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}$$
c. Standard Deviation – computed as the positive square root of the
variance. Like the variance, it is based on the deviations of all data
values in the data set. The standard deviation is considered the most
reliable measure of dispersion, as it is associated with the
characteristics of common data sets that are normally distributed. In
statistics, it is denoted by 𝜎 for the population standard deviation and
𝑠 for the sample standard deviation, but in research it is denoted by
Std. Dev. or 𝑆𝐷.
Formula:
Population SD:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
Sample SD:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
Example:
Let us consider Koko’s data on the duration of his stay in the library
(treated as a sample). The Statistical Package for the Social Sciences
(SPSS) was used to compute the variance and standard deviation, since
this course recommends computer applications. The results are:
variance (𝑠²) = 83.21
standard deviation (𝑠) = 9.12
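As a cross-check of the range, variance, and standard deviation, the following sketch recomputes them for Koko's data, treated as a sample. It assumes Python's standard `statistics` module, and the variable names are illustrative. It also evaluates the shortcut form of the sample variance to show that both forms agree.

```python
import statistics

# Koko's library durations in minutes (treated as a sample)
durations = [44, 20, 35, 33, 40, 33, 33, 15, 42, 34]
n = len(durations)

data_range = max(durations) - min(durations)   # highest value minus lowest value
variance = statistics.variance(durations)      # sample variance, divides by n - 1
std_dev = statistics.stdev(durations)          # positive square root of the sample variance

# Shortcut form: (n * sum of x^2 - (sum of x)^2) / (n * (n - 1))
sum_x = sum(durations)
sum_x_squared = sum(x * x for x in durations)
shortcut_variance = (n * sum_x_squared - sum_x ** 2) / (n * (n - 1))

print(f"Range:    {data_range}")
print(f"Variance: {variance:.2f}")
print(f"SD:       {std_dev:.2f}")
```

For a population instead of a sample, `statistics.pvariance` and `statistics.pstdev` divide by N rather than n - 1.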
FREQUENCY COUNTS AND PERCENTAGES IN SPSS
In most cases, before analyzing the data to answer the main objective
of the study (e.g., testing the hypothesis that there is no significant
difference in the test anxiety of male and female students), we start by
determining how many of the respondents in a large data set belong to
each category of a study variable. For instance, of 1,000 respondents,
“How many are females?”, without actually counting them manually. Or we
may be interested in determining “What percent of the students answered
strongly agree on one of the test anxiety items in the questionnaire?”,
again without actually having to count.
To illustrate this, we will use the Excel file. Folder name: DATA SET
FOR LECTURES, File name: DATA SET (descriptive statistics), a portion of
which is shown.
The sample Excel file contains data gathered from a sample of 90
students, who were asked for relevant information such as their gender,
type of school, how much they like schooling, and their scores in the
English test, Math test, and Science test.
The categorical variables were dummy coded as:
a) Gender (1 - male, 2 - female)
b) Type of school (1 - public, 2 - private)
c) How much do you like schooling in general
(5 - very much, 4 - much, 3 - neutral, 2 - not much, 1 - not at all).
Meanwhile, English test, Math test, and Science test are the actual
raw scores obtained in the tests. Using the data, respondent 1 is a male
(coded as 1), enrolled in a private school (coded as 2), who answered
neutral in terms of how much he likes schooling (coded as 3). He got
scores of 65, 88, and 76 in the English test, Math test, and Science
test, respectively.
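To make the coding scheme concrete, the sketch below decodes respondent 1's record back into labels. The dictionaries mirror the codes listed above. The field names and the `decode` helper are hypothetical, introduced only for illustration.

```python
# Codebook taken from the coding scheme described above
GENDER = {1: "male", 2: "female"}
SCHOOL_TYPE = {1: "public", 2: "private"}
LIKES_SCHOOLING = {5: "very much", 4: "much", 3: "neutral",
                   2: "not much", 1: "not at all"}

def decode(row):
    """Translate one coded record into labels; test scores pass through unchanged."""
    gender, school, likes, english, math, science = row
    return {
        "gender": GENDER[gender],
        "school_type": SCHOOL_TYPE[school],
        "likes_schooling": LIKES_SCHOOLING[likes],
        "english": english,
        "math": math,
        "science": science,
    }

# Respondent 1 as described in the text: codes 1, 2, 3 and scores 65, 88, 76
respondent_1 = decode((1, 2, 3, 65, 88, 76))
print(respondent_1)
```

Keeping the codebook separate from the data is what lets the same numeric columns be counted and cross-tabulated later without retyping labels.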
How to import Excel files to the SPSS program
Step 1. Open your SPSS. Close the initial dialog box to show a blank
SPSS Data Editor. (Note that an earlier version can be used, but
some of the features may be different.)
Step 2. Click File, Open.
Step 3. Look in the folder where the data is saved. The file folder’s
name is DATA SET FOR LECTURES. Specify files of type: Excel
(*.xls, *.xlsx, *.xlsm). Click DATA SET (descriptive
statistics).xls
Step 4. Click Open, then OK. (If the Excel file to be used contains
multiple spreadsheets, select the spreadsheet to be analyzed
before clicking OK.)
We have successfully imported the Excel file to the SPSS program, and
it is now ready for analysis. The SPSS program has many features that
are interesting to learn, but we will only tackle the ones we need as
far as our objectives are concerned.
HOW TO DETERMINE FREQUENCY COUNTS AND PERCENTAGES OF CATEGORICAL DATA
(NOMINAL AND ORDINAL) USING SPSS
Step 1. With the Excel file already imported to SPSS, click Analyze,
Descriptive Statistics, Frequencies.
Step 2. Choose the variables to be analyzed and put them inside the
Variable(s) box by using the arrow pointing to the right. (You
can choose the variables one at a time or simultaneously, and
you can use the arrow pointing to the left if you like to change
or replace a variable.) Click Statistics, then Continue. You can
also click Charts and specify whether you like a bar graph or a
histogram for graphical representation.
Step 3. Click OK, and the result of the analysis is shown below.
The output shows that the Valid N for all three variables (Gender,
Type of School, and How much do you like schooling in general) is 90,
with 0 Missing. In other words, the data set is complete. The frequency
table shows that out of 90 respondents, 42 (46.7%) were males (coded as
1) and 48 (53.3%) were females (coded as 2). Similarly, 38 (42.2%) are
enrolled in public schools (coded as 1), while the remaining 52 (57.8%)
are enrolled in private schools (coded as 2). In terms of how much they
like schooling, 6 (6.7%) responded “very much” (coded as 5), while 38
(42.2%) were neutral (coded as 3).
With SPSS, you can count all categorical variables (nominal and
ordinal) simultaneously and easily, even for a very large data set
(e.g., n = 2,000).
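The counting behind the frequency table can be sketched in a few lines. The example below rebuilds the Gender column from the counts reported above (it is not the lecture's data file) and assumes Python's standard `collections.Counter`; the variable names are illustrative.

```python
from collections import Counter

# Gender codes rebuilt from the reported counts: 42 coded 1 (male), 48 coded 2 (female)
gender = [1] * 42 + [2] * 48
labels = {1: "male", 2: "female"}

counts = Counter(gender)   # frequency of each code
n = len(gender)            # Valid N

# code -> (frequency, percent rounded to one decimal place)
table = {code: (count, round(100 * count / n, 1))
         for code, count in sorted(counts.items())}

for code, (count, pct) in table.items():
    print(f"{labels[code]:<8}{count:>4}{pct:>7.1f}%")
```

The same loop works unchanged for Type of School or the schooling item, since it only depends on the list of codes.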
CROSS TABULATION IN SPSS
In dealing with descriptive statistics in a data set, it is also
possible to determine the frequency counts and percentages of samples
for a specific variable across the levels of one or more other
variables.
Example 1. We might be interested to know how many male students are
enrolled in private school, or how many females are enrolled in public
school.
Step 1. With the Excel file open, click Analyze, Descriptive Statistics,
Crosstabs.
Step 2. Put one of the variables in the Row(s) box, and the other
variable in the Column(s) box.
Step 3. Click Cells, and check Row and Column under Percentages (to
express frequency counts as percentages).
Step 4. Click OK, and the result of the analysis is shown.

The result of the cross tabulation shows that out of the 42 males
(Gender 1), 20 or 47.6% are enrolled in public school (Type of School
1), while 22 or 52.4% are enrolled in private school (Type of School 2).
We can also describe it in terms of type of school. Out of the 38
students who are enrolled in public school, 18 or 47.4% are females
while 20 or 52.6% are males. You can perform cross tabulations just as
easily on any categorical variables, even for large data sets.
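The row and column percentages in a cross tabulation differ only in the total used as the divisor. The sketch below rebuilds the Gender by Type of School cells from the counts reported in this module (20, 22, 18, and 30) and assumes Python's standard `collections.Counter`. The helper names `row_pct` and `col_pct` are hypothetical.

```python
from collections import Counter

# (gender code, school-type code) pairs rebuilt from the reported cell counts
pairs = ([(1, 1)] * 20 + [(1, 2)] * 22 +
         [(2, 1)] * 18 + [(2, 2)] * 30)

cell = Counter(pairs)                      # count in each (row, column) cell
row_total = Counter(g for g, _ in pairs)   # totals per gender (rows)
col_total = Counter(s for _, s in pairs)   # totals per school type (columns)

def row_pct(g, s):
    """Percent of gender g that falls in school type s."""
    return round(100 * cell[(g, s)] / row_total[g], 1)

def col_pct(g, s):
    """Percent of school type s that is gender g."""
    return round(100 * cell[(g, s)] / col_total[s], 1)

print("Males in public school, % of males:", row_pct(1, 1))
print("Females in public school, % of public:", col_pct(2, 1))
```

Reading the same cell both ways (20 males in public school is 47.6% of males but 52.6% of public-school students) is why it matters to state which total a percentage refers to.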
Example 2. How many students enrolled in public school responded
“neutral” to the question, “How much do you like schooling in general?”
Perform the same procedure. Put Type of School in the Row(s) box and
How much do you like schooling in the Column(s) box. The result is:
The cross tabulation shows that out of the 38 students enrolled in
public school (School 1), 14 or 36.8% responded “neutral”, while 24 out
of 52 (46.2%) from private schools (School 2) responded “neutral”.
Moreover, of the 6 students who responded “very much”, 5 or 83.3% were
from private school.
MEASURES OF CENTRAL TENDENCY AND VARIABILITY IN SPSS
HOW TO OBTAIN MEAN, MEDIAN, MODE, RANGE, STANDARD DEVIATION AND
VARIANCE OF CONTINUOUS DATA (INTERVAL OR RATIO) USING SPSS
Step 1. With the Excel file open, click Analyze, Descriptive Statistics,
Frequencies.
Step 2. Click Statistics. Check all statistics that you want to compute.
(Aside from measures of central tendency and dispersion,
measures of distribution like skewness can also be calculated.)
Step 3. Click Continue, then OK, and the result is shown below.

The result shows the following:
1. The Valid N for the English test, Math test, and Science test is 90;
the data is complete.
2. The Mean for English test = 76.22, Math test = 79.16, and Science
test = 74.60.
3. The Median for English test = 76, Math test = 78, and Science test =
76.
4. The Mode for English test = 76, Math test = 83, and Science test = 76.
The English test is polymodal (multiple modes).
5. The Highest score for English test = 90, the Lowest score = 56, Range = 34.
The Highest score for Math test = 88, the Lowest score = 66, Range = 22.
The Highest score for Science test = 89, the Lowest score = 56, Range = 33.
6. The Standard deviation (s) for English test = 7.857, variance (s²) = 61.725.
The Standard deviation (s) for Math test = 6.135, variance (s²) = 37.638.
The Standard deviation (s) for Science test = 8.544, variance (s²) = 73.007.
7. Skewness
The normal distribution, represented by the normal curve, is
symmetric, and its measures of central tendency (mean, median, and mode)
are the same. When a lack of symmetry pulls these three measures apart,
the data are said to be skewed.
Typically, for standardized tests the curve very closely
approximates a normal distribution. However, if the distribution is
positively skewed, most of the scores pile up at the lower end and there
are just a few high scores. For a negatively skewed distribution it is
just the opposite: most of the scores are high, with a few low scores.
In the example, all scores are negatively skewed (English = -0.257,
Math = -0.800, and Science = -0.108). It means that more scores are high
and few are low, especially in Math. If the data is perfectly
normal/symmetrical, skewness is zero, but that is almost impossible in
real-life situations. In research, normality is commonly assumed if the
skewness lies between -1.0 and 1.0. Thus, the three sets of scores still
meet the assumption of normality.
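To show how a skewness value arises, the sketch below computes it for Koko's library data from the earlier illustration, since the 90 test scores are not reproduced in this module. It assumes the adjusted Fisher-Pearson coefficient commonly reported by statistical packages; the lecture does not state which formula SPSS applies, so treat the exact formula and the function name `sample_skewness` as assumptions.

```python
import statistics

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s) ** 3)."""
    n = len(values)
    mean = statistics.mean(values)
    s = statistics.stdev(values)   # sample SD, n - 1 in the denominator
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)

# Koko's library durations in minutes
durations = [44, 20, 35, 33, 40, 33, 33, 15, 42, 34]

skew = sample_skewness(durations)
within_rule_of_thumb = -1.0 < skew < 1.0   # the range used in the text above

print(f"Skewness: {skew:.3f}")
print(f"Between -1.0 and 1.0: {within_rule_of_thumb}")
```

The sign comes out negative because the two short stays (15 and 20 minutes) pull the mean (32.9) below the median (33.5), which matches the description of a negatively skewed distribution.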
DESCRIPTIVE STATISTICS OF GROUPS WITHIN A VARIABLE
Example 1: In our previous result, the average score for the English
test is 76.22. What are the descriptive statistics (i.e., mean and
standard deviation) in the English test of males and females
separately?
Step 1. With the Excel file open, click Analyze, Compare Means, Means.
Step 2. Put English test in the Dependent List and Gender in the
Independent List.
Step 3. Click Options. The default Cell Statistics are Mean, Number of
Cases, and Standard Deviation. You can include other statistics
such as Median, Minimum, Maximum, Variance, Kurtosis, and
Skewness using the arrow button.
Step 4. Click Continue, then OK, and the result is shown below.

The output shows the descriptive statistics of Gender 1 (male) and
Gender 2 (female) in the English test. The mean score of the 42 males is
76.24 with a standard deviation of 7.77. The median score is 78, the
highest score is 89, and the lowest is 56. Similarly, the mean score of
the 48 females is 76.21 with a standard deviation of 8.02. The output
also shows that the scores of the males are more skewed (negatively)
than the scores of the females. The overall mean of the 90 students in
the English test is 76.22, the same value we obtained in the earlier
example.
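The overall mean can be recovered from the two subgroup means by weighting each mean by its group size, which is the sense in which means of subgroups can be combined. The sketch below uses only the figures reported above; the variable names are illustrative.

```python
# Subgroup sizes and English-test means as reported in the output above
groups = {"male": (42, 76.24), "female": (48, 76.21)}

total_n = sum(n for n, _ in groups.values())

# Weight each subgroup mean by its size, then divide by the total number of cases
overall_mean = sum(n * mean for n, mean in groups.values()) / total_n

print(f"N = {total_n}, overall mean = {overall_mean:.2f}")
```

Because the inputs are already rounded to two decimals, a weighted mean rebuilt this way can differ from the SPSS value in the last digit for other variables.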
Example 2. What are the descriptive statistics in Math scores
according to type of school?
Perform the same procedure. Put Math test in the Dependent List
and Type of School in the Independent List. The result is:
The table shows that the two groups have the same minimum (66)
and maximum (88) scores. The overall mean score of the 90 students is
79.16, the same value we obtained in the earlier example. The output
shows that the mean score in the Math test of the 38 students enrolled
in public school (Type of School 1) is 78.47 with a standard deviation
of 6.26, while the mean score of the 52 students enrolled in private
school (Type of School 2) is 79.65 with a standard deviation of 6.06.
With these mean scores and standard deviations in the Math test, can we
say that students enrolled in private school performed better than those
enrolled in public school? This question will be discussed thoroughly in
the succeeding module (i.e., mean comparisons).
DESCRIPTIVE STATISTICS BASED ON CROSS TABULATIONS
In the earlier examples, we obtained the frequency counts and
percentages of cross tabulations (e.g., gender and type of school).
Suppose we are interested in the relevant statistics (i.e., means and
standard deviations) of these cross tabulations in a given variable.
Example 1. In the previous example, the mean score of male students
in the English test is 76.24, and the SD is 7.77. What are the
descriptive statistics of male students who are enrolled in public
school only? In the cross tabulation performed earlier, 20 of the 42
male respondents (47.6%) belong to this group (male-public).
Step 1. With the Excel file open, click Analyze, Compare Means, Means.
Step 2. Put English test in the Dependent List and Gender in the
Independent List. Then click Next and put Type of School in the
Independent List. Note: You can continue adding another
independent variable to determine the cross tabulations and
means (e.g., What is the mean score in English of male students
enrolled in public school who responded “neutral”?).
Step 3. Click OK and we have the result.

The output shows that the mean score of all male students (n = 42)
is 76.24 and the SD is 7.77, the same values we obtained earlier. The
mean score of the 20 males (Gender 1) who are enrolled in public school
(Type of School 1) is 77.55 and the SD is 7.65. Similarly, the mean
score of the 30 females (Gender 2) who are enrolled in private school
(Type of School 2) is 75.47 and the SD is 8.92.
Example 2. What are the descriptive statistics in Math test based on
the cross tabulation of gender and type of school?
Perform the same procedure. The Dependent Variable is Math test. The
result is:
The output shows that the mean score in Math test of male students
enrolled in public school (n = 20) is 79.70 and the SD is 4.68.
Similarly, the mean score of female students enrolled in private school
(n = 30) is 80.37 and the SD is 5.55.
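The layered grouping that Compare Means performs can be sketched as grouping first by one key and then by a pair of keys. The records below are HYPOTHETICAL and are not rows from the lecture's data file; the `describe_by` helper is likewise an illustrative name. The sketch assumes Python's standard `statistics` module.

```python
import statistics
from collections import defaultdict

# HYPOTHETICAL records: (gender code, school-type code, English score)
records = [
    (1, 1, 78), (1, 1, 82), (1, 1, 74),
    (1, 2, 70), (1, 2, 76),
    (2, 1, 80), (2, 1, 72),
    (2, 2, 75), (2, 2, 79), (2, 2, 68),
]

def describe_by(rows, key):
    """Group scores by key(row) and return {group: (n, mean, sample SD)}."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row[2])
    return {group: (len(scores), statistics.mean(scores), statistics.stdev(scores))
            for group, scores in sorted(groups.items())}

by_gender = describe_by(records, key=lambda r: r[0])                  # one layer
by_gender_school = describe_by(records, key=lambda r: (r[0], r[1]))   # two layers

for group, (n, mean, sd) in by_gender_school.items():
    print(f"gender={group[0]} school={group[1]}  n={n}  mean={mean:.2f}  SD={sd:.2f}")
```

Adding a third layer, such as the schooling item, only means extending the key to a three-element tuple, which parallels clicking Next again in the Means dialog.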
References
Dela Rosa, E. D. (2019). Learning Module in Statistics with SPSS
Applications. Philippine Copyright 2019.
Domingo, J. G. (2020). Learning Materials in Unit 4 – Data
Management. NEUST’s Learning Module in Math in the Modern World.
