
Nueva Ecija University of Science and Technology

College of Arts and Sciences


MATHEMATICS AND SCIENCES DEPARTMENT

Psy 102: Psychological Statistics

Lecture prepared by: JAYNELLE G. DOMINGO, MSc. MathEd


LESSON OBJECTIVES:

At the end of the lesson, the students are expected to:

✓ Determine frequency counts and percentages using SPSS

✓ Compute the measures of central tendency of a set of scores using SPSS

✓ Compute the measures of variability of a set of scores using SPSS
Introduction

Computing the measures of central tendency (mean, median and
mode) and the measures of variability (range, quartiles, deciles,
percentiles, standard deviation, variance and coefficient of
variation) by hand is tedious and prone to error, especially for
large data sets (i.e. hundreds or even thousands of observations).
These long calculations may be one of the reasons why statistics as
a course seems complicated and difficult, and why students tend to
dislike the subject.

With technological advancement and the invention of computers,
programs have been developed to help people overcome the difficulty
of long calculations and, more importantly, arrive at more accurate
results. In research and related endeavors, the Statistical Package
for the Social Sciences (SPSS), now distributed by International
Business Machines Corporation (IBM), is one of the many statistical
programs that help researchers analyze large data sets easily. This
module uses SPSS version 21.
Review of Two Basic Descriptive Statistics

1. Measures of Central Tendency (MCT)

Statisticians often collect data from a small portion (sample) of a
large group (population) in order to obtain information about the
group. These data are usually represented by a single value referred
to as a measure of central tendency or central location. A measure of
central tendency (MCT) indicates the center of a set of data arranged
in order of magnitude. Three MCTs are commonly used: the mean, the
median, and the mode.

a. Mean – simply the average. It is the most commonly used MCT.
The mean is denoted by $\mu$ for the population mean and $\bar{x}$
for the sample mean.

Properties of the Mean
• The mean reflects the magnitude of every observation, since
every observation contributes to its value.
• The mean is easily affected by extreme values, so it is not a
good measure of central tendency when extreme values occur.

b. Median – the middle score of a set of data arranged in order
(arrayed data). It is denoted by $Md$ or $\tilde{x}$.

Properties of the Median
• The median is a positional value and hence, unlike the mean, is
not affected by the presence of an extreme value.
• The median is not amenable to further computation, so medians of
subgroups cannot be combined in the same manner as means.

c. Mode – the most frequent score or value in the data set. It is
sometimes considered the most popular option and is denoted by $Mo$
or $\hat{x}$. A data set can have no mode, one mode (unimodal), two
modes (bimodal), and so on.

Properties of the Mode
• Since the mode is the most frequently occurring value, it may not be
the center of the data.
• The mode does not make use of all observations.
• The mode is difficult to manipulate algebraically.
• The mode is ideal for qualitative data.

Illustration of MCT
Koko recorded how long he stayed in the library on each of 10 school days.
His data are as follows:

Day    Duration (in minutes)
1      44
2      20
3      35
4      33
5      40
6      33
7      33
8      15
9      42
10     34

Mean:
$$\bar{x} = \frac{\sum x}{n} = \frac{44 + 20 + 35 + 33 + 40 + 33 + 33 + 15 + 42 + 34}{10} = \frac{329}{10} = 32.9 \text{ mins}$$

Median:
Arrange the data from lowest to highest first.
15 20 33 33 33 34 35 40 42 44
Since the number of data values is even, there are two middle scores. Add the two middle scores and divide the sum by 2.

$$\tilde{x} = \frac{33 + 34}{2} = \frac{67}{2} = 33.5 \text{ mins}$$

Mode:
In the data set, 33 appears three times. Thus, 33 is the mode and the data set is unimodal.

$$\hat{x} = 33 \text{ mins}$$
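
As a quick cross-check outside SPSS, the three measures of central tendency for Koko's data can be recomputed with Python's standard `statistics` module. This is only a sketch; the variable name `durations` is introduced here for illustration and is not part of the lecture's data file.

```python
import statistics

# Koko's library stay in minutes for the 10 school days
durations = [44, 20, 35, 33, 40, 33, 33, 15, 42, 34]

mean = statistics.mean(durations)      # sum of the scores divided by n
median = statistics.median(durations)  # average of the two middle scores when n is even
mode = statistics.mode(durations)      # most frequent score

print(f"mean = {mean} mins, median = {median} mins, mode = {mode} mins")
```

The printed values should agree with the hand computation: 32.9, 33.5 and 33 minutes.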

2. Measures of Dispersion

Measures of dispersion (variation) identify how a set of values
spreads or fluctuates. They show how varied the observations are,
whether there are extreme values in the distribution, and whether the
values are very close to each other. The common measures of variation are
the range, mean absolute deviation, variance, standard deviation,
coefficient of variation, quartile deviation, and percentile range.
However, only the range, variance and standard deviation of ungrouped
data are discussed in this section, as these three are the most commonly
used and the most practical for inferential statistics.

a. Range – the difference between the greatest data value and the
lowest data value. It is the simplest measure of dispersion but the
least reliable. It does not reflect variations in the data set that lie in
between the highest and lowest data value.

Example:

In Koko’s data on his duration of stay in the library, the highest data
value is 44 and the lowest data value is 15. Thus,
$R = 44 - 15$
$R = 29$

b. Variance – unlike the range, it considers the deviation of every
data value in the data set. It is the average of the squared
deviations of the data values from the mean of the data set. It is
denoted by $\sigma^2$ for the population variance and $s^2$ for the
sample variance. As with the range, the higher the computed variance,
the more dispersed the data set. It is always nonnegative.

Formula:
Population Variance:
$$\sigma^2 = \frac{\sum (x - \mu)^2}{N}$$
Sample Variance:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1} \quad \text{or} \quad s^2 = \frac{n \sum x^2 - \left(\sum x\right)^2}{n(n - 1)}$$
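
The two sample variance formulas given above are equivalent. A short derivation, using only the definition $\bar{x} = \sum x / n$, sketches why the second (computational) form gives the same value as the first:

```latex
\begin{aligned}
\sum (x - \bar{x})^2
  &= \sum x^2 - 2\bar{x}\sum x + n\bar{x}^2 \\
  &= \sum x^2 - 2n\bar{x}^2 + n\bar{x}^2 \\
  &= \sum x^2 - \frac{\left(\sum x\right)^2}{n}
   = \frac{n\sum x^2 - \left(\sum x\right)^2}{n}, \\[4pt]
s^2 &= \frac{\sum (x - \bar{x})^2}{n - 1}
   = \frac{n\sum x^2 - \left(\sum x\right)^2}{n(n - 1)}.
\end{aligned}
```

The computational form is convenient for hand calculation because it needs only $\sum x$ and $\sum x^2$, not the individual deviations.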

c. Standard Deviation – the positive square root of the variance.
Like the variance, it is based on the deviations of all data values in
the data set. The standard deviation is considered the most reliable
measure of dispersion, as it is tied to the characteristics of normally
distributed data sets, which are common. In statistics, it is denoted by
$\sigma$ for the population standard deviation and $s$ for the sample
standard deviation, but in research reports it is often written as
Std. Dev. or $SD$.

Formula:
Population SD:
$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$
Sample SD:
$$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$

Example:

Let us consider Koko’s data on his duration of stay in the library (treated
as a sample). The Statistical Package for the Social Sciences (SPSS) was
used to compute the variance and standard deviation, since this course
recommends computer applications. The results are:

variance ($s^2$) = 83.21

standard deviation ($s$) = 9.12
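
The SPSS figures for Koko's data can be cross-checked with a few lines of Python. The sketch below treats the data as a sample (divisor n - 1) and evaluates the variance both from the definition and from the computational formula. The variable names are illustrative assumptions, not part of the lecture.

```python
import math
import statistics
from fractions import Fraction

# Koko's library stay in minutes for the 10 school days
durations = [44, 20, 35, 33, 40, 33, 33, 15, 42, 34]
n = len(durations)

# Range: highest value minus lowest value
data_range = max(durations) - min(durations)

# Sample variance from the definition: sum of squared deviations over (n - 1)
variance = statistics.variance(durations)

# Same quantity from the computational formula, kept exact with Fraction
sum_x = sum(durations)
sum_x_sq = sum(x * x for x in durations)
variance_shortcut = Fraction(n * sum_x_sq - sum_x ** 2, n * (n - 1))

# Sample standard deviation: positive square root of the sample variance
std_dev = math.sqrt(variance)

print(f"range = {data_range}")
print(f"variance = {variance:.2f}")
print(f"standard deviation = {std_dev:.2f}")
```

Rounded to two decimal places, the variance and standard deviation should match the SPSS results of 83.21 and 9.12.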
FREQUENCY COUNTS AND PERCENTAGES IN SPSS

In most cases, before analyzing the data to answer the main objective
of the study (e.g. testing the hypothesis that there is no significant
difference in the test anxiety of male and female students), we start by
determining how many of the respondents in a large data set belong to
each category of a study variable. For instance, of 1,000 respondents,
how many are female? Or we may want to know what percent of the students
answered “strongly agree” on one of the test anxiety items in the
questionnaire. SPSS answers both questions without any manual counting.

To illustrate this, we will use the Excel file DATA SET (descriptive
statistics), found in the folder DATA SET FOR LECTURES, a portion of
which is shown.

The sample Excel file contains data gathered from a sample of 90
students, who were asked for relevant information such as their gender,
type of school, how much they like schooling, and their scores in the
English, Math and Science tests.

The categorical variables were coded numerically as:

a) Gender (1 - male, 2 - female)
b) Type of school (1 - public, 2 - private)
c) How much do you like schooling in general
(5 - very much, 4 - much, 3 - neutral, 2 - not much, 1 - not at all)

Meanwhile, English test, Math test and Science test are the actual raw
scores obtained in the tests. In the data, respondent 1 is male (coded
as 1), enrolled in a private school (coded as 2), and answered “neutral”
when asked how much he likes schooling (coded as 3). He scored 65, 88
and 76 in the English, Math and Science tests, respectively.
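
To make the coding scheme concrete, the sketch below decodes respondent 1's record back into labels. The dictionary and field names are hypothetical and chosen only for this illustration; the actual column names in the Excel file may differ.

```python
# Code books for the three categorical variables, as listed in the lecture
GENDER = {1: "male", 2: "female"}
SCHOOL_TYPE = {1: "public", 2: "private"}
LIKE_SCHOOLING = {5: "very much", 4: "much", 3: "neutral", 2: "not much", 1: "not at all"}

# Respondent 1 as described in the text; the field names are hypothetical
respondent_1 = {
    "gender": 1,
    "school_type": 2,
    "like_schooling": 3,
    "english": 65,
    "math": 88,
    "science": 76,
}

# Translate each numeric code into its label
decoded = {
    "gender": GENDER[respondent_1["gender"]],
    "school_type": SCHOOL_TYPE[respondent_1["school_type"]],
    "like_schooling": LIKE_SCHOOLING[respondent_1["like_schooling"]],
}
print(decoded)
```

The test scores are left as they are because they are raw scores, not codes.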

How to import Excel files into SPSS

Step 1. Open SPSS. Close the initial dialog box to show a blank SPSS
Data Editor. (An earlier version can be used, but some of the features
may differ.)

Step 2. Click File, Open.

Step 3. Browse to where the data is saved. The folder’s name is DATA SET
FOR LECTURES. Set Files of type to Excel (*.xls, *.xlsx, *.xlsm). Click
DATA SET (descriptive statistics).xls.

Step 4. Click Open, then OK. (If the Excel file contains multiple
worksheets, select the worksheet to be analyzed before clicking OK.)

We have successfully imported the Excel file into SPSS, and the data is
now ready for analysis. SPSS has many features that are interesting to
learn, but we will tackle only the ones needed for our objectives.

HOW TO DETERMINE FREQUENCY COUNTS AND PERCENTAGES OF CATEGORICAL DATA
(NOMINAL AND ORDINAL) USING SPSS

Step 1. With the Excel file already imported into SPSS, click Analyze,
Descriptive Statistics, Frequencies.

Step 2. Choose the variables to be analyzed and move them into the
Variable(s) box using the arrow pointing to the right. (You can choose
the variables one at a time or simultaneously, and you can use the arrow
pointing to the left to change or replace a variable.) Click Statistics,
then Continue. You can also click Charts and specify whether you want a
bar graph or a histogram for graphical representation.

Step 3. Click OK, and the result of the analysis is shown below.

The output shows that the Valid N for all three variables (Gender,
Type of School, and How much do you like schooling in general) is 90, with
0 Missing. In other words, the data set is complete. The frequency table
shows that out of 90 respondents, 42 (46.7%) were males (coded as 1) and
48 (53.3%) were females (coded as 2). Similarly, 38 (42.2%) are enrolled in
public schools (coded as 1), while the remaining 52 (57.8%) are enrolled in
private schools (coded as 2). In terms of how much they like schooling,
6 (6.7%) responded “very much” (coded as 5), while 38 (42.2%) were neutral
(coded as 3).
With SPSS, you can count all categorical variables (nominal and ordinal)
simultaneously and easily, even for a very large data set (e.g. n = 2,000).
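
The counting that SPSS performs can be mimicked in a few lines of Python. Because the Excel file itself is not reproduced here, the sketch rebuilds the Gender column from the reported counts (42 males, 48 females); it is a stand-in with the same totals, not the original data.

```python
from collections import Counter

# Gender codes rebuilt from the reported counts (42 males, 48 females);
# this is NOT the original data file, only a stand-in with the same totals.
gender_codes = [1] * 42 + [2] * 48

counts = Counter(gender_codes)   # frequency count per code
n = sum(counts.values())         # total number of respondents

# Percentage of each category, rounded to one decimal place
percentages = {code: round(100 * count / n, 1) for code, count in counts.items()}

for code in sorted(counts):
    print(code, counts[code], percentages[code])
```

Each percentage is simply the category count divided by the total number of respondents, multiplied by 100.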
CROSS TABULATION IN SPSS

In dealing with descriptive statistics, it is also possible to
determine the frequency counts and percentages for a specific variable
across the levels of one or more other variables.

Example 1. We might be interested to know how many male students are
enrolled in private school, or how many females are enrolled in public
school.

Step 1. With the Excel file open in SPSS, click Analyze, Descriptive
Statistics, Crosstabs.

Step 2. Put one of the variables in the Row(s) box and the other
variable in the Column(s) box.

Step 3. Click Cells, then check Row and Column under Percentages (to
express the frequency counts as percentages).

Step 4. Click OK, and the result of the analysis is shown.

The result of the cross tabulation shows that out of the 42 males
(Gender 1), 20 or 47.6% are enrolled in public school (Type of School
1), while 22 or 52.4% are enrolled in private school (Type of School
2). We can also describe it in terms of type of school. Out of the 38
students who are enrolled in public school, 18 or 47.4% are females
while 20 or 52.6% are males. You can perform cross tabulations just as
easily on any categorical variables, even for large data sets.
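
The row and column percentages in a cross tabulation come from dividing each cell count by its row total or its column total. The sketch below rebuilds the gender by type-of-school table from the cell counts reported in this lecture (20, 22, 18 and 30). It is a stand-in for the real data file, and the helper names `row_pct` and `col_pct` are made up for this example.

```python
from collections import Counter

# (gender, school type) pairs rebuilt from the reported cell counts;
# a stand-in for the real data file, which is not reproduced here.
pairs = (
    [(1, 1)] * 20 + [(1, 2)] * 22      # males: public, private
    + [(2, 1)] * 18 + [(2, 2)] * 30    # females: public, private
)

cells = Counter(pairs)
row_totals = Counter(gender for gender, _ in pairs)   # totals per gender
col_totals = Counter(school for _, school in pairs)   # totals per school type

def row_pct(gender, school):
    """Share of a gender group that falls in a given school type."""
    return round(100 * cells[(gender, school)] / row_totals[gender], 1)

def col_pct(gender, school):
    """Share of a school type that belongs to a given gender."""
    return round(100 * cells[(gender, school)] / col_totals[school], 1)

print(row_pct(1, 1), row_pct(1, 2))   # males in public vs private school
print(col_pct(2, 1), col_pct(1, 1))   # females vs males among public-school students
```

The same cell (for example, the 20 male public-school students) gives a different percentage depending on whether it is read across its row (out of 42 males) or down its column (out of 38 public-school students).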

Example 2. How many students enrolled in public school responded
“neutral” to the question, “How much do you like schooling in general?”

Perform the same procedure. Put Type of School in the Row(s) box and
How much do you like schooling in the Column(s) box. The result is:

The cross tabulation shows that out of the 38 students enrolled in
public school (School 1), 14 or 36.8% responded “neutral”, while 24 out
of 52 (46.2%) from private schools (School 2) responded “neutral”.
Moreover, of the 6 students who responded “very much”, 5 or 83.3% were
from private school.
MEASURES OF CENTRAL TENDENCY AND VARIABILITY IN SPSS

HOW TO OBTAIN THE MEAN, MEDIAN, MODE, RANGE, STANDARD DEVIATION AND
VARIANCE OF CONTINUOUS DATA (INTERVAL OR RATIO) USING SPSS

Step 1. With the Excel file open in SPSS, click Analyze, Descriptive
Statistics, Frequencies.

Step 2. Move the test-score variables (English test, Math test and
Science test) into the Variable(s) box, then click Statistics. Check all
the statistics that you want to compute. (Aside from measures of central
tendency and dispersion, measures of distribution such as skewness can
also be calculated.)

Step 3. Click Continue, then OK, and the result is shown below.

The result shows the following:

1. The Valid N for the English, Math and Science tests is 90, so the data
is complete.
2. The Mean for the English test = 76.22, Math test = 79.16, and Science
test = 74.60.
3. The Median for the English test = 76, Math test = 78, and Science
test = 76.
4. The Mode for the English test = 76, Math test = 83, and Science test = 76.
The English test is polymodal (multiple modes); in that case SPSS displays
only the smallest mode.
5. The Highest score for the English test = 90, the Lowest score = 56, Range = 34.
The Highest score for the Math test = 88, the Lowest score = 66, Range = 22.
The Highest score for the Science test = 89, the Lowest score = 56, Range = 33.

6. The Standard deviation ($s$) for the English test = 7.857, variance ($s^2$) = 61.725.
The Standard deviation ($s$) for the Math test = 6.135, variance ($s^2$) = 37.638.
The Standard deviation ($s$) for the Science test = 8.544, variance ($s^2$) = 73.007.

7. Skewness
The normal distribution, represented by the normal curve, is
symmetric, and its measures of central tendency (mean, median and
mode) are equal. When a distribution lacks symmetry, these three
measures no longer coincide and the data are said to be skewed.

Typically, the scores on a standardized test closely approximate a
normal distribution. However, if the distribution is positively skewed,
most of the scores pile up at the lower end and there are just a few
high scores. A negatively skewed distribution is just the opposite:
most of the scores are high, with a few low scores.

In the example, all three sets of scores are negatively skewed
(English = -0.257, Math = -0.800 and Science = -0.108). This means that
more scores are high and few are low, especially in Math. If the data are
perfectly normal/symmetrical, skewness is zero, but this is almost
impossible in real-life situations. In research, a common rule of thumb is
to assume normality if the skewness lies between -1.0 and 1.0. Thus, the
three sets of scores still meet the assumption of normality.
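
For readers who want to see how a skewness value might be computed, the sketch below uses the adjusted Fisher-Pearson coefficient. The lecture does not state which formula SPSS applies, so the choice of formula is an assumption, and the function names are made up for this example. Since the 90-student file is not reproduced here, the function is applied to Koko's library data, and the rule of thumb is then checked against the skewness values reported for English and Math.

```python
import math

def sample_skewness(values):
    """Adjusted Fisher-Pearson skewness: n / ((n-1)(n-2)) * sum(((x - mean) / s)^3)."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation (divisor n - 1)
    s = math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in values)

def within_rule_of_thumb(skewness):
    """Rule of thumb from the text: skewness strictly between -1.0 and 1.0."""
    return -1.0 < skewness < 1.0

# Koko's library data, used because the 90-student file is not reproduced here
durations = [44, 20, 35, 33, 40, 33, 33, 15, 42, 34]
g1 = sample_skewness(durations)
print(round(g1, 3), within_rule_of_thumb(g1))

# The skewness values reported for the English and Math tests
reported = {"English": -0.257, "Math": -0.800}
print({test: within_rule_of_thumb(value) for test, value in reported.items()})
```

A negative result means the longer tail is on the low side, which for Koko's data reflects the two short visits of 15 and 20 minutes.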
DESCRIPTIVE STATISTICS OF GROUPS WITHIN A VARIABLE

Example 1: In our previous result, the average score for the English
test is 76.22. What are the descriptive statistics (i.e. means and
standard deviations) in the English test of males and females
separately?

Step 1. With the Excel file open in SPSS, click Analyze, Compare Means, Means.

Step 2. Put English test in the Dependent List and Gender in the Independent
List.

Step 3. Click Options. The default Cell Statistics are Mean, Number of Cases
and Standard Deviation. You can include other statistics such as Median,
Minimum, Maximum, Variance, Kurtosis and Skewness using the arrow button.

Step 4. Click Continue, then OK, and the result is shown below.

The output shows the descriptive statistics of Gender 1 (male) and
Gender 2 (female) in the English test. The mean score of the 42 males is
76.24 with a standard deviation of 7.77. Their median score is 78, the
highest score is 89 and the lowest is 56. Similarly, the mean score of
the 48 females is 76.21 with a standard deviation of 8.02. The output
also shows that the scores of the males are more (negatively) skewed than
the scores of the females. The overall mean of the 90 students in the
English test is 76.22, the same value we obtained in the earlier example.
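
The agreement between the group means and the overall mean is not a coincidence: group means can be recombined by weighting each by its group size. The sketch below illustrates this with the reported English test figures; the dictionary layout is just one convenient way to hold them.

```python
# Group sizes and group means for the English test, as reported in the SPSS output
groups = {
    "male": {"n": 42, "mean": 76.24},
    "female": {"n": 48, "mean": 76.21},
}

total_n = sum(g["n"] for g in groups.values())

# Overall mean = sum of (group size x group mean) divided by the total sample size
overall_mean = sum(g["n"] * g["mean"] for g in groups.values()) / total_n

print(total_n, round(overall_mean, 2))
```

Because the group means are themselves rounded to two decimals, a value recombined this way can differ from the SPSS figure in the last digit for other variables.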

Example 2. What are the descriptive statistics in Math scores
according to type of school?

Perform the same procedure. Put Math test in the Dependent List
and Type of School in the Independent List. The result is:

The table shows that the two groups have the same minimum (66)
and maximum (88) scores. The overall mean score of the 90 students is
79.16, the same value we obtained in the earlier example. The output
shows that the mean score in the Math test of the 38 students enrolled in
public school (Type of School 1) is 78.47 with a standard deviation of 6.26,
while the mean score of the 52 students enrolled in private school (Type of
School 2) is 79.65 with a standard deviation of 6.06. With these mean
scores and standard deviations in the Math test, can we say that students
enrolled in private school performed better than those enrolled in public
school? This question will be discussed thoroughly in the succeeding
module (i.e. mean comparisons).
DESCRIPTIVE STATISTICS BASED ON CROSS TABULATIONS

In the earlier examples, we obtained the frequency counts and
percentages of cross tabulations (e.g. gender and type of school).
Suppose we are interested in the relevant statistics (e.g. means and
standard deviations) of these cross-tabulated groups for a given
variable.

Example 1. In the previous example, the mean score of male students
in the English test is 76.24, and the SD is 7.77. What are the
descriptive statistics of male students who are enrolled in public
school only? In the cross tabulation performed earlier, 20 or 47.6% of
the 42 male respondents belong to this group (male-public).

Step 1. With the Excel file open in SPSS, click Analyze, Compare Means, Means.

Step 2. Put English test in the Dependent List and Gender in the
Independent List. Then click Next and put Type of School in the
Independent List. Note: You can continue adding independent variables
as further layers to obtain the means of finer cross tabulations
(e.g. What is the mean score in English of male students enrolled in
public school who responded “neutral”?).

Step 3. Click OK and we have the result.

The output shows that the mean score of all male students
(n = 42) is 76.24 and the SD is 7.77, the same values we obtained
earlier. The mean score of the 20 males (Gender 1) who are enrolled
in public school (Type of School 1) is 77.55 and the SD is 7.65.
Similarly, the mean score of the 30 females (Gender 2) who are
enrolled in private school (Type of School 2) is 75.47 and the SD is
8.92.

Example 2. What are the descriptive statistics in the Math test based
on the cross tabulation of gender and type of school?

Perform the same procedure. The Dependent Variable is Math test. The
result is:

The output shows that the mean score in the Math test of male
students enrolled in public school (n = 20) is 79.70 and the SD is
4.68. Similarly, the mean score of female students enrolled in
private school (n = 30) is 80.37 and the SD is 5.55.
References

Dela Rosa, E. D. (2019). Learning Module in Statistics with SPSS
Applications. Philippine Copyright 2019.

Domingo, J. G. (2020). Learning Materials in Unit 4 – Data
Management. NEUST’s Learning Module in Math in the Modern World.
