
Chapter 1 Fundamental concepts

SPSS - Descriptive statistics


Before starting any advanced analysis, it is a good habit to begin with some descriptive statistics and
simple graphics, to see what is going on in your data!

Datafile used: gss.sav

How to get there: Analyze > Descriptive Statistics > Frequencies…

This menu selection opens the following Frequencies dialog box:

As you can see, the variables are difficult to read. To make them easier to read, we’ll use variable names
instead of labels in dialog boxes. Do this by choosing Edit > Options. Then, in the Options dialog box,
click the General tab. In the Variable Lists group box (top right), select ‘Display names’ and click
OK.

This change does not take effect until the next time you open a data file!
So close the data file and reopen it. Then return to the Frequencies dialog box.

Now you’ll see the following Frequencies dialog box:


Choose the variable(s) for which you need descriptive statistics by selecting them and clicking on the
arrow. They appear in the ‘Variable(s):’ box.

‘Display frequency tables’ is selected by default. A frequency table shows the absolute and the relative
frequencies, as well as the percentage and cumulative percentage of valid cases (cases without missing
values). The cumulative percentage of a value is the percentage of valid cases that are less than or equal to that value.

Button Statistics…
Many descriptive statistics can be selected here. The most important are the Mean, Median and Mode, and
the Std. deviation, Range, Minimum and Maximum. See the following figure.

Button Charts…
Some simple charts can be obtained, such as bar charts, pie charts and histograms. A histogram is a
graphical display of counts for ranges of data values. In histograms, you can choose to overlay the normal
curve as well. See the following figure.
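
To make "counts for ranges of data values" concrete, here is a minimal Python sketch of the counting behind a histogram. The ages, the bin width of 10 and the starting point of 10 are hypothetical values chosen for illustration; they are not taken from gss.sav, and this is not how SPSS itself chooses its bins.

```python
from collections import Counter


def histogram_counts(values, bin_width, start):
    """Count how many values fall in each interval [left, left + bin_width)."""
    counts = Counter()
    for v in values:
        # Left edge of the interval that contains v.
        left = start + ((v - start) // bin_width) * bin_width
        counts[left] += 1
    return dict(sorted(counts.items()))


# Hypothetical ages, for illustration only.
ages = [18, 22, 23, 28, 28, 31, 34, 35, 41, 43, 47, 52, 58, 63, 71]
bins = histogram_counts(ages, bin_width=10, start=10)

for left, count in bins.items():
    print(f"{left}-{left + 9}: {'#' * count} ({count})")
```

Each printed row corresponds to one bar of the histogram: the range of values on one axis and the number of cases in that range on the other.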

Once a chart appears in the output, it can be modified from the SPSS Viewer by double-clicking it. A new window appears,
the SPSS Chart Editor, in which changes can be made by clicking on a certain part of the chart (e.g. axis,
legend, title). In the following figure, the window ‘Category Axis’ appears after clicking on the x-axis title
Respondent’s Sex.

Output of running frequencies

Output 1
When you run ‘Frequencies’ on the variable degree without selecting any
options, the results are the following:

Frequencies

Statistics
RS Highest Degree
N    Valid      1496
     Missing       4

RS Highest Degree
                             Frequency   Percent   Valid Percent   Cumulative Percent
Valid     Less than HS          279        18,6        18,6              18,6
          High school           780        52,0        52,1              70,8
          Junior college         90         6,0         6,0              76,8
          Bachelor              234        15,6        15,6              92,4
          Graduate              113         7,5         7,6             100,0
          Total                1496        99,7       100,0
Missing   Don't know              2          ,1
          No answer               2          ,1
          Total                   4          ,3
Total                          1500       100,0

In the table ‘Statistics’, the number of cases (N) is split into Valid and Missing cases.

In the frequency table ‘RS Highest Degree’, the variable degree is broken down into its possible answers (Less
than HS, High school, etc.), and their absolute (Frequency) and relative (Percent) frequencies are
shown, as well as the percentage and cumulative percentage of valid cases (Valid Percent and Cumulative
Percent). Percent gives the relative frequencies with the missing cases included in the total (here 1500).
Valid Percent gives the relative frequencies with the missing cases excluded (here out of 1496), so that the
relative frequencies of the valid cases add up to 100%.
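
The difference between Percent, Valid Percent and Cumulative Percent can be checked by hand. The following Python sketch recomputes the three columns from the frequencies in Output 1; the function and variable names are made up for this illustration, and Python prints decimals with a point where the SPSS output above uses a comma.

```python
# Frequencies copied from Output 1 (RS Highest Degree).
valid_counts = {
    "Less than HS": 279,
    "High school": 780,
    "Junior college": 90,
    "Bachelor": 234,
    "Graduate": 113,
}
missing_counts = {"Don't know": 2, "No answer": 2}


def frequency_table(valid, missing):
    """Return one row per valid category with the three percentage columns."""
    n_valid = sum(valid.values())                 # 1496 valid cases
    n_total = n_valid + sum(missing.values())     # 1500 cases in total
    rows = []
    running = 0
    for label, count in valid.items():
        running += count
        rows.append({
            "label": label,
            "frequency": count,
            # Percent: share of all cases, missing included.
            "percent": round(100 * count / n_total, 1),
            # Valid Percent: share of the valid cases only.
            "valid_percent": round(100 * count / n_valid, 1),
            # Cumulative Percent: running share of the valid cases.
            "cumulative_percent": round(100 * running / n_valid, 1),
        })
    return rows


table = frequency_table(valid_counts, missing_counts)
for row in table:
    print(row)
```

For ‘High school’, for example, this gives 780 / 1500 = 52.0 for Percent and 780 / 1496 = 52.1 for Valid Percent, matching the table.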

Output 2
When you run ‘Frequencies’ on the variable age, selecting the options mean,
median and mode (button Statistics) and histogram with normal curve (button Charts), some of the
results are the following (we left out the table ‘Age of Respondent’ because it is very large):

Frequencies

Statistics
Age of Respondent
N    Valid      1495
     Missing       5
Mean           46,23
Median         43,00
Mode           28 (a)
a. Multiple modes exist. The smallest value is shown.

[Figure: histogram of Age of Respondent with normal curve. x-axis: Age of Respondent (20,0 to 90,0); y-axis: Frequency. Std. Dev = 17,42; Mean = 46,2; N = 1495,00]

As usual, the numbers of valid and missing cases are shown in the ‘Statistics’ table. The other descriptive
statistics (Mean, Median and Mode) appear in the same table.
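
Footnote a in the ‘Statistics’ table says that when several values are tied for the highest frequency, the smallest one is reported as the mode. The Python sketch below reproduces that rule with the standard `statistics` module. The list of ages is hypothetical and chosen so that two modes exist; it is not the gss.sav data.

```python
import statistics

# Hypothetical ages with two values (28 and 35) tied for most frequent.
ages = [28, 28, 35, 35, 43, 51, 67]

mean_age = statistics.mean(ages)
median_age = statistics.median(ages)

# multimode returns every value that shares the highest frequency.
modes = statistics.multimode(ages)

# Report the smallest of the tied modes, as described in footnote a.
reported_mode = min(modes)

print("Mean:", mean_age)
print("Median:", median_age)
print("All modes:", modes)
print("Reported mode:", reported_mode)
```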

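The histogram of the variable age shows its distribution, with Age of Respondent on the x-axis and
Frequency on the y-axis. The distribution is roughly bell-shaped but skewed to the right (positively skewed):
the mean (46,23) is larger than the median (43,00), which points to a longer tail toward the higher ages.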
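
One rough way to check the direction of skew from the reported statistics is Pearson's second skewness coefficient, 3 × (mean − median) / standard deviation. The Python sketch below applies it to the values from Output 2. This is only a quick heuristic computed by hand; it is not a value that appears in the output shown above.

```python
# Values reported in Output 2 (written with decimal points instead of commas).
mean = 46.23
median = 43.00
std_dev = 17.42

# Pearson's second skewness coefficient: 3 * (mean - median) / standard deviation.
# A positive value points to a longer tail on the right (higher values),
# a negative value to a longer tail on the left.
skew = 3 * (mean - median) / std_dev

print(f"Pearson's second skewness coefficient: {skew:.3f}")
```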
