
Lesson 3: Descriptive Statistics

Objectives
At the end of this lesson, you shall be able to:

1. describe the following: mean, standard deviation, range
of scores, skewness and kurtosis
2. choose a statistical technique suited to your variable (whether
categorical or continuous)
3. check that the assumptions set for each test are not violated
4. assess normality
5. interpret output from Frequencies, Descriptives and
Explore
6. check for outliers
Putting your mind into action / Putting your
ideas into work

Once you are sure there are no errors in the data file (or at
least no out-of-range values on any of the variables), you can
begin the descriptive phase of your data analysis.
Descriptive statistics are used to:
•Describe the characteristics of your sample in the
Methodology section of your report;
•Check your variables for any violation of the assumptions
underlying the statistical techniques that you will use to
address your research questions;
•Address specific research questions.
Categorical variables

To obtain descriptive statistics for categorical variables
you should use Frequencies. This will tell you how many
people gave each response (e.g., how many males, how many
females). It makes no sense to ask for means, standard
deviations, etc. for categorical variables such as sex or
marital status.
Procedures of Obtaining Statistics for
Categorical Variables

From the menu at the top of the screen
click on: Analyze, then click on Descriptive
Statistics, then Frequencies.
Choose and highlight the categorical
variables you are interested in (e.g.,
sex). Move these into the Variables box.
Click on the Statistics button. In the
Dispersion section tick Minimum and
Maximum. Click on Continue and then OK.
Interpretation of output from frequencies

The Frequencies output for sex shows that there are
185 males (42.1 per cent) and 254 females (57.9 per
cent) in the sample, giving a total of 439 respondents. It
is important to take note of the number of respondents
you have in the different subgroups of your sample. For
some analyses (e.g., ANOVA) it is easier to have roughly
equal group sizes. If you have very unequal group sizes,
particularly if the group sizes are small, it may be
inappropriate to run some analyses.
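
If you want to see what Frequencies is doing behind the menus, the same table can be sketched in a few lines of Python. This is only an illustration outside SPSS: the function name `frequency_table` is hypothetical, and the data are rebuilt from the counts reported above rather than read from a real data file.

```python
from collections import Counter


def frequency_table(responses):
    """Return {category: (count, per cent of all valid responses)}."""
    counts = Counter(responses)
    total = sum(counts.values())
    return {
        category: (n, round(100 * n / total, 1))
        for category, n in counts.items()
    }


# Rebuild the sex variable from the counts reported in this lesson.
sex = ["male"] * 185 + ["female"] * 254
table = frequency_table(sex)

for category, (n, pct) in table.items():
    print(f"{category:<8}{n:>5}{pct:>8.1f}")
```

Comparing the counts in each row is a quick way to spot very unequal or very small subgroups before choosing an analysis.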
Continuous variables

For continuous variables (e.g., age) it is easier to
use Descriptives, which will provide you with
‘summary’ statistics such as mean, median, standard
deviation. You certainly don’t want every single value
listed, as this may involve hundreds of values. You can
collect the descriptive information on all of your
continuous variables in one go; it is not necessary to
do it variable by variable. Just transfer all the variables
you are interested in into the box labelled Variables. If
you have a lot of variables, however, your output will
be extremely long. Sometimes it is easier to do them in
chunks and tick off each group of variables as you do
them.
Procedures for Obtaining Descriptive Statistics for
Continuous Variables

1. From the menu at the top of the screen click on:
Analyze, then click on Descriptive Statistics, then
Descriptives.
2. Click on all the continuous variables that you wish
to obtain descriptive statistics for. Click on the
arrow button to move them into the Variables box
(e.g., age, total perceived stress).
3. Click on the Options button. Click on Mean,
Std. deviation, Minimum, Maximum,
Skewness, Kurtosis.
4. Click on Continue, and then OK.
Interpretation of output from Descriptives

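The Descriptives output summarizes the information we requested
for each of the variables.
Descriptives also provides some information concerning the
distribution of scores on continuous variables (skewness and
kurtosis).
Positive skewness values indicate positive skew (scores clustered
to the left at the low values). Negative skewness values indicate
a clustering of scores at the high end (right-hand side of a
graph).
Positive kurtosis values indicate heavier tails (and usually a
sharper peak) than a normal distribution; negative values indicate
a relatively flat distribution. For a perfectly normal distribution
both skewness and kurtosis would be 0.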
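
To make the skewness and kurtosis values more concrete, here is a minimal Python sketch that computes the same kind of summary. It rests on some assumptions: the function name `describe` and the two small data sets are made up for illustration, and the formulas are the usual sample-adjusted skewness and excess kurtosis (0 for a normal distribution), which is the convention we assume the Descriptives output follows.

```python
from statistics import mean, stdev


def describe(values):
    """Summary statistics for one continuous variable (needs n >= 4)."""
    n = len(values)
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    z3 = sum(((x - m) / s) ** 3 for x in values)
    z4 = sum(((x - m) / s) ** 4 for x in values)
    # Sample-adjusted skewness and excess kurtosis (assumed convention).
    skewness = n / ((n - 1) * (n - 2)) * z3
    kurtosis = (n * (n + 1) / ((n - 1) * (n - 2) * (n - 3)) * z4
                - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return {
        "n": n, "mean": m, "std": s,
        "min": min(values), "max": max(values),
        "skewness": skewness, "kurtosis": kurtosis,
    }


# Hypothetical scores: most values low, a few very high (positive skew).
low_heavy = [1, 1, 2, 2, 2, 3, 3, 4, 9, 15]
# The mirror image: most values high, a few very low (negative skew).
high_heavy = [16 - v for v in low_heavy]

right = describe(low_heavy)
left = describe(high_heavy)
print(right["skewness"] > 0, left["skewness"] < 0)
```

The first data set has its scores clustered at the low end and gives a positive skewness value; its mirror image gives the same value with a negative sign.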
Assessing normality
Many of the statistical techniques presented in
Lessons Four and Five of this module assume
‘normality’, a term used to describe a symmetrical,
bell-shaped curve which has the greatest frequency
of scores in the middle, with smaller frequencies
towards the extremes.
Procedure for assessing normality using Explore

1. From the menu at the top of the screen click on: Analyze, then
click on Descriptive Statistics, then Explore.
2. Click on the variables you are interested in (e.g., total perceived
stress). Click on the arrow button to move them into the
Dependent List box.
3. Click on any independent or grouping variables that you wish to
split your sample by (e.g., sex). Click on the arrow button to move
them into the Factor List box.
4. In the Display section make sure that Both is selected. This
displays both the plots and statistics generated.
5. Click on the Plots button. Under Descriptive click on Histogram.
Click on Normality plots with tests.
6. Click on Continue.
7. Click on the Options button. In the Missing Values section click on
Exclude cases pairwise.
8. Click on Continue and then OK.
Interpretation of output from Explore

•In the table labeled Descriptives you are provided with
descriptive statistics and other information concerning your
variables.
•Skewness and kurtosis values are also provided as part of this
output, providing information about the distribution of scores for
the two groups (see the discussion of the meaning of these values in
the previous section).
•In the table labeled Tests of Normality you are given the results
of the Kolmogorov-Smirnov statistic. A non-significant result
(Sig. value greater than .05) indicates normality.
•The actual shape of the distribution for each group can be seen
in the Histograms provided.
•The Detrended Normal Q-Q Plots displayed in the output are
obtained by plotting the actual deviation of the scores from the
straight line.
•The final plot that is provided in the output is a boxplot of the
distribution of scores for the two groups.
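
The idea behind the Detrended Normal Q-Q Plot can be sketched in Python as follows. Treat this as a rough illustration rather than a reproduction of the SPSS output: the function name `detrended_qq` and the scores are hypothetical, and the expected normal values are computed with Blom's plotting positions, which is one common convention and an assumption here.

```python
from statistics import NormalDist, mean, stdev


def detrended_qq(values):
    """Return (observed score, deviation from normal) pairs, lowest first."""
    xs = sorted(values)
    n = len(xs)
    m, s = mean(xs), stdev(xs)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for rank, x in enumerate(xs, start=1):
        p = (rank - 0.375) / (n + 0.25)          # Blom's position (assumed)
        expected = standard_normal.inv_cdf(p)    # expected z if normal
        observed = (x - m) / s                   # observed z score
        points.append((x, observed - expected))
    return points


# Hypothetical, perfectly symmetrical scores.
scores = [1, 2, 3, 4, 5, 6, 7]
points = detrended_qq(scores)
for score, deviation in points:
    print(f"{score:>3}  {deviation:+.3f}")
```

If the scores were normally distributed, the deviations would scatter closely around zero with no obvious pattern; a curve or a cluster of large deviations at one end suggests a departure from normality.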
Checking for outliers

Many of the statistical techniques are sensitive to
outliers (cases with values well above or well below
the majority of other cases). The techniques
described in the previous section can also be used
to check for outliers; however, an additional
approach is detailed below. You will recognize it
from the previous lesson, where it was used to check
for out-of-range cases.
Procedure for identifying outliers

1. From the menu at the top of the screen click on: Analyze,
then click on Descriptive Statistics, then Explore.
2. In the Display section make sure Both is selected. This
provides both Statistics and Plots.
3. Click on your variable (e.g., total perceived stress), and move
it into the Dependent List box.
4. Click on ID from your variable list and move it into the
Label Cases by box. This will give you the ID number of the
outlying case.
5. Click on the Statistics button. Click on Outliers. Click on
Continue.
6. Click on the Plots button. Click on Histogram. You can also
ask for a Stem-and-leaf plot if you wish. Click on Continue.
7. Click on the Options button. Click on Exclude cases pairwise.
Click on Continue and then OK.
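
As a cross-check outside SPSS, the boxplot-style rule for spotting outliers can be sketched in Python. Several things here are assumptions for illustration only: the function name `flag_outliers`, the ID numbers and the scores are made up, the cut-off of 1.5 box lengths is the common boxplot convention, and the quartiles come from Python's `statistics.quantiles`, which may differ slightly from the hinges SPSS uses.

```python
from statistics import quantiles


def flag_outliers(scores_by_id, k=1.5):
    """Return the IDs of cases more than k box lengths outside the box."""
    values = list(scores_by_id.values())
    q1, _, q3 = quantiles(values, n=4)  # lower quartile, median, upper quartile
    box_length = q3 - q1                # the interquartile range
    low, high = q1 - k * box_length, q3 + k * box_length
    return sorted(
        case_id for case_id, score in scores_by_id.items()
        if score < low or score > high
    )


# Hypothetical total perceived stress scores keyed by ID number.
scores = {
    101: 22, 102: 25, 103: 27, 104: 26, 105: 24,
    106: 23, 107: 28, 108: 51, 109: 25, 110: 26,
}
outlier_ids = flag_outliers(scores)
print(outlier_ids)
```

With these made-up scores only the case with ID 108 is flagged. Returning the ID rather than the score mirrors step 4 above, so you can go back to the data file and check whether the value is a data entry error or a genuine extreme score.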
