SPSS

Analysis without Anguish

Version 12.0 for Windows

First published 2005 by John Wiley & Sons Australia, Ltd, 33 Park Road, Milton, Qld 4064. Offices also in Sydney and Melbourne.

© S. Coakes, L. Steed 2005

National Library of Australia Cataloguing-in-Publication data

Coakes, Sheridan J.
SPSS: analysis without anguish: version 12.0 for Windows.

For tertiary students.
ISBN 0 470 80736 9.

1. SPSS for Windows. 2. Social sciences - Statistical methods - Computer programs. I. Title.

005.369

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of the publisher.

Edited by David Rule
Cover image: © Digital Vision
Printed in Singapore by CMO Image Printing Enterprise

10 9 8 7 6 5 4 3 2 1

SPSS version 12.0 for Windows is a registered trademark of SPSS, Inc. SPSS: Analysis without Anguish is not approved, sponsored or connected with SPSS, Inc. Information about SPSS and other software products is available from SPSS, Inc., www.spss.com

Contents

Preface
At-a-glance text examples

CHAPTER 1 Introduction to SPSS
• Getting started
• The SPSS environment: Data Editor; Viewer and Draft Viewer; Pivot Table Editor; Chart Editor; Text Output Editor; Syntax Editor; Script Editor
• Toolbar
• Menus: File; Edit; View; Data; Transform; Analyze; Graphs; Utilities; Window; Help
• Dialogue boxes for statistical procedures: Source variable list; Selected variable(s) list; Command pushbuttons; Accessing sub-dialogue boxes
• Check boxes, radio buttons and drop-down lists: Check boxes; Radio buttons; Drop-down lists
• Saving files and ending a session

CHAPTER 2 Preparation of data files
• Working example
• Defining variables: Variable labels; Value labels; Missing values; Variable type; Column format; Measurement level
• Applying variable definition attributes to other variables
• Entering data
• Inserting and deleting cases and variables
• Moving variables
• Saving data files
• Opening an existing data file
• Practice example

CHAPTER 3 Data screening and transformation
• Working example
• Errors in data entry
• Assessing normality: Histograms; Stem-and-leaf plots and boxplots; Normal probability plots and detrended normal plots; Kolmogorov-Smirnov and Shapiro-Wilks statistics; Skewness and kurtosis; Assessing normality by group
• Variable transformation
• Data transformation: Recode; Compute
• Data selection
• Practice example
• Solutions: Syntax; Output

CHAPTER 4 Descriptive statistics
• Frequency distributions
• Measures of central tendency and variability
• Working example
• Descriptives command
• Practice example
• Solutions: Syntax; Output

CHAPTER 5 Correlation
• Assumption testing
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 6 T-tests
• Assumption testing
• Working example: The one-sample t-test; T-tests with more than one sample (Repeated measures t-test; Independent groups t-test)
• Practice example
• Solutions: Syntax; Output

CHAPTER 7 One-way between-groups ANOVA with post-hoc comparisons
• Assumption testing
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 8 One-way between-groups ANOVA with planned comparisons
• Assumption testing
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 9 Two-way between-groups ANOVA
• Assumption testing
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 10 One-way repeated measures ANOVA
• Assumption testing
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 11 Two-way repeated measures ANOVA
• Assumption testing
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 12 Trend analysis
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 13 Mixed/split plot design (SPANOVA)
• Assumption testing
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 14 One-way analysis of covariance (ANCOVA)
• Assumption testing
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 15 Reliability analysis
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 16 Factor analysis
• Assumption testing
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 17 Multiple regression
• Assumption testing
• Working example
• Practice example
• Solutions: Syntax; Output

CHAPTER 18 Multivariate analysis of variance (MANOVA)
• Assumption testing
• Working example
• Data screening
• Practice example
• Solutions: Syntax; Output

CHAPTER 19 Nonparametric techniques
• Chi-square tests: chi-square test for goodness of fit; chi-square test for relatedness or independence
• Mann-Whitney U test (Wilcoxon rank sum W test)
• Wilcoxon signed-rank test
• Kruskal-Wallis test
• Friedman test
• Spearman's rank order correlation
• Practice examples 1 to 7
• Solutions

CHAPTER 20 Multiple response and multiple dichotomy analysis
• Multiple response analysis: Working example
• Multiple dichotomy analysis: Working example
• Practice examples 1 and 2
• Solutions

CHAPTER 21 Multidimensional scaling
• Working example
• Practice example

CHAPTER 22 Working with output
• Editing output in the SPSS Viewer
• Modifying and improving charts for presentation
• Viewer and Draft Viewer output

Appendix

Preface

SPSS is a sophisticated piece of software used by social scientists and related professionals for statistical analysis. This workbook is based on version 12.0, which is compatible with the Windows environment.

There are several new features in SPSS 12.0 relating to improved charting, new data management, statistical enhancements and, for those who work with it, a new add-on module for working with sample survey data. Some of these new features are outlined below.

Improved Charting Features
• Better default chart appearance
• Support for long text strings and automatic text wrapping
• Control of default scale ranges using chart templates
• 3-D effects for pie and bar charts
• Improved choice of colour patterns.

New Data Management Features
• Longer variable names - now a maximum of 64 bytes, compared with eight in previous versions
• Visual bander - assists you in banding scale data into categorical ranges (e.g. age in 10-year ranges)
• Duplicate record finder - helps you identify, flag, report and filter duplicate records with the new Duplicate Cases feature
• Output management system - allows you to turn output into input with the new OMS command by automatically writing selected categories of output to different output files in different formats: HTML, XML, text and SPSS-format data files
• Command syntax to delete variables - the new Delete Variables command makes it easy to delete variables you do not need anymore, such as temporary variables used in transformations.

Statistical Enhancements
• New options for handling weighted data in the Crosstabs procedures
• New stepwise function in Multinomial Logistic Regression.

This edition of SPSS: Analysis without Anguish continues the trend of previous editions in providing a practical text intended as an introduction to SPSS and a guide for users who wish to conduct analytical procedures. We've also tried to improve the text from the feedback we have obtained from our readers. In this edition, we have included reporting at the end of each relevant chapter to illustrate how significant results from each of the particular tests should be reported. Many readers also requested that we bring back our chapters on Cluster Analysis, Multidimensional Scaling and Discriminant Analysis. Thanks to the contribution of Dr Mark Fenton from James Cook University in Townsville, this edition includes an additional chapter on Multidimensional Scaling. However, in subsequent editions we hope to expand on some of the other techniques offered in SPSS. Again, we've tried to keep the text clear and simple - no one text will satisfy all our readers.

Chapters 1 to 3 are generic chapters that introduce the software, detail how to prepare data files and outline data screening methods. Chapters 4 to 20 outline specific procedures within the package. In each of these chapters, assumption testing for each procedure is discussed. At the beginning of each chapter, a working example is presented, and the procedure is approached simply and systematically. At the end of each chapter, a practice example is given to help users to consolidate the skills they have learned. Solutions to these practice examples are also provided to clarify the analytical procedure and to facilitate interpretation of the SPSS output. As highlighted above, chapter 21 considers the application of Multidimensional Scaling and, as with other editions, chapter 22 concludes in dealing specifically with output, outlining ways in which presentation can be improved.

One major advantage of the workbook is that the text is accompanied by data files, which are available from the website that accompanies this title (www.johnwiley.com.au/highered/spssV12), so that the user can progressively work through each procedure with the text. Although the workbook outlines each statistical procedure, this is not a statistical text and a degree of statistical knowledge is assumed.

The concept of the workbook arose from our collective experience of teaching and applying research methods. It evolved from a recognised need to make research methodology more accessible and understandable to students who are undertaking research methods courses and to professionals who are taking part in research in an applied context. Having worked through the book, you will be well on your way to effective research, from coding, entering and exploring data, and undertaking appropriate analysis, to creating meaningful data output.

We still receive very positive feedback from you, the users of the book, who tell us that we have helped alleviate some of the 'anguish' associated with the analysis of research data. We trust that this edition continues to help with this. A very big thank you must also go to Dr Mark Fenton, associate professor, James Cook University, Townsville, Queensland, for his chapter contributions and feedback, to Peta Dzidic for her assistance in preparation, and to you, our readers, for your continued support. We wish you well in your research endeavours.

Dr Sheridan Coakes, Director, Coakes Consulting, PO Box 1027, Kalamunda WA 6926
Dr Lyndall Steed, Senior Lecturer, School of Psychology, Curtin University of Technology, Perth WA 6001
September 2004

At-a-glance text examples

Working examples
Work 2      Individual's shopping behaviour
Work 3      Community resident attitudes towards physical exercise
Work 4      Gender differences in tennis performance
Work 5      Relationship between intelligence (IQ) and tertiary entrance examination scores (TEE) among potential university students
Work 6      Comparisons of car engine efficiency
Work 7      Comparisons of utility cost by city
Work 8      Efficacy of weight-reduction programs
Work 9      Toy store sales by store type and location
Work 10     Practice effects on anagram completion times
Work 11     Manager productivity across store type and location
Work 12     Influence of age on reaction time
Work 13     Efficacy of new treatment program for depression
Work 14     Gender differences in sales representatives
Work 15     Internal consistency of a help-seeking instrument
Work 16     Attitudes towards organ donation
Work 17     Effect of shelf space and price on the sale of pet food
Work 18     Influence of attitudes, feelings and exposure on organ donation behaviour
Work 19a    Attitudes towards US military bases in Australia
Work 19b    Reader publication preferences and geographic location
Work 19c    Comparison of productivity across factories
Work 19d    Comparison of factory productivity by time period
Work 19e    Evaluation of different sales training programs
Work 19f    Reaction times across drug conditions
Work 19g    Relationship between sales performance and employee income
Work 20a    Criteria used in selection of rams for breeding
Work 20b    Native trees growing in gardens
Work 21     Tourist perceptions of distance between Australian cities

Practice examples
Practice 2      Attitudes towards new school opening
Practice 3      Adolescent attitudes towards their future
Practice 4      Differences in white goods sales across junior and senior sales staff
Practice 5      Relationship between smoking and lung damage
Practice 6      Hypnosis and memory recall
Practice 7      Comparison of nutrient value of varying food supplements in rats
Practice 8      Comparison of weight gain in rats across varying food supplements
Practice 9      Influence of density of traffic and type of intersection on number of road accidents
Practice 10     Differences in house sales according to suburb and time of year
Practice 11     Influence of colour and background on visual aesthetics
Practice 12     Effect of caffeine on motor task performance
Practice 13     Influence of auditorium size and sound-proofing on quality of acoustics
Practice 14     Influence of peer tutoring on level of computer anxiety
Practice 15     Reliability of independent personality scales: hope, optimism, locus of control and self-esteem
Practice 16     Factor structure of independent personality scales: hope, optimism, locus of control and self-esteem
Practice 17     Relationship between the volume of wood and forest characteristics (number of trees, average age of trees, average volume of trees)
Practice 18     Gender differences across personality measures
Practice 19a    Customer preference for specialty coffee bags
Practice 19b    Gender differences in drink preference
Practice 19c    Relationship between personality types and religious affiliation
Practice 19d    Effect of temperature on pilot performance
Practice 19e    Number of injuries across different sports
Practice 19f    Companionability of varying dog breeds
Practice 19g    Effect of load on truck fuel consumption
Practice 20a    Reasons for getting married
Practice 20b    Factors influencing purchase of a used car
Practice 21     Tourist perceptions of similarity among distance between nine cities

Note: there are no working or practice examples for chapters 1 and 22.

CHAPTER 1
Introduction to SPSS

This chapter provides an introduction to using SPSS version 12.0 for Windows. It addresses aspects of the SPSS environment, describes the menu options and toolbars, and provides instructions on how to begin and end an SPSS session.

Getting started

If this is one of your first experiences with the SPSS package, do not be put off. Before long, you will be able to manoeuvre around the package with ease and carry out all kinds of analytical procedures. Now, it's time to familiarise ourselves with the SPSS program and its attributes.

When SPSS is initially installed, the SPSS program group is created in the Programs menu. To start an SPSS session, double-click on the SPSS icon.

The SPSS environment

SPSS for Windows provides a powerful statistical analysis and data management system in a graphical environment, using descriptive menus and simple dialogue boxes to do most of the work for you. Most tasks can be accomplished simply by pointing and clicking the mouse. In addition to the simple point-and-click interface for statistical analysis, SPSS has eight different types of windows:
• Data Editor
• Viewer
• Draft Viewer
• Pivot Table Editor
• Chart Editor
• Text Output Editor
• Syntax Editor
• Script Editor

Data Editor

The Data Editor is a versatile spreadsheet-like system for defining, entering, editing and displaying data. This window opens when you start an SPSS session and displays the contents of a data file. In this window you can create new data files or modify existing data files. As outlined, the Data Editor is like a spreadsheet where cases are represented in rows and variables are represented in columns.
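The row-and-column layout of the Data Editor can also be pictured outside SPSS. The following is a minimal Python sketch, not SPSS code; the variable names (`id`, `age`, `sex`) and the values are invented purely for illustration.

```python
# Hypothetical data file: each inner list is a case (row),
# each position within it is a variable (column).
variables = ["id", "age", "sex"]
cases = [
    [1, 34, "f"],
    [2, 27, "m"],
    [3, 45, "f"],
]


def column(name):
    """Return every case's value for one variable (one column)."""
    index = variables.index(name)
    return [case[index] for case in cases]


def case_record(row_number):
    """Return one case (one row) as a variable-name -> value mapping."""
    return dict(zip(variables, cases[row_number]))


print(column("age"))    # all values of the variable 'age'
print(case_record(0))   # the first case
```

Reading down a column gives all the values of one variable; reading across a row gives everything recorded for one case.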

rearrange rows.g. The right pane of the window is the contents pane. add colour. tables and charts and allows you to edit the output and save it in an output file for later use. change the display. Text Output Editor and Chart Editor and to move between SPSS and other applications. Pivot Table Editor Output displayed in pivot tables can be modified in a number of ways. The window is divided into two panes. The width of the outline pane can also be changed by clicking on the right-hand border and dragging it to the desired width. The left pane. e. selectively show and hide output. create multidimensional tables and selectively hide and display results. which contains statistical tables. If you select an item in the outline pane. Using this editor. order results and move presentation-quality tables and charts between SPSS and other applications. You move into the Pivot Table Editor from the Viewer by selecting the table you want to edit and clicking the right mouse button to open the SPSS Pivot Table Object. This window opens automatically the first time you run a procedure that generates some output. the corresponding item in the contents pane is highlighted.Viewer and Draft Viewer The Viewer makes it easy to browse your results. Word. charts and text output. contains an outline view of the output contents and can be used to navigate through your output and control the output display . columns and layers. This window also allows you to access the Pivot Table Editor. Moving an item in the outline pane moves the corresponding item in the contents pane. referred to as the outline pane. The window displays all statistical results.very handy if you have a lot of output. it is possible to edit text. CHAPTER 1 • Introduction to SPSS • . You can also display output as simple text (instead of in interactive pivot tables) in the Draft Viewer.

This will reveal the Pivot Table. formatting toolbar and pivot trays and allow editing of tables to begin. a SPSS: Analysis without Anguish .

Chart Editor

High-resolution, full-colour pie charts, bar charts, histograms, scatterplots, 3-D graphics and more are included as standard features in SPSS. These can be edited in the Chart Editor. Changes in colour, font, axes, rotations and chart types can also be made using the Chart Editor.

You move into the Chart Editor from the Viewer by selecting the chart and clicking the right mouse button to open the SPSS Chart Object. This will reveal the Chart Editor and allow editing of charts to begin.

Text Output Editor

Text in the Viewer that is not displayed in pivot tables or charts can also be modified using the Text Output Editor. Possible modifications include changes to font characteristics such as colour, type, style and size.

Syntax Editor

Although most tasks can be accomplished by simply pointing and clicking, SPSS also provides a powerful command language, which allows you to save and automate many common tasks. The command language also provides some functionality not found in the menus and dialogue boxes.

As you undertake particular procedures in SPSS, you can paste your dialogue box choices into a syntax window. This window contains the command syntax that reflects the choices you have made in selecting menu options. These commands can then be edited to include special features not available through the pull-down menus and dialogue boxes, and they can be saved in a file for further use.

At the top of the syntax window there is a menu titled Run. This menu allows you to process the commands you have pasted. You choose whether to run all the commands currently in the syntax window, a selection of commands, the current command or just to the end of the command syntax.
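The four choices offered by the Run menu can be illustrated with a small Python sketch. It is a model of the idea only, not how SPSS itself works: the function name `commands_to_run`, the option labels and the file name `survey.sav` are assumptions made for this example, and the three commands stand in for whatever has been pasted into the syntax window.

```python
# Hypothetical contents of a syntax window; each SPSS command ends with a period.
syntax = [
    "GET FILE='survey.sav'.",
    "FREQUENCIES VARIABLES=age.",
    "DESCRIPTIVES VARIABLES=age.",
]


def commands_to_run(option, cursor=0, selection=None):
    """Return the commands a given Run option would process.

    option    -- 'all', 'selection', 'current' or 'to_end' (labels invented here)
    cursor    -- index of the command in which the cursor sits
    selection -- (first, last) indices of the highlighted commands
    """
    if option == "all":
        return syntax[:]
    if option == "selection":
        first, last = selection
        return syntax[first:last + 1]
    if option == "current":
        return [syntax[cursor]]
    if option == "to_end":
        return syntax[cursor:]
    raise ValueError(f"unknown Run option: {option}")


# With the cursor in the second command, 'to_end' skips the GET FILE command.
for command in commands_to_run("to_end", cursor=1):
    print(command)
```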

If you have decided to paste your syntax in this manner, you will need to use the Run menu to obtain your output and move into the Viewer window.

Script Editor

Scripting and OLE automation allow you to customise and automate many tasks in SPSS. With the Script Editor, you are able to create and modify basic scripts within the program.

At the top of each window you will have noticed a menu and an icon bar, which provide quick and easy access to the special features available in each of the SPSS windows. Let's learn more about these particular features.

Toolbar

The toolbar, located just below the menu bar, provides quick, easy access to many frequently used features. The toolbar contains tools that are available when a particular type of window is active. Each window has its own toolbar. When you put the mouse cursor on a tool in an active window, a brief description of that tool is displayed.

Tools in the Data Editor include:
• File Open allows particular data files to be opened for analysis.
• File Save saves the file in the active window.
• File Print prints the file in the active window.
• Dialogue Recall displays a list of recently opened dialogue boxes, any of which can be selected by clicking on their name.
• Go to Case allows the typing in of the number of the case you want to go to and it will be found in the data file.
• Variables provides data definition information for all variables in the working data file.

• Find allows data to be found easily within the Data Editor.
• Insert Cases inserts a case above the case containing the active cell.
• Insert Variable inserts a variable to the left of the variable containing the active cell.
• Split File splits the data file into separate groups for analysis based on the values of one or more grouping variables.
• Weight Cases gives cases different weights for statistical analysis.
• Select Cases provides several methods for selecting a subgroup of cases based on criteria that include variables and complex expressions.
• Value Labels allows toggling between actual values and value labels in the Data Editor.
• Use Sets allows the selection of sets of variables to be displayed in the dialogue boxes.

In the Syntax Editor window the tools are similar to those in the Data Editor, with a few additions:
• Run Current runs commands that are selected or, if there is no selection, the command in which the cursor appears.
• Syntax Help assists you with the syntax for the analysis you are undertaking.

Some of the tools in the Viewer include:
• Print Preview allows what will be printed to be viewed.
• Export Output saves pivot tables and text output in HTML, text, Word/RTF and Excel format, and it saves charts in a variety of common formats used by other applications.
• Go to Data moves directly into the data file and makes the Data Editor window active.
• Show/Hide allows output to be shown or hidden.
• Insert Heading, Title, Text allows headings, titles and text to be added into your output.

The Chart Editor has a range of tools that can be used to make your charts more interpretable and attractive, including:
• Show Properties Window shows the properties of the chart.
• Insert a Text Box allows the insertion of a text box.
• Show/Hide Legend allows the legend on a chart to be shown or hidden.
• Show Data Labels allows the data labels to be shown on the chart.
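To make the Split File, Select Cases and Weight Cases tools more concrete, here is a rough Python sketch of what each one does to a data file. It is an illustration under stated assumptions rather than SPSS's own implementation: the variable names (`sex`, `age`, `weight`), the three cases and the function names are all invented for this example.

```python
from collections import defaultdict

# Hypothetical data file: one dictionary per case.
cases = [
    {"sex": "f", "age": 34, "weight": 1.0},
    {"sex": "m", "age": 27, "weight": 2.0},
    {"sex": "f", "age": 45, "weight": 1.0},
]


def split_file(rows, grouping_variable):
    """Split the cases into separate groups by the value of a grouping variable."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[grouping_variable]].append(row)
    return dict(groups)


def select_cases(rows, condition):
    """Keep only the cases that satisfy a condition."""
    return [row for row in rows if condition(row)]


def weighted_mean(rows, variable, weight_variable):
    """Mean of a variable when each case counts according to its weight."""
    total_weight = sum(row[weight_variable] for row in rows)
    weighted_sum = sum(row[variable] * row[weight_variable] for row in rows)
    return weighted_sum / total_weight


by_sex = split_file(cases, "sex")
over_30 = select_cases(cases, lambda row: row["age"] > 30)
print(len(by_sex["f"]), len(over_30))
print(weighted_mean(cases, "age", "weight"))  # the second case counts twice
```

Splitting leaves every case in the file but analyses each group separately, whereas selecting removes cases from the analysis altogether.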

Menus

SPSS is menu driven and has a variety of pull-down menus available for the user. The main menu bar in the Data Editor contains ten menus:
• File
• Edit
• View
• Data
• Transform
• Analyze
• Graphs
• Utilities
• Window
• Help.

The Analyze and Graphs menus are available in all windows, making it much easier to generate new output without having to switch between windows.

File

The File menu allows you to create new files, open existing files, read in files from other software programs, save files and print.

Edit

The Edit menu allows you to modify or copy text from the output or syntax windows, and to search for and replace text or data. It also offers a number of personal preference options.

View

The View menu allows you to make the status bar and toolbar active, and to change particular characteristics of the window (for example, by removing grid lines, displaying value labels and changing font style and size).

Data

The Data menu allows you to define variables and create variable templates. In addition, more global changes to SPSS data files are available, such as merging files, inserting, sorting and transposing variables and cases, and selecting and weighting cases.

Transform

The Transform menu allows you to change certain variables in the data file using commands such as Recode and Rank Cases, as well as to create new variables using the Compute command.
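The difference between recoding an existing variable and computing a new one can be sketched in a few lines of Python. This is only an analogy to the Recode and Compute commands, not SPSS syntax; the item names, the 1-5 reverse-scoring scheme and the function names are assumptions chosen for illustration.

```python
# Hypothetical data: two questionnaire items scored 1-5.
cases = [
    {"item1": 1, "item2": 5},
    {"item1": 4, "item2": 2},
]

# Recode: replace old codes with new ones (here, reverse-scoring a 1-5 item).
reverse_score = {1: 5, 2: 4, 3: 3, 4: 2, 5: 1}


def recode(rows, variable, mapping):
    """Return copies of the rows with one variable's values recoded."""
    return [{**row, variable: mapping[row[variable]]} for row in rows]


def compute(rows, new_variable, expression):
    """Return copies of the rows with a new variable added."""
    return [{**row, new_variable: expression(row)} for row in rows]


recoded = recode(cases, "item2", reverse_score)
scored = compute(recoded, "total", lambda row: row["item1"] + row["item2"])
print(scored)
```

Recoding changes the values of a variable that already exists, while computing adds a further variable built from the others.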

Analyze

The Analyze menu allows you to select the analysis you require. A variety of statistical procedures are available, ranging from summarising data through to more complex designs.

Graphs

The Graphs menu allows you to create bar, line, area and pie charts, as well as histograms and scatterplots.

Utilities

The Utilities menu allows you to display file and variable information. In addition, it allows you to define and use different variable sets.

Window

The Window menu allows you to arrange, select and control the attributes of the various windows. Using this menu, you can move efficiently between data, syntax, output and chart windows.

Help

The Help menu allows you to access information on how to use the many features of SPSS. The SPSS tutorial can be accessed through the Help menu.

Dialogue boxes for statistical procedures

When you choose a statistical procedure, a dialogue box appears on the screen. Each main dialogue box has four basic components: the source variable list, the target variable(s) list, command pushbuttons and the option to choose sub-dialogue boxes.

Source variable list

The source variable list is a list of all the variables in the data file.

To select a single variable
Highlight the variable and click on the arrow button next to the selected variable list box.

To select multiple variables that are grouped together in the variable list
Click on the first variable you wish to select, then hold the Shift key and click on the last variable in the group. Then click on the arrow button.

To select multiple variables that are not grouped together
Click on the first variable, then hold the Ctrl key and click on the next variable, and so on. Then click on the arrow button.

Selected variable(s) list

The selected variable(s) list is a list or lists of variables you have chosen for certain analyses. For certain statistical procedures, both dependent and independent variable lists are created. If you wish to remove variables from this list, highlight the variable and click on the arrow button, which you will notice is now reversed. This process is called deselection of variables.

Command pushbuttons

The five standard command pushbuttons in most dialogue boxes are:
• OK: Runs the procedure.
• Paste: Pastes the syntax associated with a procedure into the Syntax Editor window. This command syntax can then be modified if necessary or new syntax can be added.
• Reset: Deselects the variables in the selected variable list(s) and resets all specifications in the dialogue and sub-dialogue boxes.
• Cancel: Cancels any changes in the dialogue box settings since the last time it was opened, then closes the dialogue box.
• Help: Allows access to a Help window relevant to the current procedure.

Accessing sub-dialogue boxes

When selecting a statistical procedure, you can make additional specifications that are available in the sub-dialogue boxes. These are accessed by selecting the buttons at the bottom or on the side of the main dialogue box. These buttons may include: Statistics..., Plots..., Save..., Cells..., Options..., etc.

As outlined in the screen dump below, the three command pushbuttons within sub-dialogue boxes are:
• Continue: Saves any changes to the settings and returns to the main dialogue box.
• Cancel: Ignores any changes, restores the previous settings and returns to the main dialogue box.
• Help: Allows access to a Help window relevant to the current procedure.

**Check boxes, radio buttons and drop-down lists**

Within sub-dialogue boxes, choices can be made using check boxes, radio buttons and drop-down lists.

Check boxes
Check boxes allow you to select certain options within sub-dialogue boxes. When you click on the box, a tick is displayed in the check box. To deselect this option, click the box again. Multiple check boxes can be selected if required.

Radio buttons
Radio buttons allow you to make single selections within sub-dialogue boxes. When you click on the radio button, a solid black circle appears in the centre of the button.

Drop-down lists
Drop-down menus or lists allow you to make single selections from a list of alternatives.

**Saving files and ending a session**

To save files in SPSS, the Save As... command is selected from the File menu. If you are working in the Data Editor window, then the extension .sav will be displayed in the Save as type: box and you will be prompted to type in a filename in the File name: box. If you are in the Viewer window, then the file extension .spo will be displayed. And if you are in the Syntax Editor window, then the extension .sps will appear.

To end a session, select the File menu and the Exit option. If you exit a session without first saving your files, then SPSS will prompt you to save the contents of each window.

CHAPTER 2

**Preparation of data files**

This chapter describes the process involved from data source to data file: that is, the conversion of raw source material to a useable data file. It focuses on defining variables, assigning appropriate numeric codes to alphanumeric data and dealing with missing data. Other procedures such as applying variable definition attributes to other variables, entering data, inserting and deleting cases and variables, saving data and opening existing data files will also be addressed.

**Working example**

You have developed a questionnaire that asks a number of questions relating to an individual's shopping behaviour. The variables you have measured include: gender, age, desire for 24-hour shopping, choice of shopping area and amount spent on groceries per week. For each case you have assigned a participant identification number.

**Defining variables**

The process of defining variables has seven optional steps. The primary step is naming your variable and the other steps cover labels (variable and value), missing values, variable type, column format and measurement level. These preparatory steps are desirable before data entry can begin.

**Naming a variable**

Variable names must comply with certain rules:
• Must begin with a letter. The remaining characters can be a letter, any digit, a full stop or the symbols @, #, _ or $.
• Cannot end with a full stop or underscore.
• Blanks and special characters cannot be used (for example, !, ?, " and *).
• Must be unique, i.e. duplication is not allowed.
• The length of the name cannot exceed 64 bytes (typically 64 characters in single-byte languages, e.g. English, French, German, Spanish etc.).
• Reserved keywords cannot be used as variable names, e.g. ALL, AND, BY, EQ, GE, GT, LE, LT, NE, NOT, OR, TO, WITH.
• Names are not case sensitive, i.e. can be written in upper or lower case.
• Long variable names need to wrap to multiple lines in the output, so SPSS attempts to break lines at underscores, periods, and a change from lower to upper case.
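The naming rules above can be checked mechanically before you start typing names into the Variable View. The following is a minimal Python sketch written outside SPSS; the function name `is_valid_name` and the restriction to unaccented letters are assumptions made for illustration, and the reserved-word list contains only the keywords given in the text.

```python
import re

# Reserved keywords listed in the naming rules above.
RESERVED = {"ALL", "AND", "BY", "EQ", "GE", "GT", "LE", "LT",
            "NE", "NOT", "OR", "TO", "WITH"}

# First character a letter; remaining characters letters, digits,
# a full stop, or one of @ # _ $. (Assumption: unaccented letters only.)
NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9.@#_$]*")


def is_valid_name(name, existing=()):
    """Return True if `name` satisfies the naming rules listed above.

    `existing` holds the names already defined in the data file.
    """
    if not NAME_PATTERN.fullmatch(name):
        return False                      # bad first character, blank or special character
    if name[-1] in "._":
        return False                      # cannot end with a full stop or underscore
    if len(name.encode("utf-8")) > 64:
        return False                      # longer than 64 bytes
    if name.upper() in RESERVED:
        return False                      # reserved keyword
    # Names are not case sensitive, so compare in a single case.
    taken = {other.upper() for other in existing}
    return name.upper() not in taken      # must be unique
```

For the working example, `is_valid_name("allday")` is accepted, while `"shop area"` (contains a blank), `"cost_"` (ends with an underscore) and `"by"` (reserved keyword) are rejected.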

In the case of the variable age, the variable name can be age because this name complies with all of the rules listed. The variable choice of shopping area can be labelled area, and the other variables on the questionnaire could be id, gender, allday and cost.

**Variable labels**

The variable label is the full description of the variable name and is an optional means of improving the interpretability of the output. For example, the first variable you will name in the data file is id and the label for this variable is 'participant identification number'. Suggested variable labels for the other variables appear in the following table.

| Variable name | Label |
|---|---|
| gender | optional |
| age | optional |
| allday | desire for 24-hour shopping facilities |
| area | choice of shopping area |
| cost | amount spent on groceries per week |

You will notice that the gender, age and choice of shopping area variables do not require variable labels because the variable names are self-explanatory.

**Value labels**

It is possible to use alphanumeric codes for the variables; however, you may also wish to use a numeric code. For example, for gender you could assign a code of 1 for female and 2 for male. This type of variable is categorical because it has discrete categories. The variables allday and area are also categorical. When variables are measured using interval or ratio scales, then coding is not relevant unless categorisation is required. Value codes and labels for the above variables are illustrated in the following table.

| Variable name | Label |
|---|---|
| id | not applicable |
| gender | 1 = female; 2 = male |
| age | not applicable |
| allday | 1 = would use 24-hour shopping; 2 = would not use 24-hour shopping |
| area | 1 = shop in suburb where living; 2 = travel to next suburb; 3 = travel further to shop |
| cost | not applicable |

**Missing values**

It is rare to obtain complete data sets for all cases. When dealing with missing data you may leave the cell blank or assign missing value codes. If you choose the latter, then a number of rules apply:
• Missing value codes must be of the same data type as the data they represent. For example, for missing numeric data, missing value codes must also be numeric.
• Missing value codes cannot occur as data in the data set.
• By convention, the choice of digit is usually 9.
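To see how value codes, value labels and a missing value code work together, the coding scheme above can be written out as a small lookup. This is an illustrative Python sketch, not part of SPSS; the names `VALUE_LABELS`, `MISSING` and `decode` are hypothetical, and the code 9 follows the convention mentioned in the rules.

```python
# Missing value code, following the convention described above.
MISSING = 9

# Value labels for the categorical variables in the working example.
VALUE_LABELS = {
    "gender": {1: "female", 2: "male"},
    "allday": {1: "would use 24-hour shopping",
               2: "would not use 24-hour shopping"},
    "area": {1: "shop in suburb where living",
             2: "travel to next suburb",
             3: "travel further to shop"},
}


def decode(variable, code):
    """Translate a numeric code into its label.

    Returns None for the missing value code, and returns the value
    unchanged for variables that have no value labels (id, age, cost).
    """
    labels = VALUE_LABELS.get(variable)
    if labels is None:
        return code
    if code == MISSING:
        return None
    if code not in labels:
        raise ValueError(f"{code!r} is not a defined code for {variable}")
    return labels[code]


# The missing value code must not also be a valid data code.
assert all(MISSING not in labels for labels in VALUE_LABELS.values())
```

Here `decode("gender", 1)` gives `"female"`, `decode("area", 9)` gives `None`, and an undefined code such as `decode("gender", 3)` is flagged as an error in the same way that an out-of-range value would need correcting in the data file.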

**Variable type**

By default, SPSS assumes that all new variables are numeric with two decimal places. However, it is possible to select other variable types (such as date, currency, string) and vary the number of decimal places.

**Column format**

It is possible to adjust the width of the Data Editor columns or change the alignment of data in the column (left, centre or right).

**Measurement level**

You can specify the level of measurement as scale (interval or ratio), ordinal or nominal.

To define a variable
1 Working in the Untitled - SPSS Data Editor window, double-click a variable name at the top of the column in the Data View or click the Variable View tab.
2 In the first blank cell of the Name column, type the first variable name (i.e. id) and press Enter. If you return to the Data View by clicking the Data View tab, you will notice that the variable name id has appeared in the first column as a heading.
3 In the first blank cell of the Label column, type the label for the variable, i.e. Identification Number. You will notice that the width of the column automatically increases to accommodate the long label.

For the variable id, there are no value labels or missing values, and the other properties are appropriate so you can move on to the second variable.

1 Working in the Variable View, in the second blank cell of the Name column, type the second variable name, i.e. gender, and press Enter. Again, SPSS automatically supplies other properties such as type, width, values, etc. Since gender requires no further explanation a label will not be typed in. However, values are assigned.
2 Click on the second cell of the Values column and then on the button on the right to open the Value Labels box.
3 In the Value: box, type the first value code for the variable (i.e. 1) then tab.
4 In the Value Label: box, type the label for this value, i.e. female. Click on Add. You will notice that the value and its label have moved into the box below. Repeat this process for the second value, so that the box shows 1.00 = "female" and 2.00 = "male".
5 Click on OK. You will notice that there is now information in the cell.

To create a missing value code
1 Click on the second cell of the Missing column and then on the button on the right to open the Missing Values dialogue box.
2 Select the Discrete missing values radio button.
3 In the first box, type the missing value code, i.e. 9.
4 Click on OK. You will notice that the missing value has been recorded.

As highlighted earlier in this chapter, other options such as Type, Column Width and Measurement Level are also available if you are dealing with different types of variable that require special conditions. For the variable gender, you may wish to select the nominal scale of measurement. The previous process is then repeated for each variable you wish to define in your data file.

**Applying variable definition attributes to other variables**

Once you have nominated variable definition attributes for a variable you can copy one or more attributes and apply them to one or more variables. Basic copy and paste operations are used to apply variable definition attributes. For example, you may have several variables with a Likert scale response format. That is, you may have several variables that use the same response scale where 1 = strongly disagree, 2 = disagree, 3 = neutral, 4 = agree and 5 = strongly agree. Having defined these value labels for one variable you can then copy them to other variables.

To apply variable definition attributes to other variables
1 In the Variable View, select the attribute cell(s) you want to apply to other variables.
2 From the Edit menu click on Copy.
3 Select the attribute cell(s) to which you want to apply the attribute(s). You can select multiple target variables.
4 From the Edit menu click on Paste. If you copy attributes to blank rows, then new variables are created with default attributes for all but the selected attributes.

**Entering data**

To enter the following two cases
gender 1 male female age 27 34 allday 1 area 1 3 cost 4 2 1 2 7
1 In the Data View click on the first cell in the Untitled - SPSS Data Editor window. You will notice that a heavy border appears around the cell, indicating that the cell is now active.
2 Type in the first value for id, i.e. 1. This value is displayed in the cell editor at the top of the Data Editor window and also in the cell itself.
3 Press Enter or move to another cell by using the arrow keys or mouse. Data values are not recorded until you press Enter or select another cell.
4 Enter the data for the first two cases. Remember that it is more efficient to code gender numerically.

To move around your data file quickly, you can hold down the Control key with an arrow key to take you to the limit of the file in that direction.

**Inserting and deleting cases and variables**

Often you may need to insert or delete extra cases (rows) and variables (columns) in the existing data file. You can achieve this by using the menus as described below or by using the appropriate tools from the toolbar.

To insert a new case between existing cases
1 Select any cell in the case (row) below the position where you want to insert the new case.
2 Select the Data menu and click on Insert Case or click on the Insert Case tool. A new case (row) will be inserted.

To insert a new variable between existing variables
1 Select any cell in the variable (column) to the right of the position where you want to insert a new variable.
2 Select the Data menu and click on Insert Variable or click on the Insert Variable tool. A new variable (column) will be inserted.

To delete a case
1 Click on the case number on the left side of the row if you wish to delete the entire case or select any cell in the row that you wish to delete.
2 Select the Edit menu and click on Clear. Alternatively, you can use the Delete button on the keyboard.

To delete a variable (column)
1 Click on the variable name at the top of the column if you wish to delete the entire variable, or select any cell within the column that you wish to remove.
2 Select the Edit menu and click on Clear. Again, you can use the Delete button if you wish.

**Moving variables**

You may wish to change the sequence of variables in the Data Editor window. If you want to position the variable between two existing variables, then insert a new variable in the position where you want to move the variable.

To move a variable
1 For the variable you want to move, click the variable name at the top of the column in the Data View or the row number in the Variable View. The entire variable is highlighted.
2 Select the Edit menu and click on Cut.
3 Click the variable name in the Data View or the row number in the Variable View where you want to move the variable to. The entire variable is highlighted.
4 Select the Edit menu and click on Paste.

**Saving data files**

To save a data file for the first time
1 First ensure that you are in the Data Editor window.
2 Click on the File Save tool, or select the File menu and click on Save As... to open the Save Data As dialogue box. You will be asked to give the file a name.
3 In the box for File Name: type in the file name of your choice. SPSS will append the extension .sav automatically. If you are saving data to a floppy disk, then remember to change to the appropriate drive. Click on Save.

To save an existing data file
1 First ensure that you are in the Data Editor window.
2 Click on the File Save tool, or select the File menu and click on Save Data. Your changes will be saved to the existing file.

**Opening an existing data file**

Once a data file has been saved in SPSS it can be accessed in subsequent sessions. Furthermore, files that have been created in other software packages can be imported into SPSS.

To open an existing data file
1 Select the File menu.
2 Click on Open and Data... to open the Open File dialogue box.
3 Select the file from the file list.
4 Click on Open.

To open a text data file
1 Select the File menu.
2 Click on Read Text Data to open the Open File dialogue box.
3 Select the file from the file list.
4 Click on Open.
5 Follow the steps in the Text Wizard to define how to read the data file.

**Practice example**

You have surveyed individuals within your local community to determine their attitude towards the opening of a new school. You have collected data on the variables in the following table.

id gender Female length of residence 2 3 6 5 8 9 11 3 5 12 10 8 9 9 3 number of children 2 would you use the school? Yes 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 Male Male Male Female Female Female Male Male Female Male Female Female Male Male Female Male Female Male Female No 2 4 3 2 2 0 2 3 Yes No Yes Yes Yes Yes Yes Yes No Yes 4 2 0 3 Yes Yes No Yes Yes 0.5 4 3 2 2 Yes ? Yes

Given the above data, your task is to create a data file in SPSS. Remember, you must define each variable, allocating appropriate variable and value labels. Other definitions are optional. Data must then be entered and saved in a data file.

CHAPTER 3

**Data screening and transformation**

Data screening and transformation techniques are useful in making sure that data have been correctly entered and that the distributions of variables that are to be used in analysis are normal. If variable distributions deviate dramatically, then this may affect the validity of the results that are produced. If distributions do vary from normal, then non-normal distributions may be transformed before further analysis. In addition, if distributions deviate dramatically, nonparametric techniques may also be used, although they are less powerful than their parametric counterparts. Furthermore, if data files have missing values, mean substitution may be an alternative.

Data may also need to be transformed using Recode and Compute commands. It is often useful to be able to conduct analyses on subsets of the data and to make conditional transformations of variables. These can be achieved using the Select If and Compute If commands. These procedures will be demonstrated in the context of the following research example.

**Working example**

Community residents were surveyed to determine their attitudes towards physical exercise. Each individual (of the 99 students who participated) was given an identity number. Attitude towards exercise was measured using a seven-item scale comprising statements to which participants agreed or disagreed, using a five-point Likert response format such that: 1 = Strongly agree, 2 = Agree, 3 = Neither agree nor disagree, 4 = Disagree and 5 = Strongly disagree. Several items of this scale were negatively worded and thus required recoding. Participants were also asked their age, number of hours per week spent in physical activity and whether they participated in team sports. The data file can be found in Work3.sav on the website that accompanies this title.

You wish to determine whether the distributions of the continuous variables (that is, age, hours spent doing exercise and attitude to exercise) are normal. This procedure is not necessary for team participation because this is a categorical variable.

**Errors in data entry**

Errors in data entry are common and therefore data files must be carefully screened. For example, out-of-range values introduced while responses to the attitude items are being entered can be detected easily using the Frequencies or Descriptives commands and replaced in the data file with the correct value.

1 Select the Analyze menu.
2 Click on Descriptive Statistics and then Frequencies... to open the Frequencies dialogue box.
3 Select att1 to att7 and click on the arrow button to move these variables into the Variable(s): box.

4 Click on OK.

FREQUENCIES
  VARIABLES=att1 att2 att3 att4 att5 att6 att7
  /ORDER= ANALYSIS .

| att1 | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|
| strongly agree | 22 | 22.2 | 22.2 | 22.2 |
| agree | 31 | 31.3 | 31.3 | 53.5 |
| neither agree nor disagree | 38 | 38.4 | 38.4 | 91.9 |
| disagree | 3 | 3.0 | 3.0 | 94.9 |
| strongly disagree | 5 | 5.1 | 5.1 | 100.0 |
| Total | 99 | 100.0 | 100.0 | |

You will observe that all cases for att1 are within the expected range of 1-5. You must ensure the other items are also within the expected range.

**Assessing normality**

The assumption of normality is a prerequisite for many inferential statistical techniques. There are a number of different ways to explore this assumption graphically:
• Histogram
• Stem-and-leaf plot
• Boxplot
• Normal probability plot
• Detrended normal plot.

Furthermore, a number of statistics are available to test normality:
• Kolmogorov-Smirnov statistic, with a Lilliefors significance level, and the Shapiro-Wilks statistic
• Skewness
• Kurtosis.

There are several procedures available to obtain these graphs and statistics but the Explore procedure is the most convenient when both graphs and statistics are required.

1 Select the Analyze menu.
2 Click on Descriptive Statistics and then Explore... to open the Explore dialogue box.
3 Select the variable you require (i.e. age) and click on the arrow button to move this variable into the Dependent List: box.
4 In the Display box, ensure that Both is activated.
5 Click on the Plots... command pushbutton to obtain the Explore: Plots sub-dialogue box.
6 Click on the Histogram check box and the Normality plots with tests check box, and ensure that the Factor levels together radio button is selected in the Boxplots display.
7 Click on Continue.
8 Click on the Options... command pushbutton to open the Explore: Options sub-dialogue box.
9 In the Missing Values box, click on the Exclude cases pairwise radio button. If this option is not selected then, by default, any variable with missing data will be excluded from the analysis. That is, plots and statistics will be generated only for cases with complete data.

10 Click on Continue and then OK.

EXAMINE
  VARIABLES=age
  /PLOT BOXPLOT STEMLEAF HISTOGRAM NPPLOT
  /COMPARE GROUP
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING PAIRWISE
  /NOTOTAL .

**Histograms**

[Figure: Histogram of age; Mean = 45.35, Std. Dev. = 12.059, N = 99]

Above is a histogram of age. The values on the vertical axis indicate the frequency of cases. The values on the horizontal axis are midpoints of value ranges. For example, the midpoint of the first bar is 20 and the midpoint of the second bar is 25, indicating that each bar covers a range of 5. The shape of the distribution is considered normal.

**Stem-and-leaf plots and boxplots**

Closely related to the histogram are the stem-and-leaf plot and the boxplot. These plots provide more information about the actual values in the distribution than does the histogram. The stem-and-leaf plot is very similar to the histogram but is displayed on its side. The length of each row corresponds to the number of cases that fall into a particular interval.

A stem-and-leaf plot represents each case with a numeric value that corresponds to the actual observed value. For example, the stem of the graph corresponds to the first digit of a score (2), while the leaf is the trailing digit (23347889).

[Figure: age Stem-and-Leaf Plot; each leaf: 1 case(s)]

The boxplot, illustrated below, also summarises information about the distribution of scores. Unlike the histogram and the stem-and-leaf plot, which plot actual values, it plots summary statistics such as the median, 25th and 75th percentiles, and extreme scores in the distribution. The lower boundary of the box is the 25th percentile and the upper boundary is the 75th percentile. The median is presented by a horizontal line inside the box. The smallest and largest observed values within the distribution are represented by the horizontal lines at either end of the box, commonly referred to as whiskers.

[Figure: Boxplot of age]

If the distribution has any extreme scores (that is, scores three or more box lengths from the upper or lower edge of the box), then these will be represented by an asterisk (*). Cases with values between one-and-a-half and three box lengths from the upper or lower edge of the box are called outliers, and these are designated by a circle.

To determine whether a distribution is normal you look at the median, which should be positioned in the centre of the box. If the median is closer to the top of the box, then the distribution is negatively skewed, and if it is closer to the bottom of the box, then it is positively skewed. The spread or variability of the scores can be determined from the length of the box.
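The outlier and extreme-score rules for the boxplot can be expressed in a few lines. The Python sketch below is an illustration only: the function name `classify_scores` is hypothetical, and the quartiles come from Python's `statistics.quantiles` with the "inclusive" method, which is an assumption and may differ slightly from the percentile method SPSS uses to draw the box.

```python
from statistics import quantiles


def classify_scores(values):
    """Split scores into boxplot outliers and extreme scores.

    Outliers lie more than 1.5 and less than 3 box lengths from the
    edge of the box; extreme scores lie 3 or more box lengths away.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    box_length = q3 - q1
    result = {"outliers": [], "extremes": []}
    for value in values:
        if value < q1:
            distance = q1 - value       # below the lower edge of the box
        elif value > q3:
            distance = value - q3       # above the upper edge of the box
        else:
            continue                    # inside the box
        if distance >= 3 * box_length:
            result["extremes"].append(value)   # drawn as an asterisk
        elif distance > 1.5 * box_length:
            result["outliers"].append(value)   # drawn as a circle
    return result


scores = [10, 12, 13, 14, 15, 16, 17, 18, 20, 30, 60]
print(classify_scores(scores))
```

With these made-up scores the box runs from 13.5 to 19, so 30 is reported as an outlier and 60 as an extreme score.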

**Normal probability plots and detrended normal plots**

In a normal probability plot, each observed value is paired with its expected value from the normal distribution. If the sample is from a normal distribution, then the cases fall more or less in a straight line.

[Figure: Normal Q-Q Plot of age; horizontal axis: Observed Value]

It is also possible to plot the actual deviations of the points from a straight line. If the sample is from a normal distribution, then there is no pattern to the clustering of points; the points should assemble around a horizontal line through zero. This type of plot is referred to as a detrended normal plot and is illustrated in the following figure.

[Figure: Detrended Normal Q-Q Plot of age; horizontal axis: Observed Value]
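The pairing of observed and expected values behind both plots can be sketched in Python as follows. This is a simplified illustration rather than the SPSS algorithm: the function names are hypothetical, the expected values use Blom's proportion (rank - 3/8) / (n + 1/4) as an assumed plotting position, and tied scores are not given any special treatment.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_pairs(values):
    """Pair each observed value with its expected standard normal value."""
    data = sorted(values)
    n = len(data)
    standard_normal = NormalDist()
    pairs = []
    for rank, observed in enumerate(data, start=1):
        proportion = (rank - 0.375) / (n + 0.25)   # assumed plotting position
        pairs.append((observed, standard_normal.inv_cdf(proportion)))
    return pairs


def detrended_pairs(values):
    """Deviation of each standardised observed value from its expected value."""
    pairs = normal_qq_pairs(values)
    observed = [o for o, _ in pairs]
    centre, spread = mean(observed), stdev(observed)
    return [(o, (o - centre) / spread - expected) for o, expected in pairs]
```

For a roughly normal sample the second element of each pair from `detrended_pairs` stays close to zero with no pattern, which is the horizontal band described above.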

**Kolmogorov-Smirnov and Shapiro-Wilks statistics**

The Kolmogorov-Smirnov statistic with a Lilliefors significance level for testing normality is produced with the normal probability and detrended probability plots. If the significance level is greater than .05, then normality is assumed. The Shapiro-Wilks statistic is also calculated if the sample size is less than one hundred.

Tests of Normality

| | Kolmogorov-Smirnov(a) Statistic | df | Sig. | Shapiro-Wilks Statistic | df | Sig. |
|---|---|---|---|---|---|---|
| age | .057 | 99 | .200* | .992 | 99 | |

* This is a lower bound of the true significance.
a. Lilliefors Significance Correction
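The Kolmogorov-Smirnov statistic in this table is the largest gap between the cumulative distribution of the sample and that of a normal curve with the same mean and standard deviation. The Python sketch below computes only that statistic; the function name `ks_statistic` is hypothetical, and the Lilliefors significance level is not calculated because it depends on special tables or simulation.

```python
from statistics import NormalDist, mean, stdev


def ks_statistic(values):
    """Largest distance between the sample and fitted normal distributions."""
    data = sorted(values)
    n = len(data)
    # Normal curve with the sample's own mean and standard deviation.
    fitted = NormalDist(mean(data), stdev(data))
    largest = 0.0
    for i, x in enumerate(data):
        expected = fitted.cdf(x)
        # Compare against the sample step function just after and just before x.
        largest = max(largest, (i + 1) / n - expected, expected - i / n)
    return largest
```

A small statistic, such as the .057 reported for age, indicates a close match to the normal curve; the significance level in the SPSS output is what you compare with .05.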

**Skewness and kurtosis**

Skewness and kurtosis refer to the shape of the distribution, and are used with interval and ratio level data. Values for skewness and kurtosis are zero if the observed distribution is exactly normal. Positive values for skewness indicate a positive skew, while positive values for kurtosis indicate a distribution that is peaked (leptokurtic). Negative values for skewness indicate a negative skew, while negative values for kurtosis indicate a distribution that is flatter (platykurtic). Other descriptive statistics, such as measures of central tendency and variability, can also be used to determine the normality of the distribution.

Descriptives

| age | Statistic | Std. Error |
|---|---|---|
| Mean | 45.35 | 1.212 |
| 95% Confidence Interval for Mean: Lower Bound | 42.95 | |
| 95% Confidence Interval for Mean: Upper Bound | 47.76 | |
| 5% Trimmed Mean | 45.25 | |
| Median | 45.00 | |
| Variance | 145.415 | |
| Std. Deviation | 12.059 | |
| Minimum | 18 | |
| Maximum | 75 | |
| Range | 57 | |
| Interquartile Range | 18 | |
| Skewness | .126 | |
| Kurtosis | -.316 | .481 |
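If you want to see where skewness, kurtosis and their standard errors come from, they can be computed by hand. The Python sketch below uses the usual sample-adjusted formulas; that these match the SPSS output exactly is an assumption, although the standard error functions do reproduce the values .243 and .481 that appear in this chapter's tables for N = 99. All function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def skewness(values):
    """Sample-adjusted skewness; 0 for a perfectly symmetric distribution."""
    n, centre, spread = len(values), mean(values), stdev(values)
    cubes = sum(((x - centre) / spread) ** 3 for x in values)
    return n / ((n - 1) * (n - 2)) * cubes


def kurtosis(values):
    """Sample-adjusted excess kurtosis; 0 for a normal distribution."""
    n, centre, spread = len(values), mean(values), stdev(values)
    fourths = sum(((x - centre) / spread) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    return scale * fourths - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))


def se_skewness(n):
    """Standard error of skewness; depends only on the sample size."""
    return sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


def se_kurtosis(n):
    """Standard error of kurtosis; depends only on the sample size."""
    return 2 * se_skewness(n) * sqrt((n * n - 1) / ((n - 3) * (n + 5)))


print(round(se_skewness(99), 3), round(se_kurtosis(99), 3))  # 0.243 0.481
```

Dividing a skewness or kurtosis value by its standard error gives a rough indication of whether it differs from zero by more than chance.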

You should be familiar with most of the above statistics. However, you may not have encountered the 5 per cent Trim statistic, which is the mean of the distribution with the top 5 per cent and the bottom 5 per cent of scores removed. The purpose of this trimming is to obtain a measure of central tendency that is unaffected by extreme values.
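The idea of trimming can be illustrated with a short Python sketch. It simply drops whole cases from each end of the sorted scores, which is a simplifying assumption: SPSS may weight the boundary cases fractionally, so its 5% Trimmed Mean can differ slightly. The function name `trimmed_mean` is hypothetical.

```python
from statistics import mean


def trimmed_mean(values, percent=5):
    """Mean after removing the top and bottom `percent` of scores."""
    data = sorted(values)
    drop = len(data) * percent // 100     # whole cases removed from each end
    kept = data[drop:len(data) - drop]
    return mean(kept)


scores = list(range(1, 20)) + [1000]      # one extreme score
print(mean(scores), trimmed_mean(scores))  # 59.5 10.5
```

The single extreme score pulls the ordinary mean up to 59.5, while the trimmed mean stays at 10.5, close to the bulk of the scores.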


**Assessing normality by group**

It is sometimes necessary to assess the normality of a variable across two or more levels of another variable. For example, you may wish to assess the normality of age for team participants and non-team participants separately. This can be achieved using the preceding procedure with one addition. In the Explore dialogue box you will notice that there is a Factor List: box. By transferring your group variable (that is, team participation) into this box, the chosen statistics and plots will be generated for each group independently.

Variable transformation

Variables rarely conform to a classic normal distribution. More often, distributions are skewed and display varying degrees of kurtosis. When skewness and kurtosis are extreme, transformation is an option. The decision to transform variables depends on the severity of the departure from normality. Having decided that transformation is desirable, the researcher must select the most appropriate transformation methods. The options available can be found in any good chapter on data screening. To illustrate the process of transformation, the hours of exercise variable will be examined using the above steps. Plots and normality statistics were obtained.

Descriptives

| hoursex | Statistic | Std. Error |
|---|---|---|
| Mean | 17.67 | 1.816 |
| 95% Confidence Interval for Mean: Lower Bound | 14.06 | |
| 95% Confidence Interval for Mean: Upper Bound | 21.27 | |
| 5% Trimmed Mean | 15.49 | |
| Median | 11.00 | |
| Variance | 326.387 | |
| Std. Deviation | 18.066 | |
| Minimum | 1 | |
| Maximum | 87 | |
| Range | 86 | |
| Interquartile Range | 21 | |
| Skewness | 1.770 | .243 |
| Kurtosis | 3.475 | .481 |

Tests of Normality

| | Kolmogorov-Smirnov(a) Statistic | df | Sig. | Shapiro-Wilks Statistic | df | Sig. |
|---|---|---|---|---|---|---|
| hoursex | .178 | 99 | .000 | .810 | 99 | .000 |

a. Lilliefors Significance Correction

[Figure: Histogram of hoursex; Mean = 17.67, Std. Dev. = 18.066, N = 99]

[Figure: Normal Q-Q Plot of hoursex; horizontal axis: Observed Value]

[Figure: Detrended Normal Q-Q Plot of hoursex; horizontal axis: Observed Value]

[Figure: Boxplot of hoursex]

All the above charts and statistics suggest that the variable hoursex is not normally distributed but is significantly positively skewed. The boxplot indicates that there are five outliers, as illustrated by the circles. Therefore, a natural logarithmic transformation is appropriate. To transform the variable, you will need to use a data transformation command called Compute.

To compute values for a variable based on numeric transformations of other variables
1 Select the Transform menu.
2 Click on Compute to open a Compute Variable dialogue box.
3 In the Target Variable: box, where the cursor is flashing, type an appropriate variable name (i.e. lnhours).
4 From the Functions: box, select the appropriate transformation (i.e. LN(numexpr)) and press the arrow button.
5 From the variable list box, select the variable (i.e. hoursex) and press the arrow button to insert the variable in the function.
6 Click on OK.

COMPUTE lnhours = LN(hoursex) .
EXECUTE .

If you now return to your data file, you will see that a new variable called lnhours has been created. To make things quicker, use the SPSS Manager at the bottom of your screen to move between your data and syntax editor windows and the SPSS Viewer, which displays your output.
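The effect of the natural logarithmic transformation is easy to check by hand. The Python sketch below mirrors the Compute step for a list of scores; the function name `ln_transform` is hypothetical, and returning `None` for blank, zero or negative scores is an assumption standing in for a missing value, since the logarithm is undefined there.

```python
from math import log


def ln_transform(values):
    """Natural log of each score; None where the log is undefined."""
    return [log(v) if v is not None and v > 0 else None for v in values]


# Minimum, median and maximum of hoursex from the Descriptives table.
print([round(v, 4) for v in ln_transform([1, 11, 87])])  # [0.0, 2.3979, 4.4659]
```

Note how the transformation pulls in the long right tail: the raw scores range from 1 to 87, whereas the transformed scores range only from 0 to about 4.47.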


Now that you have transformed the variable hoursex into lnhours, you can obtain normality graphs and statistics for this new transformed variable using the procedures outlined at the beginning of the chapter. This output appears below.

Descriptives

| lnhours | Statistic | Std. Error |
|---|---|---|
| Mean | 2.3183 | .11596 |
| 95% Confidence Interval for Mean: Lower Bound | 2.0881 | |
| 95% Confidence Interval for Mean: Upper Bound | 2.5484 | |
| Median | 2.3979 | |
| Variance | 1.331 | |
| Std. Deviation | 1.15383 | |
| Minimum | .00 | |
| Maximum | 4.47 | |
| Range | 4.47 | |
| Skewness | -.312 | .243 |
| Kurtosis | | .481 |

Tests of Normality

| | Kolmogorov-Smirnov(a) Statistic | df | Sig. | Shapiro-Wilks Statistic | df | Sig. |
|---|---|---|---|---|---|---|
| lnhours | .095 | 99 | .027 | .965 | 99 | .010 |

a. Lilliefors Significance Correction

[Figure: Histogram of lnhours; Mean = 2.3183, Std. Dev. = 1.15383, N = 99]

[Figure: lnhours Stem-and-Leaf Plot]

[Figure: Normal Q-Q Plot of lnhours; horizontal axis: Observed Value]

[Figure: Detrended Normal Q-Q Plot of lnhours; horizontal axis: Observed Value]

It is apparent from the preceding statistics and graphs that the natural logarithmic transformation was appropriate because the distribution of hoursex is now relatively normal. The Lilliefors statistic would suggest that there is still a slight problem, but all the other diagnostic data are satisfactory. It is important to note that when reporting and interpreting results involving transformed variables, you must acknowledge that a transformation has been undertaken on the data.

Data transformation

Recode

You can modify data values by recoding. There are three instances where recoding variables is appropriate:
• Collapsing continuous variables into categorical variables (for example, median split).
• Recoding negatively worded scale items.
• Replacing missing values and bringing outlying cases into the distribution.

In relation to the research example, you wish to obtain a median split on the variable age.

To collapse a continuous variable

1 Select the Transform menu.
2 Click on Recode and Into Different Variables to open the Recode into Different Variables dialogue box. By selecting Into Different Variables, you will retain the original data.
3 In the Variable List box, select the variable you wish to recode (i.e. age) and click on the arrow button to move the variable into the Input Variable -> Output Variable: box.

4 In the Name: box of the Output Variable box, type the new variable name (i.e. agecat).
5 Click on the Change command pushbutton.
6 Click on the Old and New Values... command pushbutton to open the Recode into Different Variables: Old and New Values sub-dialogue box.
7 Click on the second Range: radio button and type the median in the box (i.e. 44).
8 In the New Value box, type the new value (i.e. 1).
9 Click on the Add command pushbutton.
10 Click on the third Range: radio button and type the median plus one (i.e. 45).
11 In the New Value box, type the new value (i.e. 2).
12 Click on the Add command pushbutton.
13 Click on Continue and then OK.

RECODE age (Lowest thru 44=1) (45 thru Highest=2) INTO agecat.
EXECUTE.

If you now go back into your data file you will notice that the variable agecat now has two possible values: 1 = 0-44 years and 2 = 45+ years. This variable can now be used in analysis where categories are required.

To recode negatively worded scale items

Three of the items on the attitude scale required recoding because they were negatively worded. Remember that the response format for the items was 1 = strongly agree, 2 = agree, 3 = neutral, 4 = disagree and 5 = strongly disagree.

1 Select the Transform menu.
2 Click on Recode and Into Same Variables to open the Recode into Same Variables dialogue box. By selecting Into Same Variables, you will overwrite the original data.
3 In the Variable List box, select the variable(s) you wish to recode (i.e. att2, att4 and att6) and click on the arrow button to move the variables into the Variables: box.

4 Click on the Old and New Values... command pushbutton to open the Recode into Same Variables: Old and New Values sub-dialogue box.
5 In the Old Value box, type the old value (i.e. 1).
6 In the New Value box, type the new value (i.e. 5).
7 Click on the Add command pushbutton.
8 In the Old Value box, type the second old value (i.e. 2).
9 In the New Value box, type the new second value (i.e. 4).
10 Click on the Add command pushbutton.
11 Repeat steps 8, 9 and 10 for the remaining two values.

12 Click on Continue and then OK.

RECODE att2 att4 att6 (1=5) (2=4) (4=2) (5=1).
EXECUTE.

Att2, Att4 and Att6 have now been recoded to allow computation of a composite variable.
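The two recodes described in this section follow simple rules, sketched below in Python as an illustration rather than as part of the SPSS procedure. The function names and data values are hypothetical; the cut-off of 44 and the 1 to 5 response format are taken from the text.

```python
def reverse_code(score, low=1, high=5):
    """Reverse a scale response: 1<->5, 2<->4, 3 stays 3 on a 1-5 scale."""
    if score is None:              # leave missing responses missing
        return None
    return low + high - score


def median_split(value, cut):
    """Return 1 for values up to and including the cut, 2 for values above it."""
    return 1 if value <= cut else 2


att2 = [1, 2, 3, 4, 5, None]       # hypothetical responses to a negatively worded item
print([reverse_code(s) for s in att2])                    # [5, 4, 3, 2, 1, None]
print([median_split(a, 44) for a in (18, 44, 45, 70)])    # [1, 1, 2, 2]
```

Reverse coding by subtracting from low + high gives the same result as listing each old and new value pair, and it makes a mistyped pair easy to spot.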

**To replace missing values**

Missing observations can be problematic. To avoid this problem you can replace missing values with estimates computed with one of several methods. A commonly used method is mean substitution.

1 Select the Transform menu.
2 Click on Recode and Into Same Variables to open the Recode into Same Variables dialogue box.
3 In the Variable List box, select the variable(s) you wish to recode (i.e. att1) and click on the arrow button to move the variable into the Variables: box.
4 Click on the Old and New Values... command pushbutton to open the Recode into Same Variables: Old and New Values sub-dialogue box.
5 In the Old Value box, select the System- or user-missing radio button.
6 In the New Value box, type the mean of the variable (i.e. 2.37, as obtained by calculating the average score on the att1 item).
7 Click on the Add command pushbutton.

8 Click on Continue and then OK.

RECODE att1 (MISSING=2.37).
EXECUTE.

Any missing data for the variable att1 will be replaced with the mean of 2.37. Recoding missing values using mean substitution allows you to include all cases in your analysis. Alternatively, you may prefer to deal with missing cases in each analysis. Most procedures allow you to exclude missing cases either pairwise or listwise. Exclusion of missing cases pairwise involves the deletion of cases with missing values only on the relevant variables. Listwise deletion involves the elimination of cases with a missing value for any variable on the data list. More advanced procedures, such as factor analysis, give you the opportunity to replace missing data with the mean for particular variables during the procedure itself.
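The difference between mean substitution and listwise deletion can be seen in a small sketch. This Python illustration is not SPSS output; the data and function names are hypothetical, and None stands in for a missing value.

```python
from statistics import mean

# Hypothetical responses for five cases; None marks a missing value.
att1 = [2, 3, None, 1, 4]
att2 = [5, None, 2, 3, 4]


def mean_substitute(values):
    """Replace each missing value with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    m = mean(observed)
    return [m if v is None else v for v in values]


def listwise(*variables):
    """Keep only the cases that are complete on every variable supplied."""
    return [case for case in zip(*variables) if None not in case]


att1_filled = mean_substitute(att1)      # the mean of 2, 3, 1, 4 is 2.5
complete_cases = listwise(att1, att2)    # cases 1, 4 and 5 remain

print(att1_filled)       # [2, 3, 2.5, 1, 4]
print(complete_cases)    # [(2, 5), (1, 3), (4, 4)]
```

Mean substitution keeps all five cases but leaves the variable's mean unchanged and reduces its variance slightly, whereas listwise deletion keeps only three cases. Pairwise exclusion would instead use, for each analysis, every case that is complete on the variables in that analysis.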

Compute

Transformation of variables is just one instance where the Compute command may be used. The Compute command is most commonly used to obtain composite scores for items on a scale. This can be achieved for the whole data set or only a subset if certain conditions apply. In the research example, a total attitude score would be appropriate. This can be obtained by adding the responses to the seven items comprising the scale for each individual case.

To compute a new variable

1 Select the Transform menu.
2 Click on Compute to open the Compute Variable dialogue box. If previous settings remain, these can be cleared with the Reset command pushbutton.
3 In the Target Variable: box, type an appropriate variable name (i.e. Totalatt).
4 Select the first scale item from the variable list box (i.e. att1) and click on the arrow button to move the variable into the Numeric Expression: box.
5 Click on the + button.
6 Select the second scale item from the variable list box (i.e. att2) and repeat steps 4 and 5.

7 Click on OK.

COMPUTE Totalatt = att1 + att2 + att3 + att4 + att5 + att6 + att7.
EXECUTE.

Again if you move back into your data file (remember to use the SPSS Manager) you will notice that items have been added to obtain a new composite variable in your data file - Totalatt.

Other transformations are possible using the calculator pad and functions options in the Compute Variable dialogue box.
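A composite score is simply the sum of the item responses for each case. The sketch below illustrates this in Python with hypothetical data; treating a case with any missing item as missing on the total is an assumption made for this illustration.

```python
def total_score(items):
    """Sum the item scores for one case; missing if any item is missing."""
    if any(v is None for v in items):
        return None
    return sum(items)


# Hypothetical responses to seven items for two cases.
cases = [
    [1, 2, 3, 4, 5, 1, 2],       # complete case
    [3, 3, None, 3, 3, 3, 3],    # one item missing
]
totalatt = [total_score(c) for c in cases]
print(totalatt)   # [18, None]
```

This is one reason for dealing with missing values before computing the composite: a single missing item would otherwise remove the whole case from any analysis of the total.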

If you wish to compute a new variable based on certain conditions, then a slightly different procedure is required. For example, you may wish to compute a total attitude score for only those people who exercise for four hours per week or less.

To compute a new variable based on certain conditions

1 Select the Transform menu.
2 Click on Compute to open the Compute Variable dialogue box.
3 In the Target Variable: box, type an appropriate variable name (i.e. Totalatt).
4 Select the first scale item from the variable list box (i.e. att1) and click on the arrow button to move the variable into the Numeric Expression: box.
5 Click on the + button.
6 Select the second scale item from the variable list box (i.e. att2) and repeat steps 4 and 5 until all items are entered.
7 Click on the If... command pushbutton to open the Compute Variable: If Cases sub-dialogue box.
8 Select the Include if case satisfies condition: radio button.
9 Select the variable on which the condition is based (i.e. hoursex) and click on the arrow button to move the variable into the box.
10 Select the <= operator button, which moves the symbol into the box above, then select the digit 4.

11 Click on Continue and then OK.

IF (hoursex <= 4) Totalatt = att1 + att2 + att3 + att4 + att5 + att6 + att7.
EXECUTE.

In this example, a new variable has been computed - Totalatt - based on certain conditions, i.e. for those individuals who spent four hours or less (hoursex <= 4) in physical activity.
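The conditional computation can be sketched in the same way. In this Python illustration (hypothetical data and function name), the total is calculated only when the condition on hoursex is met; a case that fails the condition is left without a value, which is the behaviour assumed here for a newly created target variable.

```python
def conditional_total(items, hoursex, limit=4):
    """Return the item total only for cases with hoursex <= limit."""
    if hoursex is None or hoursex > limit:
        return None                # condition not met: no total is assigned
    return sum(items)


# Hypothetical cases: (hours of exercise per week, seven item responses).
cases = [
    (2, [1, 2, 3, 4, 5, 1, 2]),
    (4, [3, 3, 3, 3, 3, 3, 3]),
    (9, [5, 5, 5, 5, 5, 5, 5]),
]
totals = [conditional_total(items, hours) for hours, items in cases]
print(totals)   # [18, 21, None]
```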

Data selection

In the Select Cases option in the Data menu, there are a number of procedures that can be chosen:
• Selection of specified cases using the If option
• Selection of a random sample of cases using the Sample option
• Selection of cases based on time or case range using the Range option.


Selection of cases using the If option is most commonly used. For example, you may wish to examine the descriptive statistics of only males or females, or you may wish to analyse only half your data set.

**To select the first 50 cases in the data file for subsequent analysis**

1 Select the Data menu.
2 Click on Select Cases... or click on the Select Cases tool to open the Select Cases dialogue box.
3 In the Select box, click on the If condition is satisfied radio button.
4 Click on the If... command pushbutton to open the Select Cases: If sub-dialogue box.
5 Select the variable you require (i.e. id) and click on the arrow button to move the variable into the box.
6 Click on the operator of your choice (i.e. <=), which will then be pasted into the box above.
7 Type in the value you require (i.e. 50).
8 Click on Continue and then OK.

USE ALL.
COMPUTE filter_$=(id <= 50).
VARIABLE LABEL filter_$ 'id <= 50 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.

You are now ready to do analysis on the 50 cases selected. If you go back into your data file (using the SPSS Manager) you will see that cases 51 to 99 have been crossed through.
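The filter works by flagging each case as selected (1) or not selected (0) and leaving the data file itself intact. The Python sketch below illustrates that idea, together with the random sample alternative; the case list, the seed and the variable names are hypothetical.

```python
import random

# Hypothetical data file of 99 cases with id numbers 1 to 99.
cases = [{"id": i} for i in range(1, 100)]

# If option: flag each case (1 = selected, 0 = not selected),
# as the expression (id <= 50) does in the syntax above.
filter_flag = [1 if case["id"] <= 50 else 0 for case in cases]
selected = [case for case, keep in zip(cases, filter_flag) if keep == 1]

# Sample option: draw exactly 50 cases at random
# (seeded here only so that the draw is repeatable).
rng = random.Random(12345)
random_50 = rng.sample(cases, 50)

print(len(cases), len(selected), len(random_50))   # 99 50 50
```

Because unselected cases are only filtered, not deleted, all 99 cases remain available once the filter is turned off.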


Practice example

The following scale measured how adolescents feel about their future. Data from 100 year 11 students were collected. The scale comprised six items:

1 I feel optimistic about my future.
2 I believe that every cloud has a silver lining.
3 If anything bad can happen, it will.
4 I have got what it takes to create a bright future.
5 I doubt that I can achieve what I want to in life.
6 I deserve to have good things come my way.

The response format for these items was:
1 = strongly agree
2 = agree
3 = neutral
4 = disagree
5 = strongly disagree.

Given the data in Prac3.sav, your tasks are to:

1 Check for incorrect data entry.
2 Recode negatively worded items.
3 Recode missing values using mean substitution.
4 Compute a total hope score.
5 Screen for normality for the composite variable.
6 Attempt an appropriate transformation on the composite variable and compare the output of this transformation with the distribution of the original variable.
7 Select a random sample of 50 cases from the data set and obtain the mean hope score for this subset.

Solutions

Syntax

DESCRIPTIVES VARIABLES=hope1 hope2 hope3 hope4 hope5 hope6
  /STATISTICS=MEAN STDDEV MIN MAX.
RECODE hope3 hope5 (1=5) (2=4) (4=2) (5=1).
EXECUTE.
DESCRIPTIVES VARIABLES=hope1 hope2 hope3 hope4 hope5 hope6
  /STATISTICS=MEAN STDDEV MIN MAX.
RECODE hope2 (MISSING=4.31).
EXECUTE.
RECODE hope4 (MISSING=1.63).
EXECUTE.
RECODE hope5 (MISSING=4.34).
EXECUTE.
RECODE hope6 (MISSING=3.55).
EXECUTE.
COMPUTE hopetot = hope1 + hope2 + hope3 + hope4 + hope5 + hope6.
EXECUTE.
EXAMINE VARIABLES=hopetot
  /PLOT BOXPLOT STEMLEAF HISTOGRAM NPPLOT
  /COMPARE GROUP
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING PAIRWISE
  /NOTOTAL.
COMPUTE refhope = 26 - hopetot.
EXECUTE.
COMPUTE sqrefhop = SQRT(refhope).
VARIABLE LABELS sqrefhop 'Square root of reflected hope'.
EXECUTE.
EXAMINE VARIABLES=sqrefhop
  /PLOT BOXPLOT STEMLEAF HISTOGRAM NPPLOT
  /COMPARE GROUP
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING PAIRWISE
  /NOTOTAL.
USE ALL.
COMPUTE filter_$=(id <= 50).
VARIABLE LABEL filter_$ 'id <= 50 (FILTER)'.
VALUE LABELS filter_$ 0 'Not Selected' 1 'Selected'.
FORMAT filter_$ (f1.0).
FILTER BY filter_$.
EXECUTE.
DESCRIPTIVES VARIABLES=hopetot
  /STATISTICS=MEAN STDDEV MIN MAX.

Output

The minimum and maximum show that no out-of-range entries have been made. To obtain the means needed for mean substitution of missing values, descriptives must be found for recoded variables.

[Output: Descriptive Statistics table for hope1 to hope6 (N, Minimum, Maximum, Mean, Std. Deviation)]

Having recoded the negatively worded items and the missing values, a total hope score can be computed.

Screening for normality

Descriptives: hopetot
                                                Statistic    Std. Error
Mean                                            20.3338      .22942
95% Confidence Interval for Mean  Lower Bound   19.8786
                                  Upper Bound   20.7890
5% Trimmed Mean                                 20.3709
Median                                          20.5900
Variance                                        5.263
Std. Deviation                                  2.29420
Minimum                                         14.00
Maximum                                         25.00
Range                                           11.00
Interquartile Range                             3.00

[Output: Skewness and Kurtosis rows of the Descriptives table; Tests of Normality table for hopetot (Kolmogorov-Smirnov with Lilliefors Significance Correction, and Shapiro-Wilk; df = 100)]

[Figure: Histogram of hopetot (Mean = 20.3338, Std. Dev. = 2.2942, N = 100)]

[Figure: hopetot Stem-and-Leaf Plot]

[Figures: Normal Q-Q Plot of hopetot; boxplot of hopetot]

There is no problem with these data but we will do a transformation for practice. We will do a reflection and then a square root.
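Reflection followed by a square root can be sketched as follows. This Python illustration uses hypothetical totals; the reflection constant is taken to be the largest score plus one, which gives the value 26 used in the solution syntax when the largest total is 25.

```python
import math


def reflect_and_sqrt(scores):
    """Reflect scores about (largest score + 1), then take the square root."""
    constant = max(scores) + 1           # 26 when the largest total is 25
    return [math.sqrt(constant - s) for s in scores]


hopetot = [14, 18, 20, 21, 25]           # hypothetical total hope scores
sqrefhop = reflect_and_sqrt(hopetot)
print([round(v, 3) for v in sqrefhop])
```

Reflection reverses the order of the scores, so after this transformation a high value represents a low total hope score. This needs to be kept in mind when interpreting any analysis of the transformed variable.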

Descriptives: Square root of reflected hope
                                                Statistic    Std. Error
Mean                                            2.3271       .05033
95% Confidence Interval for Mean  Lower Bound   2.2273
                                  Upper Bound   2.4270
5% Trimmed Mean                                 2.3431
Median                                          2.3259
Variance                                        .253
Std. Deviation                                  .50326
Minimum                                         1.00
Maximum                                         3.46
Range                                           2.46
Interquartile Range                             .65

[Output: Skewness and Kurtosis rows of the Descriptives table; Tests of Normality table for Square root of reflected hope (Kolmogorov-Smirnov with Lilliefors Significance Correction, and Shapiro-Wilk; df = 100)]

[Figure: Histogram of Square root of reflected hope (Mean = 2.3271, Std. Dev. = 0.50326, N = 100)]

[Figures: Normal Q-Q Plot of Square root of reflected hope; boxplot of Square root of reflected hope]

The mean total hope score of a random sample of 50 cases is as follows:

Descriptive Statistics
                      N     Minimum    Maximum    Mean       Std. Deviation
hopetot               50    16.00      25.00      20.3366    2.08760
Valid N (listwise)    50

Descriptive statistics

Descriptive statistics are used to explore the data collected, and to summarise and describe those data. Descriptive statistics may be particularly useful if one just wants to make general observations about the data collected, for example number of males and females, the age range and average (mean) age, the average length of residence in a community etc. Other statistics such as standard deviation and variance give more information about the distribution of each variable.

Frequency distributions

A frequency distribution is a display of the frequency of occurrence of each score value. The frequency distribution can be represented in tabular form or, with more visual clarity, in graphical form. For continuous variables, measured on ratio or interval scales, histograms or frequency polygons are appropriate. For categorical variables, measured on nominal or ordinal scales, bar charts are suitable.

Measures of central tendency and variability

The three main measures of central tendency are mode, median and mean. The measures of variability include range, interquartile range, standard deviation and variance. All of these measures of variability are more appropriate for interval or ratio data. You can also examine the normality of the distribution through the Frequencies procedure, as shown in chapter 3.

Working example

One hundred tennis players participated in a serving competition. Gender and number of aces were recorded for each player. The data file can be found in Work4.sav on the website that accompanies this title and is shown in the following figure.

To obtain a frequency table, measures of central tendency and variability

1 Select the Analyze menu.
2 Click on Descriptive Statistics and then on Frequencies... to open the Frequencies dialogue box.
3 Select the variable(s) you require (i.e. aces) and click on the arrow button to move the variable into the Variable(s): box.
4 Click on the Statistics... command pushbutton to open the Frequencies: Statistics sub-dialogue box.
5 In the Percentile Values box, select the Quartiles check box.
6 In the Central Tendency box, select the Mean, Median and Mode check boxes.
7 In the Dispersion box, select the Std. deviation, Variance, Range, Minimum and Maximum check boxes.
8 Click on Continue.
9 Click on the Charts... command pushbutton to open the Frequencies: Charts sub-dialogue box.
10 Click on the Histogram(s) radio button. You will notice that you can also obtain a normal curve overlay, so click on the With normal curve check box.

11 Click on Continue and then OK.

FREQUENCIES VARIABLES=aces
  /NTILES= 4
  /STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM MEAN MEDIAN MODE
  /HISTOGRAM NORMAL
  /ORDER= ANALYSIS.

Statistics
aces
N               Valid      100
                Missing    0
Mean                       5.1100
Median                     5.0000
Mode                       5.00
Std. Deviation             1.83620
Variance                   3.372
Range                      9.00
Minimum                    1.00
Maximum                    10.00
Percentiles     25         4.0000
                50         5.0000
                75         6.0000

aces
                  Frequency    Percent    Valid Percent    Cumulative Percent
Valid   1.00      3            3.0        3.0              3.0
        2.00      6            6.0        6.0              9.0
        3.00      7            7.0        7.0              16.0
        4.00      15           15.0       15.0             31.0
        5.00      35           35.0       35.0             66.0
        6.00      15           15.0       15.0             81.0
        7.00      8            8.0        8.0              89.0
        8.00      6            6.0        6.0              95.0
        9.00      4            4.0        4.0              99.0
        10.00     1            1.0        1.0              100.0
        Total     100          100.0      100.0

In the frequency table, the frequency column shows how many players served each number of aces. For example, only one person served ten aces. The percent column displays this frequency in percentage form for all cases, including those cases that may be missing. The valid percent column is the proportion of scores only for those cases that are valid. Because you have no missing data in this example, the percent and valid percent columns are identical. The cumulative percent column is the summation of the percentage for that score with the percentage for all lesser scores. By obtaining the 25th and 75th percentiles for the distribution, the interquartile range can be calculated by subtracting one from the other. Therefore, in this example, the interquartile range is equal to 6 - 4 = 2.
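The columns of a frequency table and the interquartile range can be reproduced by hand or with a few lines of code. The Python sketch below is an illustration only: the frequencies are chosen to be consistent with the statistics reported for aces (N = 100, mean 5.11), and the quartile method used by the standard library is assumed, not confirmed, to match the one SPSS applies.

```python
from collections import Counter
from statistics import mean, median, mode, quantiles

# Illustrative frequencies for the number of aces served by 100 players.
freq = {1: 3, 2: 6, 3: 7, 4: 15, 5: 35, 6: 15, 7: 8, 8: 6, 9: 4, 10: 1}
aces = [score for score, f in freq.items() for _ in range(f)]
n = len(aces)

# Build (score, frequency, percent, cumulative percent) rows.
counts = Counter(aces)
running = 0
table = []
for score in sorted(counts):
    running += counts[score]
    table.append((score, counts[score], 100 * counts[score] / n, 100 * running / n))

# Quartiles and the interquartile range (75th minus 25th percentile).
q1, q2, q3 = quantiles(aces, n=4)
iqr = q3 - q1

for row in table:
    print(row)
print(mean(aces), median(aces), mode(aces), q1, q3, iqr)
```

Each cumulative percent is the running total of the percent column, so the last row always reaches 100.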

[Figure: Histogram of aces with normal curve (Mean = 5.11, Std. Dev. = 1.8362, N = 100)]

To obtain the appropriate output for a categorical variable

1 Select the Analyze menu.
2 Click on Descriptive Statistics and then on Frequencies... to open the Frequencies dialogue box.
3 Select the variable(s) you require (i.e. gender) and click on the arrow button to move the variable into the Variable(s): box.
4 Click on the Statistics... command pushbutton to open the Frequencies: Statistics sub-dialogue box.
5 In the Central Tendency box, click on the Mode check box.
6 Click on Continue.
7 Click on the Charts... command pushbutton to open the Frequencies: Charts sub-dialogue box.
8 Select the Bar chart(s) radio button.
9 Click on Continue and then OK.

FREQUENCIES VARIABLES=gender
  /STATISTICS=MODE
  /BARCHART FREQ
  /ORDER= ANALYSIS.

gender
                  Frequency    Percent    Valid Percent    Cumulative Percent
Valid   1.00      50           50.0       50.0             50.0
        2.00      50           50.0       50.0             100.0
        Total     100          100.0      100.0

[Figure: Bar chart of gender]

Descriptives command

It is also possible to obtain certain measures of central tendency and variability through the Descriptives command. This command also allows you to save standardised values as variables. These standardised or Z-scores are useful for further analysis (for example, interaction terms in multiple regression) or in comparing samples from different populations. Furthermore, inspection of Z-scores will allow identification of outlying cases, which is useful in data screening. Z-scores greater than +3 or less than -3 are considered to be outliers.

To obtain descriptive statistics and Z-scores

1 Select the Analyze menu.
2 Click on Descriptive Statistics and then Descriptives... to open the Descriptives dialogue box.
3 Select the variable(s) you require (i.e. aces) and click on the arrow button to move the variable into the Variable(s): box.
4 Select the Save standardized values as variables check box.
5 Click on the Options command pushbutton.

6 Note that the Mean, Std. deviation, Minimum and Maximum check boxes are automatically selected. If you wish to obtain additional descriptive statistics, select the appropriate check boxes.
7 Click on Continue and then OK.

DESCRIPTIVES VARIABLES=aces
  /SAVE
  /STATISTICS=MEAN STDDEV MIN MAX.

Descriptive Statistics
                      N      Minimum    Maximum    Mean      Std. Deviation
aces                  100    1.00       10.00      5.1100    1.83620
Valid N (listwise)    100

If you switch back to the Data Editor window, you will notice that the Z-scores have been saved as another variable: Zaces.

Practice example

Sales in $1000s by 20 junior and senior salespeople working in a white goods shop were recorded at the end of a week. Given the data in Prac4.sav, your task is to obtain a frequency table and the appropriate chart and descriptive statistics for each variable in the data file.

Solutions

Syntax

FREQUENCIES VARIABLES=rank
  /STATISTICS=MODE
  /BARCHART FREQ
  /ORDER= ANALYSIS.
FREQUENCIES VARIABLES=sales
  /NTILES= 4
  /STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM MEAN MEDIAN MODE.

Output

rank
                   Frequency    Percent    Valid Percent    Cumulative Percent
Valid     junior   10           50.0       52.6             52.6
          senior   9            45.0       47.4             100.0
          Total    19           95.0       100.0
Missing            1            5.0
Total              20           100.0

The appropriate chart for a categorical variable (rank) is a bar chart and the mode is the appropriate measure of central tendency.

[Figure: Bar chart of rank]

00 76.00 77.8 42.:i7 .00 50 75 3.(1000 eXist The smauest value is shewn sales in $1000 Percent 5.9 63.0 56.023 50.0 2 19 10.00 .3 10.00 51.lissing 19 1 63.0 5.3 5.1ec!ial1 l.00 74.0 100.3 :-.3 26.3 5.3 5.9 84.3 :.0 5. the percentage and valid percentage columns are not identical.1 .0 5.(1 5.5 100.3 5. 1.3 5.3 10.13Y. For continuous data (sales).3 31'3 36.00 30.3 10.00 Total f.(1 5.00 60.2 89.2 68.00 80.4 5:.0000 75.3 5.5 52_6 57.00 1 5.(103 1449218 210.6316 68.linimurn 1.0 Valid Percent Cumulative Percent Frequency Valid 3000 43.17.00 72.0 10.5 15.iation Variance Rallge l. Statistics sales fJ Llean 1.0 5.lulllple modes 51.0 :0 5'.3 ~.lo':!6 SId.00 80.00 20 2 :-.I Ul11 111 Percentiles 2Ein S1000 vsuc I.00 68.51 = 24.0000 68.0 95. De·.3 5. The interquartile range is 75 .0000 75.0 5.0 5. a histogram is appropriate.00 66.8 21. and either the mean or median is the chosen measure of central tendency.0 5.3 5.00 64.0 :.0 5.0 CHAPTER 4 • Descriptive statistics .478.3 5.5 100.00 75.0 5.00 70.00 45.1issing Total 99.0 5.1 5.You will notice that because one case was missing.

[Figure: Histogram of sales in $1000 (Mean = 63.6316, Std. Dev. = 14.49218, N = 19)]

You may also notice that the data for the variable sales appear to be negatively skewed: the mean (63.63) lies below the median (68.00), with the tail of the distribution extending towards the lower sales values.

Correlation

Correlation looks at the relationship between two variables in a linear fashion. A Pearson product-moment correlation coefficient describes the relationship between two continuous variables and is available through the Analyze and Correlate menus. A correlation between two dichotomous or categorical variables is called a phi coefficient and is available through the Crosstabs option from the Analyze and Descriptive Statistics menus. A correlation between a continuous and a categorical variable is called a point-biserial correlation. This option is not available in the SPSS for Windows package. However, you can use a Pearson product-moment correlation to correlate a dichotomous and a continuous variable but the proportion of each category of the dichotomous variable must be approximately equal, and variables must be coded as 0 and 1. When the assumptions underlying correlation cannot be met adequately, a nonparametric alternative is Spearman's rank-order correlation.

In this chapter you will address bivariate and partial correlations using Pearson's product-moment correlation. Simple bivariate correlation, also referred to as zero-order correlation, refers to the correlation between two continuous variables and is the most common measure of linear relationship. This coefficient has a range of possible values from -1 to +1. The value indicates the strength of the relationship, while the sign (+ or -) indicates the direction. Partial correlation provides a single measure of linear association between two variables while adjusting for the effects of one or more additional variables.

Assumption testing

Correlational analysis has a number of underlying assumptions:

1 Related pairs - data must be collected from related pairs: that is, if you obtain a score on an X variable, there must also be a score on the Y variable from the same participant.
2 Scale of measurement - data should be interval or ratio in nature.
3 Normality - the scores for each variable should be normally distributed.
4 Linearity - the relationship between the two variables must be linear.
5 Homoscedasticity - the variability in scores for one variable is roughly the same at all values of the other variable. That is, it is concerned with how the scores cluster uniformly about the regression line.

Assumptions 1 and 2 are a matter of research design. Assumption 3 can be tested using the procedures outlined in chapters 3 and 4. Assumptions 4 and 5 can be tested by examining scatterplots of the variables.

Working example

Twenty students wishing to enter university were given an intelligence test (IQ) and their tertiary entrance examination scores (TEE) were recorded. You suspect that a positive relationship exists between these two variables and wish to test this directional hypothesis (one-tailed). At the end of the academic year, the course averages for the same 20 students were obtained. You also wish to determine whether the relationship between TEE scores and course average is significant when IQ is controlled in the analysis. The data file can be found in Work5.sav on the website that accompanies this title and is shown in the following figure.

To obtain a scatterplot

1 Select the Graphs menu.
2 Click on Scatter... to open the Scatterplot dialogue box.
3 Ensure that the Simple Scatterplot option is selected.
4 Click on the Define command pushbutton to open the Simple Scatterplot sub-dialogue box.
5 Select the first variable (i.e. tee) and click on the arrow button to move the variable into the Y Axis: box.
6 Select the second variable (i.e. iq) and click on the arrow button to move the variable into the X Axis: box.
7 Click on OK.

GRAPH
  /SCATTERPLOT(BIVAR)=iq WITH tee
  /MISSING=LISTWISE.

[Figure: Scatterplot of tee (Y axis) against iq (X axis)]

As you can see from the scatterplot, there is a linear relationship between IQ and TEE scores. Given that the scores cluster uniformly around the regression line, the assumption of homoscedasticity has not been violated.

Similarly, the scatterplot was obtained for TEE scores and course average, indicating that assumptions of linearity and homoscedasticity were not violated. The output of this plot is not displayed.

To obtain a bivariate Pearson product-moment correlation

1 Select the Analyze menu.
2 Click on Correlate and then Bivariate... to open the Bivariate Correlations dialogue box.
3 Select the variables you require (i.e. iq and tee) and click on the arrow button to move the variables into the Variables: box.
4 Ensure that the Pearson correlation option has been selected.
5 In the Test of Significance box, select the One-tailed radio button.
6 Click on OK.

CORRELATIONS
  /VARIABLES=iq tee
  /PRINT=ONETAIL NOSIG
  /MISSING=PAIRWISE.

Correlations
                                iq        tee
iq     Pearson Correlation      1         .767**
       Sig. (1-tailed)                    .000
       N                        25        25
tee    Pearson Correlation      .767**    1
       Sig. (1-tailed)          .000
       N                        25        25
** Correlation is significant at the 0.01 level (1-tailed).

To interpret the correlation coefficient, you examine the coefficient and its associated significance value (p). The output confirms the results of the scatterplot in that a significant positive relationship exists between IQ and TEE (r = .767, p < .05). Thus, higher intelligence scores are associated with higher TEE scores.

To obtain a partial correlation

1 Select the Analyze menu.
2 Click on Correlate and then on Partial... to open the Partial Correlations dialogue box.
3 Select the variables you wish to correlate (i.e. tee and uniav) and click on the arrow button to move the variables into the Variables: box.
4 Select the variable to be controlled (i.e. iq) and click on the arrow button to move the variable into the Controlling for: box.
5 In the Test of Significance box, select the One-tailed radio button.
6 Click on OK.

PARTIAL CORR
  /VARIABLES=tee uniav BY iq
  /SIGNIFICANCE=ONETAIL
  /MISSING=LISTWISE.

Correlations
Control Variables                                 tee      uniav
iq       tee      Correlation                     1.000    .354
                  Significance (1-tailed)                  .045
                  df                              0        22
         uniav    Correlation                     .354     1.000
                  Significance (1-tailed)         .045
                  df                              22       0

The output indicates that the relationship between TEE scores and course averages after the first year of university is also significant, having controlled for intelligence scores. Again, the higher the TEE score, the higher the course average for the first year of university (r = .354, p < .05).

Practice example

Medical researchers believe that there is a relationship between smoking and lung damage. Data were collected from smokers who have had their lung function assessed and their average daily cigarette consumption recorded. The researchers also believe that the relationship between these variables may be influenced by the number of years for which the person has been a smoker. Lung function was assessed in such a way that higher scores represent greater health. A negative relationship between the variables was expected.

Given the data in Prac5.sav, your tasks are to:
1 Check your data for violations of assumptions.
2 Conduct the appropriate analysis to determine whether cigarette consumption is related to lung capacity.
3 Determine whether the above relationship remains significant, having controlled for the length of the smoking habit.

Solutions

Syntax

DESCRIPTIVES VARIABLES=lungfunc cigsday years
  /STATISTICS=MEAN STDDEV KURTOSIS SKEWNESS.
GRAPH
  /SCATTERPLOT(BIVAR)=cigsday WITH lungfunc
  /MISSING=LISTWISE.
GRAPH
  /SCATTERPLOT(BIVAR)=years WITH lungfunc
  /MISSING=LISTWISE.
CORRELATIONS
  /VARIABLES=lungfunc cigsday
  /PRINT=ONETAIL NOSIG
  /MISSING=PAIRWISE.
PARTIAL CORR
  /VARIABLES=lungfunc cigsday BY years
  /SIGNIFICANCE=ONETAIL
  /MISSING=LISTWISE.

Output

Assumption testing

Normality

[Descriptive Statistics table (N = 25): mean, standard deviation, skewness and kurtosis for lung function, average cigarettes a day and years of smoking]

It is evident from the output that all variables are relatively normally distributed.

Linearity and homoscedasticity

[Scatterplot of lung function (vertical axis) against average cigarettes a day (horizontal axis)]

The scatterplot of cigarettes per day and lung function suggests a somewhat curvilinear relationship. However, because this tendency is not marked, the assumptions of linearity and homoscedasticity are not violated. The second scatterplot has a similar shape and indicates no violation of these assumptions.

[Correlations table: Pearson correlation between lung function and average cigarettes a day, with one-tailed significance and N]

Given that a directional hypothesis was stated, a one-tailed probability test was appropriate. There is a significant negative correlation between average number of cigarettes smoked per day and lung function, indicating that lung function decreases as cigarette consumption increases (r = -.825, p < .001).

[Correlations table: partial correlation between lung function and average cigarettes a day, controlling for years of smoking; df = 22]

The partial correlation coefficient, whereby years of smoking was controlled, is still significant (r = -.378, p < .05). Thus, it is demonstrated that the relationship between the number of cigarettes smoked per day and lung function is still significant, even controlling for years of smoking.
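The significance values that SPSS attaches to the correlations in this chapter can be related to a t statistic. The following is a minimal sketch in Python, used here only for illustration and not part of SPSS. The function names correlation_t and partial_r are hypothetical, and the three pairwise correlations passed to partial_r are made-up values, because the output in this chapter does not report them.

```python
import math

def correlation_t(r, df):
    """t statistic for testing a correlation coefficient r against zero.

    df is N - 2 for a bivariate correlation and N - 2 - k for a partial
    correlation with k control variables.
    """
    return r * math.sqrt(df) / math.sqrt(1.0 - r * r)

def partial_r(r_xy, r_xz, r_yz):
    """First-order partial correlation of x and y, controlling for z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))

# Working example, bivariate: r = .767 with N = 25, so df = 23.
t_bivariate = correlation_t(0.767, 23)

# Working example, partial: r = .354 with df = 22 (as shown in the output).
t_partial = correlation_t(0.354, 22)

# Hypothetical pairwise correlations, for illustration only.
example_partial = partial_r(0.5, 0.3, 0.4)

print(round(t_bivariate, 2), round(t_partial, 2), round(example_partial, 3))
```

Under these assumptions the partial correlation yields a t of about 1.78, which is just above the one-tailed critical value of roughly 1.72 for 22 degrees of freedom found in standard t tables. This agrees with the one-tailed significance of .045 in the output.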

CHAPTER 6

T-tests

A t-test is used to determine whether there is a significant difference between two sets of scores. Three main types of t-test may be applied:
• One-sample
• Independent groups
• Repeated measures.

Assumption testing

Each statistical test has certain assumptions that must be met before analysis. These assumptions need to be evaluated because the accuracy of test interpretation depends on whether assumptions are violated. Some of these assumptions are generic to all types of t-test, but others are more specific. The generic assumptions underlying all types of t-test are:
1 Scale of measurement - the data should be at the interval or ratio level of measurement.
2 Random sampling - the scores should be randomly sampled from the population of interest.
3 Normality - the scores should be normally distributed in the population.

Clearly, assumptions 1 and 2 are a matter of research design and not statistical analysis. Assumption 3 can be tested in a number of different ways, as outlined in chapter 3.

Working example

A major oil company developed a petrol additive that was supposed to increase engine efficiency. Twenty-two cars were test driven both with and without the additive, and the number of kilometres per litre was recorded. Whether the car was automatic or manual was also recorded and coded as 1 = manual and 2 = automatic. During an earlier trial 22 cars were test driven using the additive. The mean number of kilometres per litre was 10.5. You are interested in addressing the following questions:
1 Are the cars in the present trial running more efficiently than those in the earlier trial? The single sample t-test will help answer this question.
2 Does engine efficiency improve when the additive is used? This is a repeated measures t-test design.
3 Does engine efficiency with and without the additive differ between manual and automatic cars? This is an independent groups t-test.

The data can be found in Work6.sav on the website that accompanies this title and is shown in the following figure.

The one-sample t-test

The one-sample t-test is used when you have data from a single sample of participants and you wish to know whether the mean of the population from which the sample is drawn is the same as the hypothesised mean.

To conduct a one-sample t-test
1 Select the Analyze menu.
2 Click on Compare Means and then One-Sample T Test... to open the One-Sample T Test dialogue box.
3 Select the variable you require (i.e. withadd) and click on the ► button to move the variable into the Test Variable(s): box.
4 In the Test Value: box type the mean score (i.e. 10.5).
5 Click on OK.

T-TEST
  /TESTVAL=10.5
  /MISSING=ANALYSIS
  /VARIABLES=withadd
  /CRITERIA=CI(.95).
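The t-value that this procedure reports is the difference between the sample mean and the test value, divided by the standard error of the mean. A minimal sketch in Python follows, given only for illustration. The function name one_sample_t is hypothetical, and the inputs are the summary values SPSS prints for withadd in the one-sample output (N = 22, standard deviation 2.748, mean difference 3.364 from the test value of 10.5).

```python
import math

def one_sample_t(mean_difference, std_dev, n):
    """Return (t, standard error, degrees of freedom) for a one-sample t-test.

    mean_difference is the sample mean minus the hypothesised (test) value.
    """
    std_error = std_dev / math.sqrt(n)
    return mean_difference / std_error, std_error, n - 1

# Summary values taken from the SPSS output for withadd.
t_value, std_error, df = one_sample_t(mean_difference=3.364, std_dev=2.748, n=22)

print(round(t_value, 2), round(std_error, 3), df)
```

Because the inputs are rounded to three decimal places, the result can differ from the SPSS figure in the last digit.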

One-Sample Statistics
            N     Mean     Std. Deviation   Std. Error Mean
withadd     22    13.86    2.748            .586

One-Sample Test (Test Value = 10.5)
                                                             95% Confidence Interval of the Difference
            t       df    Sig. (2-tailed)   Mean Difference   Lower    Upper
withadd     5.741   21    .000              3.364             2.15     4.58

It is possible to determine whether a difference exists between the sample mean and the hypothesised mean by consulting the t-value, degree of freedom (df) and two-tail significance. If the value for two-tail significance is less than .05 (p < .05), then the difference between the means is significant. The output indicates that there is a significant difference in engine efficiency between the present trial and the earlier trial. That is, the cars in the present trial appear to have greater engine efficiency than that of those in the earlier trial, t(21) = 5.74, p < .05.

T-tests with more than one sample

In the last section, a one-sample t-test was used to determine whether a single sample of scores was likely to have been drawn from a hypothesised population. This section extends the understanding of sampling distributions to ask whether two sets of scores are random samples from the same or different populations. If they are random samples from the same population, then any differences across conditions or groups can be attributed to random sampling variability. However, if the two sets of scores are random samples from different populations, then you can attribute any difference between means across conditions to the independent variable or the treatment effect.

Repeated measures t-test

The repeated measures t-test, also referred to as the dependent-samples or paired t-test, is used when you have data from only one group of participants. In other words, an individual obtains two scores under different levels of the independent variable. Data that are collected from the same group of participants are also referred to as within-subjects, because the same subject performs in both conditions. Studies which employ a pretest-posttest design are commonly analysed using repeated measures t-tests. In this form of design, the same participant obtains a score on the pretest and, after some intervention or manipulation, a score on the posttest. You wish to determine whether the difference between means for the two sets of scores is the same or different.

Before you attempt to answer this question, you must ensure that the assumptions of repeated measures t-test are met. Remember, from the section on assumption testing, that a number of assumptions are generic to all types of t-test. The repeated measures t-test has one additional assumption:
1 Normality of population difference scores - the difference between the scores for each participant should be normally distributed. Providing the sample size is not too small (30+), violations of this assumption are of little concern.

Testing this assumption involves the same procedures used for the single-sample t-test. Because you have two dependent variables, you need to test the normality of each variable separately, which allows you to assume that the difference scores are normally distributed.

Having evaluated the assumption of normality for both pretest and posttest measures, you are ready to conduct a repeated measures t-test.

Error Mean .335 2.651 Upper -4.586 Paired Samples Correlations N Pair 1 without & withadd 22 Correlation .86 N 22 22 Std. to open the Select the variables you require (i.000 By looking at the t-value. without and withaddi and press the IE button to move the variables into the Paired Variables: box.748 Std. The 95 per cent confidence interval indicates that 95 per cent of the time the interval specified will contain the true difference between the population means.663 df 21 Mean Pair 1 without .904 Std.e. The correct way to determine significance is to consult the critical totables that are available at the back of most statistical textbooks. Paired Samples Statistics Mean Pair 1 without withadd 8.withadd -5.711 . . If the probability value is less than the specified alpha value. T Test. Deviation 3. spss: Analysis without Anguish . then the observed t-value is significant.619 Sig. However. T-TEST PAIRS= without WITH withadd (PAIRED) /CRITERIA=C1 (.076 t -8.. using the degrees of freedom. significance can also be determined by looking at the probability level (p) specified under the heading 'two-tail significance' .007 Paired Samples Test Paired Differences 95% Confidence Interval of the Difference Lower -6..50 13.1 2 3 Select the Analyze menu. (2-tailed) . Deviation 2.95) /MISSING=ANAL YSIS. Click on Compare Means and then Paired-Samples Paired-Sample T Test dialogue box. df and two-tail significance you can determine whether the groups come from the same or different populations.559 Sig. 4 Click on OK.364 Std. Error Mean .

As can be seen from the output, a significant difference exists between engine efficiency with and without the additive, t(21) = -8.66, p < .05. The additive significantly improves the number of kilometres to the litre.

Independent groups t-test

An independent groups t-test is appropriate when different participants have performed in each of the different conditions, in other words, when the participants in one condition are different from the participants in the other condition. This is commonly referred to as a between-subjects design. Again, you wish to determine whether the difference between means for the two sets of scores is significant. The independent groups t-test has two additional assumptions.
1 Independence of groups - participants should appear in only one group and these groups are unrelated.
2 Homogeneity of variance - the groups should come from populations with equal variances. To test for homogeneity of variance, SPSS uses the Levene test for equality of variances. If this test is significant (p < .05), then you reject the null hypothesis and accept the alternative hypothesis that the variances are unequal. In this instance the unequal variance estimates are consulted. If the test is not significant (p > .05), then you accept the null hypothesis that there are no significant differences between the variances of the groups. In this case you would consult the equal variance estimates. This explanation will make more sense when you consult the output of the independent groups t-test.

Assumption 1 is a matter of research design while assumption 2 is tested in the independent groups analysis.

Before proceeding you need to check the normality of the data. Because you have different participants in each condition, you need to check the normality of each set of scores separately. This is achieved through the Explore dialogue box using the Factor List option.

To screen for normality
1 Select the Analyze menu.
2 Click on Descriptive Statistics and then Explore... to open the Explore dialogue box.
3 Select the dependent variable(s) (i.e. without and withadd) and click on the ► button to move the variables into the Dependent List: box.
4 Select the grouping variable (i.e. cartype) and click on the ► button to move this variable into the Factor List: box.
5 Click on OK.

EXAMINE VARIABLES=without withadd BY cartype
  /COMPARE GROUP
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.

[Descriptives table: mean, 95% confidence interval for the mean, 5% trimmed mean, median, variance, standard deviation, minimum, maximum, range, interquartile range, skewness and kurtosis for without and withadd, reported separately for manual and automatic cars]

In reviewing the descriptive statistics and the other output such as stem-and-leaf plots and boxplots (not shown), it is clear that there is minimal violation to the assumption of normality.

To conduct an independent groups t-test
1 Select the Analyze menu.
2 Click on Compare Means and then Independent-Samples T Test... to open the Independent-Samples T Test dialogue box.
3 Select the test variable(s) (i.e. without) and then click on the ► button to move the variables into the Test Variable(s): box.
4 Select the grouping variable (i.e. cartype) and click on the ► button to move the variable into the Grouping Variable: box.
5 Click on the Define Groups... command pushbutton to open the Define Groups sub-dialogue box.
6 In the Group 1: box, type the lowest value for the variable (i.e. 1), then tab. Enter the second value for the variable (i.e. 2) in the Group 2: box.
7 Click on Continue and then OK.

T-TEST
  GROUPS=cartype(1 2)
  /MISSING=ANALYSIS
  /VARIABLES=without
  /CRITERIA=CI(.95).

The syntax for the independent groups t-test is different from that of the repeated measures t-test. In the case of the independent groups t-test you have a grouping variable so you can distinguish between groups 1 and 2 when comparing engine efficiency. Although it is possible to perform two t-tests with the one command, for the sake of clarity two separate procedures are shown.

T-TEST
  GROUPS=cartype(1 2)
  /MISSING=ANALYSIS
  /VARIABLES=withadd
  /CRITERIA=CI(.95).

Group Statistics
           cartype      N     Mean     Std. Deviation   Std. Error Mean
without    manual       11    8.00     2.864            .864
           automatic    11    9.00     3.821            1.152

Independent Samples Test (without): t-test for Equality of Means
                                                                                                     95% Confidence Interval of the Difference
                               t        df        Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower     Upper
Equal variances assumed        -.695    20        .495              -1.000            1.440                   -4.003    2.003
Equal variances not assumed    -.695    18.539    .496              -1.000            1.440                   -4.018    2.018
[Levene's Test for Equality of Variances: F and Sig. columns]

Given that Levene's test has a probability greater than .05, you can assume that the population variances are relatively equal. Therefore, you can use the t-value, df and two-tail significance for the equal variance estimates to determine whether car type differences exist. The two-tail significance for without additive indicates that p > .05 and thus is not significant. You therefore accept the null hypothesis and reject the alternative hypothesis. The two groups must come from the same population because no significant differences exist, t(20) = -.695, p > .05.

Group Statistics
           cartype      N     Mean     Std. Deviation   Std. Error Mean
withadd    manual       11    14.36    2.063            .622
           automatic    11    13.36    3.325            1.002

Independent Samples Test (withadd): t-test for Equality of Means
                                                                                                     95% Confidence Interval of the Difference
                               t        df        Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower     Upper
Equal variances assumed        .848     20        .407              1.000             1.180                   -1.461    3.461
Equal variances not assumed    .848     16.683    .409              1.000             1.180                   -1.492    3.492
[Levene's Test for Equality of Variances: F and Sig. columns]
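The two rows of the Independent Samples Test table differ in how the standard error and degrees of freedom are obtained. The Python sketch below is illustrative only. The function name independent_t is hypothetical, and the unequal-variance row is assumed to follow the usual Welch-Satterthwaite approximation, which reproduces the degrees of freedom shown for the without variable.

```python
import math

def independent_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2):
    """Return equal-variance (pooled) and unequal-variance (Welch)
    t statistics with their degrees of freedom, from summary statistics."""
    diff = mean_1 - mean_2

    # Equal variances assumed: pool the two sample variances.
    pooled_var = ((n_1 - 1) * sd_1 ** 2 + (n_2 - 1) * sd_2 ** 2) / (n_1 + n_2 - 2)
    se_pooled = math.sqrt(pooled_var * (1.0 / n_1 + 1.0 / n_2))

    # Equal variances not assumed: keep the variances separate.
    v_1, v_2 = sd_1 ** 2 / n_1, sd_2 ** 2 / n_2
    se_welch = math.sqrt(v_1 + v_2)
    df_welch = (v_1 + v_2) ** 2 / (v_1 ** 2 / (n_1 - 1) + v_2 ** 2 / (n_2 - 1))

    return {
        "equal": (diff / se_pooled, n_1 + n_2 - 2),
        "unequal": (diff / se_welch, df_welch),
    }

# Group statistics for without: manual cars versus automatic cars.
result = independent_t(8.00, 2.864, 11, 9.00, 3.821, 11)
t_equal, df_equal = result["equal"]
t_unequal, df_unequal = result["unequal"]

print(round(t_equal, 3), df_equal, round(t_unequal, 3), round(df_unequal, 2))
```

With equal group sizes the two t-values coincide and only the degrees of freedom differ, which is why both rows of the table show the same t for each variable.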

In relation to withadd, Levene's test was not significant and thus we interpret the equal variance estimates. Consulting our t-value, df and two-tail significance, again no significant differences are apparent (p > .05). That is, there is no significant difference in engine efficiency between manual and automatic cars either with or without the additive, t(20) = .848, p > .05.

Practice example

You have been asked to determine whether hypnosis enhances memory. Forty men and women are given five minutes to attempt to memorise a list of unrelated words. They are then asked to recall as many as they can. The next week they are asked to memorise a similar list of words and then to recall as many as possible while under hypnosis. You performed a study last year with another sample, so you have access to descriptive statistics from a similar group of adults. The mean number of words recalled in the earlier study, without hypnosis, was 34.6.

Given the data in Prac6.sav, your tasks are to:
1 Determine whether the participants in the present study are comparable with those in the earlier study in terms of recall in a normal state.
2 Determine whether there was any change in recall as a result of hypnosis for the entire sample.
3 Determine whether men and women recall equal numbers of words when under hypnosis.

Solutions

Syntax

EXAMINE VARIABLES=natrecal hyprecal BY gender
  /PLOT BOXPLOT STEMLEAF
  /COMPARE GROUP
  /STATISTICS DESCRIPTIVES
  /CINTERVAL 95
  /MISSING LISTWISE
  /NOTOTAL.
T-TEST
  /TESTVAL=34.6
  /MISSING=ANALYSIS
  /VARIABLES=natrecal
  /CRITERIA=CI(.95).
T-TEST
  PAIRS=natrecal WITH hyprecal (PAIRED)
  /CRITERIA=CI(.95)
  /MISSING=ANALYSIS.
T-TEST
  GROUPS=gender(1 2)
  /MISSING=ANALYSIS
  /VARIABLES=hyprecal
  /CRITERIA=CI(.95).

Output

Assumption testing

[Descriptives table: mean, 95% confidence interval for the mean, 5% trimmed mean, median, variance, standard deviation, minimum, maximum, range, interquartile range, skewness and kurtosis for recall in natural state and recall under hypnosis, reported separately for females and males]

Normality was assessed through the Explore option. An examination of the output reveals that the data are normally distributed for each group.

p < .081 Paired Samples Correlations N Pair 1 recall in natural state & recall under hypnosis Correlation Sig.15 Std. Error Mean 2.70 The t-value is significant (p < .05. ! (39) = 4. Error Mean 2.40 t recall in natural state 4.474 -25.98 N 40 40 Std.051 2. Deviation 12.829 -15.169 df 39 Sig.475 11. (2-tailed) .00 Std.497 CHAPTER 6 • T-tests • .169.970 13. Error Mean 2.790 2. Deviation Std.825 15.283 . Inspection of the means suggests that the present sample has better word recall than that of the earlier sample. ! (39) = -8.550 I I Upper 12.05.051 One-Sample Test Test Value = 34. Deviation 12. T-tests for independent group samples Group Statistics I gender recall under hypnosis female male N 20 20 Mean 69.95 58.970 Std. 40 .418 39 .15 63.169 Std.418.05) and therefore it can be stated that there is a significant improvement in recall when the participants are under hypnosis. Deviation 12.One-sample t-tests One-Sample N recall in natural state 40 Statistics Mean 43.077 Paired Samples Test Paired Differences 95% Confidence Interval of the Difference Lower Upper T df Mean Pair 1 recall in natural state . (2-tailed) -20. T-tests for paired samples Paired Samples Statistics Mean Pair 1 recall in natural state recall under hypnosis 43.05) and therefore it can be concluded that the present sample is significantly different from the previous sample.000 The t-value is significant (p < .161 Std.647 2.821 -8. P < .6 95% Confidence Interval of the Difference Lower 4.recall under hypnosis Std. Error Mean Sig.000 Mean Difference 8.

Independent Samples Test (recall under hypnosis)
Levene's Test for Equality of Variances: F = .036, Sig. = .851
t-test for Equality of Means
                                                                                                     95% Confidence Interval of the Difference
                               t        df        Sig. (2-tailed)   Mean Difference   Std. Error Difference   Lower    Upper
Equal variances assumed        3.192    38        .003              11.950            3.744                   4.370    19.530
Equal variances not assumed    3.192    37.544    .003              11.950            3.744                   4.367    19.533

Given that the Levene's test has a probability greater than .05 it can be assumed that there is homogeneity of variance. From the equal estimates t-value it can be seen that there is significance (p < .05) and therefore it can be stated that there was a significant difference between men and women in the recall of words under hypnosis, t(38) = 3.192, p < .05. Inspection of the means in the group statistics table suggests that women (Mean = 69.95) recalled more words under hypnosis than men (Mean = 58).

CHAPTER 7

One-way between-groups ANOVA with post-hoc comparisons

In the last chapter you tested the null hypothesis that two population means were equal. When you wish to compare the means of more than two groups or levels of an independent variable, a one-way analysis of variance (ANOVA) is appropriate.

At the heart of ANOVA is the notion of variance. The basic procedure is to derive two different estimates of population variance from the data, then calculate a statistic from the ratio of these two estimates. One of these estimates (between-groups variance) is a measure of the effect of the independent variable combined with error variance. The other estimate (within-groups variance) is of error variance by itself. The F-ratio is the ratio of between-groups variance to within-groups variance. A significant F-ratio indicates that the population means are probably not all equal. Because the null hypothesis is rejected if any pair of means is unequal, where the significant differences lie needs to be worked out. This requires post-hoc analysis.

Post-hoc analysis involves hunting through the data for any significance, that is, doing an entire set of comparisons. This type of testing carries risks of type I errors. Unlike planned comparisons, post-hoc tests are designed to protect against type I errors, given that all the possible comparisons are going to be made. These tests are stricter than planned comparisons and so it is harder to obtain significance. There are a number of post-hoc tests available. The more options a test offers, the stricter its determination of significance. The Scheffé test, for example, allows every possible comparison to be made but is tough on rejecting the null hypothesis. In contrast, Tukey's honestly significant difference (HSD) test is more lenient, but the types of comparison that can be made are restricted. This chapter illustrates the Tukey HSD post-hoc test.

Assumption testing

Before conducting the ANOVA the necessary assumptions must be met. The assumptions for ANOVA are the same as those for the t-test. The two assumptions of concern are:
1 Population normality - populations from which the samples have been drawn should be normal. Check this for each group using normality statistics such as skewness and Shapiro-Wilk.
2 Homogeneity of variance - the scores in each group should have homogeneous variances. As with the t-test, Levene's test determines whether variances are equal or unequal.
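The two variance estimates behind the F-ratio described in this chapter can be sketched in a few lines of Python. This is an illustration only, not how SPSS is operated. The function name one_way_f and the three small groups of scores are hypothetical, chosen so that the arithmetic is easy to follow by hand.

```python
def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way between-groups ANOVA.

    groups is a list of lists, one list of scores per group.
    """
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-groups: spread of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-groups: spread of the scores around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within

# Hypothetical scores for three groups (not from any data file in this book).
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio, df_between, df_within = one_way_f(scores)

print(round(f_ratio, 2), df_between, df_within)
```

In this made-up example the group means (2, 3 and 6) differ far more than the scores vary within each group, so the F-ratio is large.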

Working example

An economist wished to compare household expenditure on electricity and gas in four major cities in Australia. She obtained random samples of 25 two-person households from each city and asked them to keep records of their energy expenditure over a six-month period. The data file can be found in Work7.sav on the website that accompanies this title and is shown in the following figure.

Note that this is an independent groups design because different households are in different cities. If the same participants were in all conditions, then it would be a within-subjects or repeated measures design. A repeated measures design requires a different class of procedures, so this design is discussed in chapter 10.

To conduct a one-way ANOVA with post-hoc analysis
1 Select the Analyze menu.
2 Click on Compare Means and One-Way ANOVA... to open the One-Way ANOVA dialogue box.
3 Select the dependent variable (i.e. cost) and click on the ► button to move the variable into the Dependent List: box.
4 Select the independent variable (i.e. city) and click on the ► button to move the variable into the Factor: box.

5 Click on the Options... command pushbutton to open the One-Way ANOVA: Options sub-dialogue box.
6 Click on the check boxes for Descriptive and Homogeneity-of-variance.
7 Click on Continue.
8 Click on the Post Hoc... command pushbutton to open the One-Way ANOVA: Post Hoc Multiple Comparisons sub-dialogue box. You will notice that a number of multiple comparison options are available. In this example you will use the Tukey's HSD multiple comparison test.
9 Click on the check box for Tukey.
10 Click on Continue and then OK.

ONEWAY
  cost BY city
  /STATISTICS DESCRIPTIVES HOMOGENEITY
  /MISSING ANALYSIS
  /POSTHOC = TUKEY ALPHA(.05).

[Descriptives table for cost of electricity and gas: N, mean, standard deviation, standard error, 95% confidence interval for the mean, minimum and maximum for Adelaide, Hobart, Melbourne and Perth (N = 25 each) and for the total sample (N = 100)]

Test of Homogeneity of Variances
cost of electricity and gas
Levene Statistic    df1    df2    Sig.
.817                3      96     .488

In interpreting this output you must first ensure that the homogeneity assumption has not been violated. Levene's test for homogeneity of variances is not significant (p > .05) and therefore you can be confident that the population variances for each group are approximately equal.

ANOVA
cost of electricity and gas
                  Sum of Squares   df    Mean Square   F        Sig.
Between Groups    44947.000        3     14982.333     3.802    .013
Within Groups     378299.040       96    3940.615
Total             423246.040       99

To determine whether you have a significant F-ratio you use the degrees of freedom (df) (3, 96), the F-ratio and the F-probability. Again, the correct way to determine significance is to use the critical F tables. Significance can also be determined by looking at the F-probability value. Given that p < .05 you can reject the null hypothesis and accept the alternative hypothesis that states that expenditure on electricity and gas is different across capital cities, F(3, 96) = 3.802, p < .05.

Having obtained a significant result you can also go further and determine, using a Tukey HSD test, where the significance lies, that is, between which cities is there a significant difference in energy costs? You can see by looking at the results of the Tukey test in the table that follows that Adelaide and Perth have significantly different mean energy costs.

[Multiple Comparisons table (Tukey HSD), dependent variable cost of electricity and gas: mean difference (I-J), standard error, significance and 95% confidence interval for each pair of cities. The Adelaide-Perth mean difference of -57.840 (Std. Error 17.755, Sig. .008) is flagged as significant at the .05 level]

sav.05).491 . ONEWAY wtgain BY supp ISTATISTICS HOMOGENEITY IMISSING ANALYSIS IPOSTHOC = TUKEY ALPHA(. Determine whether there are significant differences in weight gain across the food supplements.00 26. Given the data in Prac7. your tasks are to: 1 2 3 Test the underlying assumptions of ANOVA.000 5. Output Assumption Testing Normality Oescriptives food supplement weight gain supplement A Mean 95% Confidence Interval for Mean 5% Trimmed Mean Median Variance Std. Locate the source of these differences using post-hoc analysis.087 Lower Bound Upper Bound .062 -.953 (continued) CHAPTER 7 • One-way between-groups ANOVA with post-hoc comparisons . Error 1.195 Std.00 9.26 12. Each group had a different supplement added to its food. Deviation Minimum Maximum Range Interquartile Range Skewness Kurtosis Statistic 12.099 1 22 21 7 . and the rats' weight gain over the ensuing six months was recorded in grams.Practice example A biologist wished to examine the nutrient value of six different food supplements. Syntax EXAMINE VARIABLES=wtgain BY supp ICOMPARE GROUP ISTATISTICS DESCRIPTIVES ICINTERVAL 95 IMISSING LlSTWISE INOTOTAL.05 11. One hundred and fifty-four rats of the same species were randomly assigned to one of six groups. Solutions 4jjJ.74 14.
