Republic of Rwanda Kigali Independent University Ulk P.O Box 2280, KIGALI
ICT SKILLS
Module IV: SPSS
Compiled by:
1 Module Code: ALL 40X    Faculty: ALL
2 Module Title: GENERAL SKILLS 4
Lectures                                  40     50
Practical classes/laboratory              10     10
Structured exercises                      30     40
Set reading etc.                          10     10
Self-directed study                       10     --
Assignments – preparation and writing     10     --
Examination – revision and attendance     10     10
TOTAL                                    120    120
The objective of this course is to equip students with the highly practical ICT skills needed in
the marketplace. Trainees will learn how a computer works, the desktop environment,
and file management. They will then learn how to use various software packages effectively for
decision support. Particular attention will be paid to ensuring that best practice and quality
issues are understood and implemented, so as to help trainees improve productivity at work.
i) Knowledge and Understanding; Cognitive/Intellectual skills/Application of
Knowledge; Communication/ICT/Numeracy/Analytic Techniques/Practical Skills
and General Transferable Skills
- To make students understand that English is a vital working tool globally.
7. Indicative Content
. i: indefinite adjectives
. ii: conjunctions
. viii: gerunds
. xi: consolidation
- Workshops
9. Assessment Strategy
-practical group class work
-Various assignments and lastly the final examination on the general module
10 Assessment Pattern
In-course assessment:
Assignment 30% 1, 2, 3
Final assessment:
End-of-Semester 40% 1, 2, 3
Examination
Each presentation is marked, and the marks are posted on the course web page on the University
Online Campus Platform, with immediate feedback (direct contact with the student or
contact through the online courses platform).
12 Indicative Resources
Core Text (include number in library or URL) (inc ISBN)
TABLE OF CONTENTS
Module Objective
UNIT I: DATABASE (MICROSOFT ACCESS 2007)
1.0 Database terminologies
2.0 Steps in designing a database
3.0 Getting Started with MS Access 2007
4.0 Data Types
5.0 Table Relationships
UNIT II: STATISTICAL PACKAGE FOR SOCIAL SCIENCES (SPSS)
2.1 Introduction
2.2 Starting SPSS for Windows
2.3 Entering data into Data editor
2.4 Using the Help System
2.5 Reading Data
2.6 Using the Data Editor
3.0 Running an Analysis
3.1 Viewing Results
3.2 Examining Summary Statistics for Individual Variables
3.2.1 Level of Measurement
3.3 Cross-tabulation tables
3.3.1 Creating and Editing Charts
3.3.2 Making analysis with the help of syntax
3.3.4 Sorting and Selecting Data
Module Objective
The objective of the module is to equip students with the necessary practical skills in
computerized databases and in data analysis and interpretation, using MS Access
and SPSS (Statistical Package for Social Sciences) respectively. In MS Access, students
will learn how to create a database and related tables, create forms and relationships, define
and capture data, and build queries and reports. In SPSS, students will learn variable
definition, data capture and manipulation, running analyses, creating and editing charts, and
working with the output.
MS Access is a tool that lets users access data in a relational database management system by
describing the data they wish to see. It also lets users define the data in a database and
manipulate it. Other such tools include Oracle, Sybase, Informix, Microsoft SQL Server, and
others.
Database File: This is the main file that encompasses the entire
database and that is saved to your hard drive or floppy disk.
Example: StudentDatabase.mdb
Examples of database applications include computerized library systems, automated teller
machines, flight reservation systems, names and addresses, business contacts, customers and sales
prospects, employee and personnel information, invoices, payments, and bookkeeping.
Table: a single collection of data about a particular subject or area. The data is presented in
columns (known as fields or attributes) and rows (known as records). [Example: an address book
table.]
Field: a column in a table that contains a specific type of information. [Example: in an
address book, one field could be NAME or PHONE.]
Record: one complete entry or row in a table that contains a collection of information about
one particular item. [Example: in an address book, one record could be for John Smith, 223-7456.]
Key: one particular field that always contains a unique value for every record in the table.
[Example: SSN is often used as a key field.]
Query: a specific database search that answers a question about the data and produces a dynaset
as its result.
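The terms above can be illustrated outside Access with a minimal sketch. It assumes Python's built-in sqlite3 module as a stand-in for Access; the table name addressbook, the field contact_id, and the second contact are hypothetical and are not part of the module.

```python
import sqlite3


def build_address_book():
    """Create an in-memory address book table and return the connection."""
    conn = sqlite3.connect(":memory:")
    # Table: one collection of data about one subject (contacts).
    # Fields: the columns contact_id, name and phone.
    # Key: contact_id holds a unique value for every record.
    conn.execute(
        "CREATE TABLE addressbook ("
        "contact_id INTEGER PRIMARY KEY, name TEXT, phone TEXT)"
    )
    # Records: each row is one complete entry about one person.
    conn.executemany(
        "INSERT INTO addressbook (contact_id, name, phone) VALUES (?, ?, ?)",
        [(1, "John Smith", "223-7456"), (2, "Mary Jones", "555-0100")],
    )
    return conn


def find_phone(conn, name):
    """Query: answer the question 'what is this person's phone number?'"""
    row = conn.execute(
        "SELECT phone FROM addressbook WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    connection = build_address_book()
    print(find_phone(connection, "John Smith"))
```

In Access the same table would be built in Design View and the question answered with a saved query; the sketch only shows how the five terms fit together.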
The first step in designing an Access database is to determine the purpose of the database and
how it is to be used. This tells you what information you want from the database. From that,
you can determine what subjects you need to store facts about (the tables) and what facts you
need to store about each subject (the fields in the tables).
Talk to the people who will use the database and brainstorm about the questions you'd like
the database to answer.
Step Two: Determine the tables you need. Once you have a clear purpose for your
database, you can divide your information into separate subjects such as "Employees" or
"Orders". Each subject will be a table in your database.
The power of a relational database management system such as MS Access comes from its
ability to quickly find and bring together information stored in separate tables. For
MS Access to work most efficiently, each table in your database should include a field or set
of fields that uniquely identifies each individual record stored in the table. This is often a
unique identification number, such as an employee ID number or a serial number.
In database terminology, this information is called the primary key of the table. MS Access
uses primary key fields to quickly associate data from multiple tables and bring them together
for you. MS Access doesn't allow duplicate values in a primary key field. For example, don't
use people's names as a primary key, because names aren't unique: you could easily have two
people with the same name in the same table. If you don't already have a unique identifier in
mind for a table, you can use a field that simply numbers the records consecutively.
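As a rough illustration of these two points (names are not unique, and a consecutive number can serve as the key), here is a minimal sketch using Python's built-in sqlite3 module rather than Access. The employees table and its field names are assumptions made for the example.

```python
import sqlite3


def demo_primary_key():
    """Return the automatically assigned keys and whether a duplicate key was rejected."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT)"
    )
    # Two people share a name, so the name could not serve as the key.
    # Leaving employee_id out lets the database number the records itself.
    conn.execute("INSERT INTO employees (name) VALUES ('John Smith')")
    conn.execute("INSERT INTO employees (name) VALUES ('John Smith')")
    ids = [
        row[0]
        for row in conn.execute(
            "SELECT employee_id FROM employees ORDER BY employee_id"
        )
    ]
    # Reusing an existing key value is refused by the database.
    try:
        conn.execute(
            "INSERT INTO employees (employee_id, name) VALUES (1, 'Jane Doe')"
        )
        duplicate_rejected = False
    except sqlite3.IntegrityError:
        duplicate_rejected = True
    return ids, duplicate_rejected


if __name__ == "__main__":
    print(demo_primary_key())
```

The automatic numbering plays the role that an AutoNumber field plays in Access.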
Note: When choosing primary key fields, keep these points in mind:
Now that you've divided your information into tables, you need a way to tell Microsoft
Access how to bring it back together again in meaningful ways. Microsoft Access is a
relational database management system that stores related data in separate tables. Then you
define relationships between the tables, and Microsoft Access uses the relationships to find
associated information stored in your database.
Three types of relationships:
One-to-many relationships
Many-to-many relationships
One-to-one relationships
A many-to-many relationship takes place when many occurrences of one entity are related to many
occurrences of a second entity, and vice versa. For example, employees must be assigned to at
least one, and possibly more, departments. In order to support a many-to-many dimension
relationship, a primary key–foreign key relationship must be defined in the data source view
between all the tables that are involved. Otherwise, you will not be able to select the correct
intermediate measure group.
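One common way to store a many-to-many relationship is a third, intermediate table that holds one row per pairing. The sketch below assumes that approach and uses Python's sqlite3 module instead of Access; the table names, field names, and sample rows are all hypothetical.

```python
import sqlite3


def employees_by_department():
    """Return (department, employee) pairs resolved through a junction table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE employees (employee_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
        -- Junction table: one row per (employee, department) assignment.
        CREATE TABLE assignments (
            employee_id INTEGER REFERENCES employees(employee_id),
            dept_id INTEGER REFERENCES departments(dept_id),
            PRIMARY KEY (employee_id, dept_id)
        );
        INSERT INTO employees VALUES (1, 'Alice'), (2, 'Bob');
        INSERT INTO departments VALUES (10, 'Sales'), (20, 'Finance');
        INSERT INTO assignments VALUES (1, 10), (1, 20), (2, 10);
        """
    )
    # Alice belongs to two departments, and Sales has two employees.
    return conn.execute(
        """
        SELECT d.dept_name, e.name
        FROM assignments AS a
        JOIN employees AS e ON e.employee_id = a.employee_id
        JOIN departments AS d ON d.dept_id = a.dept_id
        ORDER BY d.dept_name, e.name
        """
    ).fetchall()


if __name__ == "__main__":
    for dept_name, employee_name in employees_by_department():
        print(dept_name, employee_name)
```

Each side of the pairing is then an ordinary one-to-many relationship with the intermediate table.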
In a one-to-one relationship, a record in Table A can have no more than one matching record
in Table B, and a record in Table B can have no more than one matching record in Table A.
For example, a ROOF covers one BUILDING; a BUILDING is covered by one ROOF.
To prevent the duplication of information in a database by repeating fields in more than one
table, table relationships can be established to link fields of tables together.
Follow the steps below to set up a relational database:
- Click the Relationships button on the toolbar.
- From the Show Table window, double-click the names of the tables you would like to
include in the relationships. When you have finished adding tables, click Close.
- To link fields in two different tables, click and drag a field from one table to the
corresponding field on the other table and release the mouse button.
The Ribbon
The Ribbon has four tabs: Home, Create, External Data, and Database Tools. Each tab is divided into
groups. The groups are logical collections of features designed to perform functions that you
will use in developing or editing your Access database.
Home: Views, Clipboard, Fonts, Rich Text, Records, Sort & Filter, Find
Create: Tables, Forms, Reports, Other
External Data: Import, Export, Collect Data, SharePoint Lists
Database Tools: Show/Hide, Analyze, Move Data, Database Tools, Macro.
Database Terms
Tables
Query
Queries select records from one or more tables in a database so they can be viewed, analyzed,
and sorted on a common datasheet. A query can also perform calculations and display the
results. The resulting collection of records, called a dynaset (short for dynamic subset), is
saved as a database object and can therefore be easily used in the future. The query will be
updated whenever the original tables are updated.
To run a query:
Form
A form is a graphical interface that is used to display and edit data. Forms can be developed
from a table or a query. Forms can include calculations, graphics and objects.
Report
A report is an output of data arranged in the order you specify. Reports can perform
calculations and display the results. Reports can be used to print data.
To view data using a form:
You can create a new database from scratch or you can create a database from the database
wizard.
New Database
To create a new database from scratch:
Create a Table
Table Views
There are two ways to view a table in Access: Design View and
Datasheet View.
In Design View you can view all the fields with their data types and descriptions. The records
that have been added to the database are not shown in this view.
To go to Design View:
In Datasheet View you can display the records in a table, where one row is one record. The
column headers are the fields you have defined for the database.
To go to Datasheet View:
There are many ways to enter new fields into a database. New fields can be added in the
Datasheet View or in the Design View.
There are two ways to add a new field in Datasheet View: Add A New Field or the New
Field Button.
4.0 Data Types
There are many types of data that a field can be predefined to hold. When you create a new
field in a database you should closely match the data type to what will be entered into the
field.
When creating tables, define the data type of each field so that it most closely matches the
type of data that will be entered in the field.
Manage Tables
Delete a Table
To delete a table:
Open the desired database by clicking the Microsoft Office Button and clicking Open
Right click on a table and choose Delete
Rename a Table
To rename a table:
Open the desired database by clicking the Microsoft Office Button and clicking Open
Right click on a table and choose Rename
Type in the new name
Keys
Primary Key
The primary key is a unique identifier for a record. The primary key cannot be the same for
two records. This field can never be blank.
Foreign Key
A foreign key is a field or combination of fields that are related to the primary key of another
table.
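The link between a foreign key and the primary key it refers to can be sketched as follows. This is an illustration with Python's sqlite3 module, not Access itself; the customers and orders tables are hypothetical, and the check shown here corresponds loosely to what Access calls enforcing referential integrity.

```python
import sqlite3


def demo_foreign_key():
    """Return True if a record pointing at a missing primary key is rejected."""
    conn = sqlite3.connect(":memory:")
    # SQLite only checks foreign keys when this setting is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)"
    )
    conn.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, "
        "customer_id INTEGER NOT NULL REFERENCES customers(customer_id))"
    )
    conn.execute("INSERT INTO customers VALUES (1, 'John Smith')")
    # Valid: customer 1 exists, so the foreign key has a matching primary key.
    conn.execute("INSERT INTO orders VALUES (100, 1)")
    try:
        # Invalid: there is no customer 99 for this order to refer to.
        conn.execute("INSERT INTO orders VALUES (101, 99)")
        orphan_rejected = False
    except sqlite3.IntegrityError:
        orphan_rejected = True
    return orphan_rejected


if __name__ == "__main__":
    print(demo_foreign_key())
```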
Table relationships are the associations of data between tables. By defining table
relationships, you can pull records from related tables based on matching fields.
One-to-One Relationship
A one-to-one relationship is between two tables where the primary key in one table and the
foreign key in another table are the same. For each record in the first table, there is a single
matching record in the second table.
One-to-Many Relationship
A one-to-many relationship occurs between two tables where a primary key value in one table
can appear many times as a foreign key value in another table.
Select the desired tables
Click Add
Click Close
Querying a Database
A query allows you to select and filter data from multiple tables. Queries can be saved and
utilized as often as you need them.
Query Wizard
The Query Wizard walks you through the steps to set up a query. To run a query using the
query wizard:
You can also design a query with the Query Design Button. To design a query using the
Query Design Button:
Select the tables that you would like to query
Click Add
Double click the name of the field you would like to query
Repeat this process for as many fields as you would like in the query
Click Run
Query Criteria
Query criteria are search conditions used in a query to retrieve specific data. You can set
query criteria to be a specific number or data set, or you can set the criteria to be a range of
data.
Type in the appropriate query criteria in the Criteria Box
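The two kinds of criteria described above, a specific value and a range, can be sketched as search conditions in SQL. The example uses Python's sqlite3 module as a stand-in for the Access query grid; the products table, its rows, and the helper names are assumptions made for illustration.

```python
import sqlite3


def build_products():
    """Create a small in-memory products table for the criteria examples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE products (name TEXT, price REAL)")
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [
            ("Pen", 2.0),
            ("Notebook", 5.0),
            ("Calculator", 15.0),
            ("Backpack", 30.0),
        ],
    )
    return conn


def names_where(conn, condition, params):
    """Return product names matching a search condition (the query criteria)."""
    sql = "SELECT name FROM products WHERE " + condition + " ORDER BY name"
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    connection = build_products()
    # Criterion set to one specific number.
    print(names_where(connection, "price = ?", (5.0,)))
    # Criterion set to a range of values.
    print(names_where(connection, "price BETWEEN ? AND ?", (5.0, 20.0)))
```

In the Access Criteria box the equivalent entries would be a single value for the first case and a Between ... And ... expression for the second.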
Designing Forms
Forms allow you to control the look and feel of the screen for the input of data and the reports
generated.
Create a Form
Choose a style
Click Next
Generating Reports
Reports are a means to view and analyze large amounts of data. You can use the Report
Wizard or create a custom report that meets your specific needs.
Report Views
Reports can be displayed in four views:
Create a Report
Report Wizard
Choose a style
Click Next
UNIT II: STATISTICAL PACKAGE FOR SOCIAL SCIENCES (SPSS)
2.1 Introduction
SPSS is a comprehensive system for analyzing data. SPSS can take data from almost any type
of file and use them to generate tabulated reports, charts, and plots of distributions and trends,
descriptive statistics, and complex statistical analysis. SPSS makes statistical analysis more
accessible for beginners and more convenient for the experienced user. The Data Editor
offers a simple and efficient spreadsheet-like facility for entering data and browsing the
working data file.
SPSS was originally an acronym for Statistical Package for the Social Sciences. It is one of
the most popular statistical packages and can perform highly complex data manipulation
and analysis with simple instructions.
Launch SPSS either by double-clicking the SPSS icon on the desktop or from the Start menu,
where SPSS has its own group under Programs. The opening screen should then appear.
Open a data file
Before you can analyze data, you need some data to analyze. From the menus choose File,
then Open, then Data. The Open File dialog box is displayed as shown below.
The following data will be displayed in the SPSS data editor
By default, SPSS-format data files (.sav extension) are displayed. You can display
other file formats using the Files of Type drop-down list. By default, data files in the folder
(directory) in which SPSS is installed are displayed. Select a file, then click Open in the
dialog box to open the data file you have selected.
To begin the process of adding data, just click on the first cell that is located in the upper left
corner of the datasheet. It's just like a spreadsheet. You can enter your data as shown. Enter
each datapoint then hit [Enter]. Once you're done with one column of data you can click on
the first cell of the next column.
If you're entering data for the first time, like the above example, the variable names will be
automatically generated (e.g., var00001, var00002,....). They are not very informative. To
change these names, click on the variable name button. For example, double click on the
"var00001" button. Once you have done that, a dialog box will appear. The simplest option is
to change the name to something meaningful. For instance, replace "var00001" in the textbox
with "RT" (see figure below).
In addition to changing the variable name one can make changes specific to [Type], [Labels],
[Missing Values], and [Column Format].
[Type] One can specify whether the data are in numeric or string format, in addition
to a few more formats. The default is numeric format.
[Labels] Using the labels option can enhance the readability of the output. A variable
name is limited to a length of 8 characters; however, a variable label can be as long as
256 characters. This provides the ability to have very
descriptive labels that will appear in the output. Often, there is a need to code
categorical variables in numeric format. For example, male and female can be coded
as 1 and 2, respectively. To reduce confusion, it is recommended that one use value
labels. For the example of gender coding, Value: 1 would have a corresponding Value
Label: male. Similarly, Value: 2 would be coded with Value Label: female. (Click on
the [Labels] button to verify the above.)
[Missing Values] See the accompanying help. This option provides a means to code
for various types of missing values.
[Column Format] The column format dialog provides control over several features
of each column (e.g., width of column).
Once data has been entered or modified, it is advisable to save. In fact, save as often as
possible [File => SaveAs].
SPSS offers a large number of possible formats, including its own. A list of the available
formats can be viewed and selected by clicking Save as type: in the Save As dialog
box. If your intention is to work only in SPSS, then there may be some benefit to saving in
the SPSS (*.sav) format; presumably this format allows for faster reading and writing of the
data file. However, if your data will be analyzed or examined in other packages (e.g., a
spreadsheet), it would be advisable to save in a more universal format (e.g., Excel (*.xls)).
Once the type of file has been selected, enter a filename without the extension (e.g., sav, xls).
You should also save the file in a meaningful directory on your hard drive or floppy disk. That is,
a separate directory should be created for each project, so that your data do not get
mixed up.
Dialog box Help buttons. Most dialog boxes have a Help button that takes you directly to a
Help topic for that dialog box. The Help topic provides general information and links to
related topics.
Pivot table context menu Help. Right-click on terms in an activated pivot table in the
Viewer and select What's This? from the context menu to display definitions of the terms.
Statistics Coach. The Statistics Coach item on the Help menu provides a wizard-like method
for finding the right statistical or charting procedure for what you want to do. This gives a
custom Help topic, based on your selections in the Statistics Coach.
Case Studies. The Case Studies item on the Help menu provides hands-on examples of how
to create various types of statistical analyses and interpret the results. The sample data files
used in the examples are also provided so that you can work through the examples to see
exactly how the results were produced.
Data can be entered directly into SPSS, or it can be imported from a number of different
sources. The processes for reading data stored in SPSS data files, spreadsheet applications
like Microsoft Excel, database applications like Microsoft Access, and text files are all
discussed in this chapter.
SPSS data files, which have a .sav file extension, contain your saved data. To read one, from
the menus choose:
File
Open
Data...
Then select the file and click Open.
The data are now displayed in the Data Editor
Rather than typing all of your data directly into the Data Editor, you can read data from
applications like Microsoft Excel. You can also read column headings as variable names.
From the menus choose:
File
Open
Data...
Select Excel (*.xls) from the Files of Type drop-down list.
Make sure Read variable names from first row of data is selected. This option reads column
headings as variable names.
If the column headings do not conform to the SPSS variable-naming rules, they are converted
into valid variable names and the original column headings are saved as variable labels. If
you want to import only a portion of the spreadsheet, specify the range of cells to be imported
in the Range field.
Click OK to read the Excel file.
The data now appear in the Data Editor, with the column headings used as variable names. If
you're using a spreadsheet application other than Excel or Lotus, you should be able to export
your data to a supported format that can then be read into SPSS.
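The idea of turning column headings into variable names, while keeping the original headings as labels, can be sketched in Python. The renaming rule below is a deliberately simplified assumption (letters, digits and underscores only, starting with a letter); it is not the rule SPSS itself applies, and the function names are hypothetical.

```python
import re


def to_variable_name(heading):
    """Convert a column heading to a name under a simplified, assumed naming rule."""
    # Replace anything that is not a letter, digit or underscore.
    name = re.sub(r"[^A-Za-z0-9_]", "_", heading.strip())
    # Assumed rule: a name must start with a letter.
    if not name or not name[0].isalpha():
        name = "v" + name
    return name


def read_headings(headings):
    """Return the variable names plus labels for headings that had to be changed."""
    names = []
    labels = {}
    for heading in headings:
        name = to_variable_name(heading)
        names.append(name)
        if name != heading:
            # Keep the original heading as the variable label.
            labels[name] = heading
    return names, labels


if __name__ == "__main__":
    print(read_headings(["age", "Marital status", "2023 income"]))
```

The sketch does not handle two headings that convert to the same name, which a real import would also need to resolve.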
Data from database sources are easily imported using the Database Wizard. Any database that
uses ODBC (Open Database Connectivity) drivers can be read directly by SPSS after the
drivers are installed.
Select MS Access Database from the list of data sources and click Next
Click Browse to navigate to the Access database file you want to open.
Open File dialog box
Select demo.mdb and click Open to continue.
Click OK in the login dialog box.
In Step 2, you can specify the tables and variables you want to import.
Drag the entire demo table to the Retrieve Fields in This Order list.
Click Next.
If you do not want to import all cases, you can import a subset of cases (for example, males
older than 30), or you can import a random sample of cases from the data source. For large
data sources, you may want to limit the number of cases to a small, representative sample to
reduce the processing time. The default is to retrieve all cases.
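The two ways of limiting the cases, a condition such as "males older than 30" and a random sample, can be sketched in plain Python. The field names sex and age, the sample records, and the helper names are assumptions for illustration; the Database Wizard performs the equivalent selection through its dialog boxes.

```python
import random


def select_cases(cases, predicate):
    """Keep only the cases for which the condition holds."""
    return [case for case in cases if predicate(case)]


def random_sample(cases, n, seed=0):
    """Draw up to n cases at random; a fixed seed makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(cases, min(n, len(cases)))


if __name__ == "__main__":
    cases = [
        {"sex": "M", "age": 45},
        {"sex": "F", "age": 52},
        {"sex": "M", "age": 22},
        {"sex": "M", "age": 31},
    ]
    # Subset of cases: males older than 30.
    print(select_cases(cases, lambda c: c["sex"] == "M" and c["age"] > 30))
    # Random sample of two cases.
    print(random_sample(cases, 2))
```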
Click Next to continue
Field names are used to create variable names. If necessary, the names are converted to valid
variable names. The original field names are preserved as variable labels.
You can also change the variable names before importing the database.
Click the Value Labels cell in the Gender field. This option converts string variables to
integer variables and retains the original value as the value label for the new variable.
Click Next to continue
The SQL statement created from your selections in the Database Wizard appears in the
Results dialog box. This statement can be executed now or saved to a file for later use.
All of the data in the Access database that you selected to import are now available in the
SPSS Data Editor.
The Data Editor displays the contents of the active data file. The information in the Data
Editor consists of variables and cases.
In Data View, columns represent variables and rows represent cases (observations).
In Variable View, each row is a variable, and each column is an attribute associated with that
variable.
Variables are used to represent the different types of data that you have compiled. A common
analogy is that of a survey. The response to each question on a survey is equivalent to a
variable. Variables come in many different types, including numbers, strings, currency, and
dates.
Data can be entered into the Data Editor, which may be useful for small data files or for
making minor edits to larger data files.
Click the Variable View tab at the bottom of the Data Editor window.
Define the variables that are going to be used. In this case, only three variables are needed:
age, marital status, and income.
In the first row of the first column, type age.
In the second row, type marital.
In the third row, type income.
Non-numeric data, such as strings of text, can also be entered into the Data Editor.
Click the Variable View tab at the bottom of the Data Editor window.
In the first cell of the first empty row, type sex for the variable name
Select String to specify the variable type.
Defining Data
In addition to defining data types, you can also define descriptive variable and value labels
for variable names and data values. These descriptive labels are used in statistical reports and
charts.
Labels are meant to provide descriptions of variables. These descriptions are often longer
versions of variable names. Labels can be up to 256 characters long. These labels are used in
your output to identify the different variables.
Click the Variable View tab at the bottom of the Data Editor window.
In the Label column of the age row, type Respondent's Age.
In the Label column of the marital row, type Marital Status.
In the Label column of the income row, type Household Income.
Changing Variable Type and Format
The Type column displays the current data type for each variable. The most common are
numeric and string, but many other formats are supported. In the current data file, the income
variable is defined as a numeric type.
Click the Type cell for the income row, and then click the button to open the Variable Type
dialog box.
The formatting options for the currently selected data type are displayed.
Select the format of this currency. For this example, select $###,###,###.
Click OK to save your changes
Adding Value Labels for Numeric Variables
Value labels provide a method for mapping your variable values to a string label. In the case
of this example, there are two acceptable values for the marital variable.
A value of 0 means that the subject is single, and a value of 1 means that he or she is married.
Click the Values cell for the marital row, and then click the button to open the Value Labels
dialog box.
The value is the actual numeric value.
The value label is the string label applied to the specified numeric value.
Type 0 in the Value field.
Type Single in the Value Label field.
Repeat the process, this time typing 1 in the Value field and Married in the Value Label field.
Click Add, and then click OK to save your changes and return to the Data Editor.
These labels can also be displayed in Data View, which can help to make your data more
readable.
Click the Data View tab at the bottom of the Data Editor window.
From the menus choose:
View
Value Labels
The labels are now displayed in a list when you enter values in the Data Editor. This has the
benefit of suggesting a valid response and providing a more descriptive answer
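Conceptually, a set of value labels is a lookup from stored codes to descriptive text. The following minimal Python sketch shows that idea for the marital variable; the names MARITAL_LABELS and apply_value_labels are hypothetical, and this is not how SPSS stores labels internally.

```python
# Value labels for the marital variable: stored code -> descriptive label.
MARITAL_LABELS = {0: "Single", 1: "Married"}


def apply_value_labels(values, labels):
    """Show the label for each stored value; unlabeled values are shown unchanged."""
    return [labels.get(value, value) for value in values]


if __name__ == "__main__":
    stored = [0, 1, 1, 0]
    print(apply_value_labels(stored, MARITAL_LABELS))
```

The stored data stay numeric; only the display changes, which is what switching Value Labels on in Data View does.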
Adding Value Labels for String Variables
String variables may require value labels as well. For example, your data may use single
letters, M or F, to identify the sex of the subject. Value labels can be used to specify that M
stands for Male and F stands for Female.
Click the Variable View tab at the bottom of the Data Editor window.
Click the Values cell in the sex row, and then click the button to open the Value Labels dialog
box.
Type F in the Value field, and then type Female in the Value Label field.
Click Add to add this label to your data file.
Repeat the process, this time typing M in the Value field and Male in the Value Label field.
Click Add, and then click OK to save your changes and return to the Data Editor.
Because string values are case sensitive, you should make sure that you are consistent. A
lowercase m is not the same as an uppercase M.
You can then use the drop-down list to enter the data values.
Missing or invalid data are generally too common to ignore. Survey respondents may refuse
to answer certain questions, may not know the answer, or may answer in an unexpected
format. If you don't take steps to filter or identify these data, your analysis may not provide
accurate results.
For numeric data, empty data fields or fields containing invalid entries are handled by
converting the fields to system missing, which is identifiable by a single period. The reason a
value is missing may be important to your analysis. For example, you may find it useful to
distinguish between those who refused to answer a question and those who didn't answer a
question because it was not applicable.
Click the Variable View tab at the bottom of the Data Editor window.
Click the Missing cell in the age row, and then click the button to open the Missing Values
dialog box.
In this dialog box, you can specify up to three distinct missing values, or a range of values
plus one additional discrete value.
Select Discrete missing values.
Type 999 in the first text box and leave the other two empty.
Click OK to save your changes and return to the Data Editor.
Now that the missing data value has been added, a label can be applied to that value.
Click the Values cell in the age row, and then click the button to open the Value Labels
dialog box.
Type 999 in the Value field.
Type No Response in the Value Label field.
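To see why declaring 999 as a missing value matters, consider what happens to an average when the code is left in. The sketch below is plain Python, with None standing in for system-missing and 999 for the user-defined missing code; the function names and the sample ages are assumptions.

```python
from statistics import mean


def valid_values(values, user_missing=(999,)):
    """Drop system-missing entries (None) and user-defined missing codes."""
    return [v for v in values if v is not None and v not in user_missing]


def mean_age(values):
    """Average the valid ages only; return None when nothing valid remains."""
    valid = valid_values(values)
    return mean(valid) if valid else None


if __name__ == "__main__":
    ages = [25, 40, 999, None, 31]
    print(mean_age(ages))
```

Without the exclusion, the 999 code would be averaged in as if it were a real age and would badly inflate the result.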
Missing values for string variables are handled similarly to those for numeric values.
Unlike numeric values, empty fields in string variables are not designated as system missing.
Rather, they are interpreted as an empty string.
Click the Variable View tab at the bottom of the Data Editor window.
Click the Missing cell in the sex row, and then click the button to open the Missing Values
dialog box.
Select Discrete missing values.
Type NR in the first text box.
Missing values for string variables are case sensitive. So, a value of nr is not treated as a
missing value.
Click OK to save your changes and return to the Data Editor.
Now you can add a label for the missing value.
Click the Values cell in the sex row, and then click the button to open the Value Labels dialog
box.
Type NR in the Value field
Click Add to add this label to your project.
Click OK to save your changes and return to the Data Editor.
Once you've defined variable attributes for a variable, you can copy these attributes and apply
them to other variables.
In Variable View, type agewed in the first cell of the first empty row.
From the menus choose:
Edit
Paste
The defined values from the age variable are now applied to the agewed variable.
You can also copy all of the attributes from one variable to another.
All of the attributes of the marital variable are applied to the new variable.
The Analyze menu contains a list of general reporting and statistical analysis categories. Most
of the categories are followed by an arrow, which indicates that there are several analysis
procedures available within the category; they will appear on a submenu when the category is
selected.
We'll start with a simple frequency table (table of counts).
From the menus choose:
Analyze
Descriptive Statistics
Frequencies...
A more complete description of each variable pops up when the cursor is over it. The variable
name (in square brackets) is inccat, and it has the variable label Income category. If there
were no variable label, only the variable name would appear in the list box.
In the dialog box, you choose the variables you want to analyze from the source list on the
left and move them into the Variable(s) list on the right. The OK button, which runs the
analysis, is disabled until at least one variable is placed in the Variable(s) list.
Additional labeling information can be easily obtained for any variable on the list by clicking
on the variable name with the right mouse button.
Click the right mouse button on Income Category [inccat], and then click (left mouse button)
Variable Information. All of the defined value labels for the variable are displayed.
A pound sign (#) icon next to the variable name indicates that the variable is numeric.
An icon with the letter “A” indicates that the variable is a string (alphanumeric) variable,
which may contain both letters and numbers. A less-than sign (left angle bracket) indicates
that the variable is a short string, containing eight or fewer characters. Move the Income
category and Gender variables into the Variable(s) list on the right.
Click OK to run the procedure.
Creating Charts
Although some statistical procedures can create high-resolution charts, you can also use the
Graphs menu to create charts.
Scroll down the source variable list and select wireless as the Category Axis variable.
The bar chart is displayed in the Viewer. It shows that people with wireless phone service are
far more likely to have PDAs than people without wireless service.
You can edit charts and tables by double-clicking on them in the contents pane of the Viewer
window, and you can copy and paste your results into other applications.
Exiting SPSS
To exit SPSS:
From the menus choose:
File
Exit
Click No if you get an alert asking if you want to save your results.
Different summary measures are appropriate for different types of data, depending on the
level of measurement:
Categorical. Data with a limited number of distinct values or categories (for example, gender
or marital status). Also referred to as qualitative data. Categorical variables can be string
(alphanumeric) data or numeric variables that use numeric codes to represent categories (for
example, 0 = Unmarried and 1 = Married). There are two basic types of categorical data:
Nominal. Categorical data where there is no inherent order to the categories. For example, a
job category of “sales” isn't higher or lower than a job category of “marketing” or “research.”
Another example is marital status: single, married, divorced, separated, and so on.
Ordinal. Categorical data where there is a meaningful order of categories, but there isn't a
measurable distance between categories. For example, there is an order to the values high,
medium, and low, but the “distance” between the values can't be calculated. Another example
is the level of satisfaction with your enterprise: not satisfied, less satisfied, satisfied,
highly satisfied.
Scale. Data measured on an interval or ratio scale, where the data values indicate both the
order of values and the distance between values. For example, a salary of $72,195 is higher
than a salary of $52,398, and the distance between the two values is $19,797. Also referred to
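as quantitative or continuous data. Another example is monthly income recorded as an exact
amount. If income is collected only in bands (500 - 4500, 4500 - 8500, 8500 - 12500, 12500+),
the resulting variable is ordinal rather than scale.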
For ordinal data, the median (the value above and below which half the cases fall) may also
be a useful summary measure if there is a large number of categories.
The Frequencies procedure produces frequency tables that display both the number and
percentage of cases for each observed value of a variable.
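As a rough illustration of what a frequency table contains, the Python sketch below counts each distinct value and converts the counts to percentages. The responses and the helper name frequency_table are hypothetical, and the code is not how SPSS itself computes the table.

```python
from collections import Counter

# Hypothetical responses to "Do you own a PDA?"
owns_pda = ["No", "No", "Yes", "No", "No", "Yes", "No", "No", "No", "No"]


def frequency_table(values):
    """Return {value: (count, percent of all cases)} for each observed value."""
    counts = Counter(values)
    total = len(values)
    return {value: (n, 100.0 * n / total) for value, n in counts.items()}


table = frequency_table(owns_pda)

# Print one row per observed value: value, count, percent.
for value, (n, pct) in sorted(table.items()):
    print(f"{value:<5}{n:>5}{pct:>8.1f}%")
```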
Frequency tables
The frequency tables are displayed in the Viewer window. The frequency tables reveal that
only about 21% of the people own PDAs, but almost everybody owns a TV (99.2%). This
might not be an interesting revelation, although it might be interesting to find out more about
the small group of people who do not own televisions.
You can graphically display the information in a frequency table with a bar chart or pie chart.
Open the Frequencies dialog box again. (The two variables should still be selected.)
Click Charts.
Select Bar charts and then click Continue.
In addition to the frequency tables, the same information is now displayed in the form of bar
charts, making it easy to see that most people do not own PDAs but almost everyone owns a
TV.
There are many summary measures available for scale variables, including:
Measures of central tendency. The most common measures of central tendency are the
mean (arithmetic average) and median (value above and below which half the cases fall).
For example, suppose a survey of annual potato earnings among Ruhengeri farmers gives the
following observations: 2000frw, 2500frw, 1500frw, 3000frw, 1000frw.
The mean (average) = (2000+2500+1500+3000+1000)/5 = 2000frw.
So the mean of 2000frw summarizes the typical earning of a farmer in this sample.
Note1:
To find the median, first sort the observations. If the number of observations is odd, as in
the example above, the median is the middle value (here 2000frw); if the number of
observations is even, the median is the average of the two middle values.
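The mean and median of the farmers' earnings can be checked with a few lines of Python. This is a minimal sketch using the standard statistics module; the sixth observation added for the even case (3500frw) is hypothetical and is not part of the survey data.

```python
from statistics import mean, median

# Annual potato earnings in frw, taken from the example.
earnings = [2000, 2500, 1500, 3000, 1000]

mean_earning = mean(earnings)  # (2000+2500+1500+3000+1000)/5
median_odd = median(earnings)  # middle value of the sorted list

# Even number of observations: add one hypothetical farmer earning 3500frw.
earnings_even = earnings + [3500]
median_even = median(earnings_even)  # average of the two middle sorted values

print("sorted:", sorted(earnings))
print("mean:", mean_earning, "median (odd n):", median_odd)
print("sorted:", sorted(earnings_even))
print("median (even n):", median_even)
```

With six observations the two middle values of the sorted list are 2000 and 2500, so the median falls halfway between them.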
Note2:
The mean is calculated from every individual observation, so it reflects all of the data, but
it can be pulled up or down by a few extreme values. The median depends only on the middle
of the sorted observations, so it ignores the size of the other values and is less affected by
extremes. The mean is usually the better summary when the values are roughly symmetric; the
median is often preferred when they are skewed.
Measures of dispersion. Statistics that measure the amount of variation or spread in the data
include the standard deviation, minimum, and maximum.
Standard deviation is a measure of how far the individual observations are dispersed around
the mean.
Note: The lower the standard deviation, the lower the dispersion, and vice versa. If the
standard deviation is small compared to the mean, the farmers’ earnings are almost the
same; if it is large, there are big differences between farmers’ earnings.
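To make the note concrete, the sketch below compares the spread of the farmers' earnings with a second, hypothetical group whose earnings are close together. It uses the sample standard deviation (dividing by n - 1) from Python's statistics module; treating this as the same figure a statistics package would report is an assumption.

```python
from statistics import mean, stdev

# Earnings in frw from the example, plus a hypothetical group with similar earnings.
earnings = [2000, 2500, 1500, 3000, 1000]
similar = [1900, 2000, 2100, 2050, 1950]  # hypothetical, same mean of 2000

sd_earnings = stdev(earnings)  # sample standard deviation (n - 1 in the denominator)
sd_similar = stdev(similar)

for label, data, sd in (("example", earnings, sd_earnings),
                        ("similar", similar, sd_similar)):
    print(f"{label}: mean={mean(data)} sd={sd:.1f} "
          f"min={min(data)} max={max(data)}")
```

Both groups have the same mean, but the second has a much smaller standard deviation, which is what "earnings are almost the same" looks like in numbers.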
Click Statistics.
Select Mean, Median, Std. deviation, Minimum, and Maximum.
Frequencies Statistics dialog box
Click Continue.
Deselect Display frequency tables in the main dialog box. (Frequency tables are usually not
useful for scale variables since there may be almost as many distinct values as there are cases
in the data file.)
Click OK to run the procedure.
The Frequencies Statistics table is displayed in the Viewer window.
In this example, there is a large difference between the mean and the median, with the mean
being more than 25,000 greater than the median. This indicates that the values are not
normally distributed.
3.3 Cross-tabulation tables
Cross-tabulation tables (contingency tables) display the relationship between two or more
categorical (nominal or ordinal) variables. The size of the table is determined by the number
of distinct values for each variable, with each cell in the table representing a unique
combination of values. Numerous statistical tests are available to determine whether there is a
relationship between the variables in a table.
A simple cross-tabulation
What factors affect the products that people buy? The most obvious is probably how much
money people have to spend. In this example, we'll examine the relationship between income
level and PDA (personal digital assistant) ownership.
From the menus choose:
Analyze
Descriptive Statistics
Crosstabs...
The cells of the table show the count or number of cases for each joint combination of values.
For example, 455 people in the income range $25,000 - $49,000 own PDAs, as shown in the
table below.
None of the numbers in this table, however, stands out or points to an obvious relationship
between the variables.
Click Continue and then click OK in the main dialog box to run the procedure.
A clearer picture now starts to emerge. The percentage of people who own PDAs rises as the
income category rises (see the figure below).
The purpose of a cross-tabulation is to show the relationship (or lack thereof) between two
variables.
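The difference between raw counts and row percentages can be sketched in Python. The cases below are hypothetical and chosen only to show the idea; they are not the figures from demo.sav, and the category names low, middle and high are placeholders.

```python
from collections import Counter

# Hypothetical (income category, owns PDA) pairs, one per respondent.
cases = (
    [("low", "No")] * 9 + [("low", "Yes")] * 1
    + [("middle", "No")] * 16 + [("middle", "Yes")] * 4
    + [("high", "No")] * 14 + [("high", "Yes")] * 6
)

cell_counts = Counter(cases)                       # one count per table cell
row_totals = Counter(income for income, _ in cases)  # cases per income category


def row_percent(income, answer):
    """Percentage of the income category's cases that gave this answer."""
    return 100.0 * cell_counts[(income, answer)] / row_totals[income]


for income in ("low", "middle", "high"):
    print(f"{income:<8} owners: {cell_counts[(income, 'Yes')]:>3} "
          f"({row_percent(income, 'Yes'):.1f}% of category)")
```

The raw counts alone are hard to compare because the categories have different sizes; dividing each cell by its row total makes the rising share of owners visible.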
Your results can be used in many applications. For example, you may want to include a chart
or graph in a presentation or report. Applications such as Microsoft's PowerPoint or Word
can display your results as plain text, rich text, or as a metafile, which is a graphical
representation of the output.
The following examples are specific to Microsoft Word, but they may work similarly in other
word processing applications.
You can paste pivot tables into Word as native Word tables. Text formatting, such as font
size and color, is not retained, but columns and rows are properly aligned.
Because the table is in a text format, the data can be edited after you paste it into your
document.
Click the Marital status table in the Viewer.
From the menus choose:
Edit
Copy
Open your word processing application.
From the word processor's menus choose:
Edit
Paste Special
Select Formatted Text (RTF) in the Paste Special dialog box.
Click OK to paste your results into the current document.
The table is now displayed in your document. You can apply custom formatting, edit the data,
and resize the table to fit your needs.
3.3.1 Creating and Editing Charts
Creating charts
In this example, we'll create a simple pie chart that shows how many respondents have
Internet service at home.
From the menus choose:
Graphs
Pie...
Since we want to base the chart on a single variable, we selected Summaries for groups of
cases. Chart elements (bars, pie slices) can also be based on summaries of separate variables
or values from individual cases in the data file.
Select Internet as the variable that defines slices (Define Slices by).
Define Pie dialog box
When charts are created, they do not show the missing category by default. You want to
display this category to make sure that the number of cases with missing values is not
excessive.
Click Options.
Select Display groups defined by missing values, and then click Continue.
Click OK in the Define Pie dialog box to create the pie chart.
The pie chart reveals that most respondents do not have Internet service at home. From
the chart, it appears that only about a quarter of the respondents have Internet service.
You can edit charts in a variety of ways. For the sample pie chart we created, we will:
Add a title.
Remove the small category of missing data.
Display percentages for the two remaining categories in the chart.
The first thing we'll do is add a title.
Double-click the pie chart to open it in the Chart Editor.
Select the pie chart.
From the Chart Editor menus choose:
Edit
Properties
In the Properties window, click the Categories tab. Move Missing from the Order list to the
Excluded list (select it and click X).
Click Apply.
The pie chart clearly shows that most respondents do not have Internet service at home, and it
looks like almost three-quarters of the respondents are in the No category. However, it might
be useful to see the exact percentages.
Select the pie chart.
From the menus choose:
Elements
Show data labels
Now the pie chart displays the counts as labels. We can also add the percentages.
On the Data Value Labels tab, move percent from ‘not displayed’ to ‘displayed’.
Click Apply.
3.3.2 Making analysis with the help of syntax
SPSS provides a powerful command language that allows you to save and automate many
common tasks. Most commands are accessible from the menus and dialog boxes. However,
some commands and options are available only by using the command language. The
command language also allows you to save your jobs in a syntax file so that you can repeat
your analysis at a later date or run it in an automated job with the Production Facility. A
command syntax file is simply a text file that contains SPSS commands. You can open a
syntax window and type commands directly, but it is often easier to let the dialog boxes do
some or all of the work for you.
The easiest way to create syntax is to use the Paste button located on most dialog boxes as
shown in the figure below.
Procedures to create syntax
► Click the arrow button to move the variable to the Variable(s) list.
► Click Charts.
► Click Paste to copy the syntax created as a result of the dialog box selections to the
Syntax Editor.
You can use this syntax alone as shown in the table below, add it to a larger syntax file, or
refer to it in a Production Facility job. Then save the syntax for future use, and use the Run
menu in the syntax window to run the commands.
Data files are not always organized in the ideal form for your specific needs. To prepare data
for analysis, you can select from a wide range of file transformations, including the ability to:
Sort data. You can sort cases based on the value of one or more variables.
Select subsets of cases. You can restrict your analysis to a subset of cases or perform
simultaneous analysis on different subsets.
The examples in this chapter use the data file demo.sav.
Sorting Data
Sorting cases (sorting rows of the data file) is often useful and sometimes necessary for
certain types of analysis.
To reorder the sequence of cases in the data file based on the value of one or more sorting
variables:
From the menus choose:
Data
Sort Cases...
This opens the Sort Cases dialog box.
Sort Cases dialog box
Add the Age in years (age) and Household income in thousands (income) variables to the
Sort By list.
If you select multiple sort variables, the order in which they appear on the Sort By list
determines the order in which cases are sorted. In this example, based on the entries in the
Sort By list, cases will be sorted by the value of Household income in thousands (income)
within categories of Age in years (age). For string variables, uppercase letters precede their
lowercase counterparts in sort order (for example, the string value Yes comes before yes in
the sort order).
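The idea of sorting by one variable within categories of another can be sketched in Python by sorting on a pair of keys. The four cases below are hypothetical. The string comparison shown is Python's default ordering, which happens to place Yes before yes as described above, but it should not be assumed to match SPSS's sort order in every other respect.

```python
# Hypothetical cases: age in years and household income in thousands.
cases = [
    {"age": 30, "income": 50},
    {"age": 25, "income": 80},
    {"age": 30, "income": 20},
    {"age": 25, "income": 40},
]

# Sort by age first, then by income within each age (order of keys matters).
sorted_cases = sorted(cases, key=lambda c: (c["age"], c["income"]))

# Reversing the key order sorts by income first, then by age.
sorted_by_income = sorted(cases, key=lambda c: (c["income"], c["age"]))

# String values: the uppercase form sorts before its lowercase counterpart.
sorted_answers = sorted(["yes", "Yes"])

for case in sorted_cases:
    print(case["age"], case["income"])
print(sorted_answers)
```

Swapping the order of the two keys changes the result, which is the same effect as changing the order of variables on the Sort By list.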
Using SPSS 14.0 software, create the required variables, capture the data from the following
survey on meal preferences, and run some analyses.
Questionnaire form
10   Beans in how many meals per week   0 for never, 1 for 1 to 4 times, 2 for 5 to 7 times, 3 for 8 to 14 times, 4 for more than 14 times   4
2. Draw a pie chart showing the different levels of education, with percentages in the chart.
II. Using the SPSS file cereal.sav, located in the tutorial folder, sample_files subfolder:
1. Run a frequency analysis on ‘age category’ and answer the following questions:
2. Run a crosstabs analysis of ‘age category’ against ‘preferred breakfast’ and answer
III. Using the SPSS file ‘patient_los’, located in the tutorial folder, sample_files subfolder:
VI. Using the SPSS file ‘1991 US general social survey’, located in the SPSSEval folder:
1. Run a frequency analysis on ‘Region of the US’ and answer the following questions: