
Econ 115a - Econometrics

Module 1: Introduction to Statistics and SPSS



Lesson 2: Getting to Know SPSS

Learning objectives:
- Become familiar with the features of SPSS
- Identify the different SPSS menus, windows, and tabs
- Install SPSS on a computer

Outline
2.1 Why SPSS?
2.2 What is SPSS?
2.3 SPSS Add-on Products
2.4 SPSS Toolbars and Shortcuts
2.5 SPSS Windows
2.6 Data and Variable View
2.7 Output Window
2.8 Opening/Importing Data
2.9 SPSS File Formats
2.10 Installing SPSS

2.1 Why SPSS?



Statistical Software available in the market


1. Open source
- free to access, use, distribute, and modify

2. Paid
- requires payment for a license or subscription
- available as monthly, yearly, or perpetual licenses
- some offer trial versions

Statistical Software available in the market:


1. Open source
R, gretl, JASP, Python,
jamovi, SOFA, GNU PSPP,
Scilab, etc.

2. Paid
SPSS, Stata, SAS,
MATLAB, Minitab, Tableau,
SHAZAM, Analytica, Microsoft Excel, etc.

Why SPSS?
- one of the most widely used statistical software packages (Masuadi et al., 2021)
- leading statistical analysis software package for the social sciences (Ozgur et al., 2014)
- easy to use (point and click)
- user-friendly graphical user interface (GUI)

[Figure: Commonly used statistical software, 1997-2017 (Masuadi et al., 2021)]

[Figure: Jobs requiring various software (Muenchen, 2014)]

Current trends
Nowadays, the most widely used statistical software packages are open source, largely
because of cost.

However, open-source tools usually take longer to learn because most of them are
command based and require coding (programming) skills.

Beginners in statistics or data science often choose user-friendly software to get started.

Why not Excel?


Excel can also be a good alternative.

However, it has limited statistical functions and methods.

2.2 What is SPSS?



Statistical Package for the Social Sciences (SPSS)

- a product of International Business Machines (IBM)
- developed for analyzing data and solving statistical problems
- includes both basic and advanced descriptive and inferential statistics
- user-friendly graphical user interface (GUI): point and click
- other features: graphical outputs, syntax, journal logging,
structural equation modeling, text analytics, etc.

Pros and Cons

Software: R
Pros:
- Free open-source software
- Strong online user community
- Programmable with more functions for data analysis
Cons:
- Steep learning curve
- Can be slow

Software: Stata
Pros:
- User friendly and easy to learn
- Version control
- Many free online resources for learning
Cons:
- Individual license can cost between $125 and $425 annually
- Limited to certain types of data
- Cannot program new functions

Software: SPSS
Pros:
- Quick and easy to learn
- Can handle large amounts of data
- Great user interface
- Rapid development
Cons:
- Expensive
- Limited functionality, needs add-ons

History of SPSS
1968-1975: "SPSS becomes a product." The technology was first developed and
grew on its own as an academic enterprise. The SPSS founders (from Stanford
University), Norman H. Nie, C. Hadlai Hull, and Dale H. Bent, distributed tapes of source
code to a small but enthusiastic user community, while maintenance and
enhancement were done by the original authors.

1975-1984: "SPSS becomes a corporation." The Company was separately incorporated
when its revenues threatened the non-profit status of its original hosting institution,
the National Opinion Research Center at the University of Chicago.

During this start-up phase, the business was organized, and a number of development
initiatives were undertaken.

1984-1992: "The age of the PC," with the Company growing from $18m to $38m on
the strength of the market-leading statistical analysis system for PC DOS. SPSS was
the first to market with a statistical software product on PC DOS.

1992-1996: "The age of Windows," with the Company shipping the first Windows
version of a statistical software package in 1992. This version drove revenues to $84m
by 1996. The business was focused on statistical products, and the acquisition strategy
complemented this direction by bringing in other statistical products companies, such
as SYSTAT (1994) and Jandel (1996).

1997-2002: "The transition to the enterprise." This period has been the age of growth
by acquisition and the rise of analytic applications as a complement to the core
statistical products business. The Company grew from $110m in 1997 to a projected
$209m in 2002.

2003: Predictive analytics is successfully established as a market segment. SPSS
played a thought-leadership role in the emergence during 2003 of predictive analytics
as an important, distinct segment within the broader business intelligence software
sector.

Predictive analytics complements and enhances other information technologies. SPSS
saw a growing awareness of these benefits among the commercial, public sector, and
academic organizations it serves.

2004: Predictive analytic applications come of age. In 2004, SPSS accelerated the
introduction of predictive analytics applications, leveraging skills and integrating
technologies.

2009: IBM acquired SPSS; it is now fully integrated into the IBM Corporation Business
Analytics Software portfolio.

Version history of SPSS


1968: SPSS 6, 7, 8, 9
1983: SPSSX (“X” standing for version 10), so named because of major revisions. The “X” was
later dropped, and versions continued until version 15.
2008: SPSS 16 and 17. “Statistics” was also added to the name, making the product
known as SPSS Statistics.
2009: Versions 17.02 and 18. The name was changed to Predictive Analytics Software (PASW)
when IBM acquired SPSS Inc. However, the name was disputed until the matter was
finally settled in court.

2010: IBM SPSS Statistics became the official name, and the product is part of IBM’s analytics portfolio.

New versions have followed nearly every year.

2017: version 25
2021: version 28

SPSS version used for this course


IBM SPSS Statistics Base
- contains the core capabilities you need to take the analytical process from start to
finish.
Version: 23

NOTE: The functions and features covered in this course are limited to what this
version offers.

2.3 SPSS Add-on Products


[Figure: The SPSS Graphical User Interface (GUI)]

SPSS Add-on Products


1. IBM SPSS Statistics Server
Offers all the features of IBM® SPSS® Statistics but with faster performance.
Processing is centralized on the server, so there is no need to transfer data over the
network.

2. IBM SPSS Modeler
Enables more rapid predictive modeling, fast performance on large data volumes, and
flexible deployment (including distributed and real-time scoring) to improve decision-making.

3. IBM SPSS Text Analytics for Surveys
Categorizes and analyzes the text of open-ended survey responses.

4. IBM SPSS Data Collection
Includes a set of tools that support the authoring, interviewing, reporting, and
management of complex surveys.

5. IBM SPSS Collaboration and Deployment Services
Foundation for managing and deploying analytics. Enables collaboration and
automation of developed analytical processes throughout the enterprise.

6. IBM SPSS Amos
Easy-to-use structural equation modeling (SEM) that tests relationships between
observed and unobserved variables to quickly test hypotheses and confirm
relationships.

7. IBM SPSS SamplePower
Find the right sample size for your research in minutes and test the possible results
before you begin your study.

NOTE: The add-ons listed here are based on SPSS version 23. Later versions of SPSS
may offer other add-ons.

2.4 SPSS Toolbars and Shortcuts


[Figure: The SPSS menu bar, toolbars, and shortcuts, with callouts labeled “Menu Bar” and “Toolbars & Shortcuts”]

2.5 SPSS Windows



Three (3) main windows of SPSS


1. Data Editor Window
- a spreadsheet-like (e.g., Excel) system for defining, entering, editing, and displaying
data.

2. Output Viewer Window
- a results and log window. All output and errors are displayed in this window.

3. Syntax Editor Window
- a text editor for composing syntax (commands).

[Figure: The Data Editor window]

[Figure: The Data Editor window with sample data]

[Figure: The Syntax Editor window]

[Figure: The Syntax Editor window with sample commands]

[Figure: The Output Viewer window]

[Figure: The Output Viewer window with sample results]

2.6 Data and Variable View



Two (2) Data Editor tabs when working with data:


1. Data View
The default tab when SPSS opens. It displays the open data set: variables appear in
columns, and cases appear in rows.
2. Variable View
Displays information about the variables in the open data set (but not the data themselves),
such as variable names, types, and labels.

NOTE: The tab that is currently displayed is highlighted in yellow.

Data View

Each row may be called a case, an observation, or a record.
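The Data View layout (cases in rows, variables in columns) can be pictured with a small Python sketch. The data and the helper name `column` are made up for illustration; this is not how SPSS stores data internally.

```python
# Hypothetical survey data: one dict per case (row), one key per variable (column).
dataset = [
    {"id": 1, "age": 21, "sex": "F"},
    {"id": 2, "age": 34, "sex": "M"},
    {"id": 3, "age": 28, "sex": "F"},
]

def column(rows, variable):
    """Collect one variable's values across all cases."""
    return [case[variable] for case in rows]

print("cases:", len(dataset))              # number of rows
print("variables:", list(dataset[0]))      # column names
print("age column:", column(dataset, "age"))
```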

Variable View

Columns are variable properties.
Rows are variables.

Each variable property has its own definition and rules to follow.

Each variable represents a single question in a questionnaire (if the variables are
based on a survey questionnaire).

Variable properties
1. Name
- Enter a unique name in this column for each variable. This name will appear at the
top of the corresponding column in the Data View and helps you identify variables
there.
- Do not use spaces or symbols that have other uses in SPSS, such as +, -, $, and &.

Not following the variable naming rules will produce an error!
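The naming restrictions on this slide can be checked before typing names into SPSS. The following is a minimal Python sketch that assumes only the restrictions listed here (unique names; no +, -, $, &, or spaces). The function name `check_variable_names` is hypothetical, and SPSS may enforce further rules that are not covered.

```python
# Characters this lesson says to avoid in variable names.
FORBIDDEN = set("+-$& ")

def check_variable_names(names):
    """Return (name, problem) pairs for names that break the rules on this slide."""
    problems = []
    seen = set()
    for name in names:
        bad = sorted(ch for ch in set(name) if ch in FORBIDDEN)
        if bad:
            problems.append((name, "contains forbidden characters: " + repr(bad)))
        key = name.lower()  # assumption: names differing only by case count as duplicates
        if key in seen:
            problems.append((name, "duplicate name"))
        seen.add(key)
    return problems

for name, problem in check_variable_names(["age", "income$", "first name", "Age"]):
    print(name, "->", problem)
```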

2. Type
This specifies the type of data that the variable will have.

2.1 Numeric – data values that are numbers, can be sorted numerically or entered
into arithmetic calculations

2.2 Comma - Numeric variables that include commas that delimit every three places
(to the left of the decimals) and use a period to delimit decimals. SPSS will recognize
these values as numeric even if they contain commas or use scientific notation.

2.3 Dot - Numeric variables that include periods that delimit every three places and
use a comma to delimit decimals. SPSS will recognize these values as numeric even if
they contain periods or use scientific notation.

2.4 Scientific notation – Numeric variables whose values are displayed with an E and
power-of-ten exponent. Exponents can be preceded by either an E or a D, with or
without a sign, or only with a sign (no E or D). SPSS will recognize these values as
numeric, with or without an exponent.

2.5 Date - Numeric variables that are displayed in any standard calendar date or
clock-time formats. Standard formats may include commas, blank spaces, hyphens,
periods, or slashes as space delimiters.

2.6 Dollar - Numeric variables that contain a dollar sign (i.e., $) before numbers.
Commas may be used to delimit every three places, and a period can be used to
delimit decimals.

2.7 Custom Currency – Numeric variables that are displayed in a custom currency
format. Custom currency characters are displayed in the Data Editor but cannot be
used during data entry.

2.8 String - also called alphanumeric variables or character variables – have values
that are treated as text. This means that the values of string variables may include
numbers, letters, or symbols.

In the Data View tab, missing string values will appear as blank cells.

2.9 Restricted Number - Numeric variables whose values are restricted to non-negative
integers (in standard format or scientific notation). The values are displayed
with leading zeroes padded to the maximum width of the variable.
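The difference between the Comma and Dot types described above is only in which character groups thousands and which marks decimals. A minimal Python sketch of that idea follows; the function names are hypothetical and this is not SPSS's own parser.

```python
def parse_comma_format(text):
    """Comma type: commas group thousands, a period marks decimals (e.g. 1,234.56)."""
    return float(text.replace(",", ""))

def parse_dot_format(text):
    """Dot type: periods group thousands, a comma marks decimals (e.g. 1.234,56)."""
    return float(text.replace(".", "").replace(",", "."))

# Both strings describe the same number, written in the two conventions.
print(parse_comma_format("1,234.56"))
print(parse_dot_format("1.234,56"))
```

Either way the stored value is an ordinary number; only the way it is written differs.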

3. Width
The number of digits displayed for numerical values or the length of a string variable.

4. Decimals
The number of digits to display after a decimal point for values of that variable. Does
not apply to string variables.

Note that this changes how the numbers are displayed but does not change the
values in the dataset.
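The note on Decimals (display changes, stored value does not) can be illustrated with ordinary number formatting. This is a small Python sketch; `display_value` is a hypothetical helper, not an SPSS function.

```python
def display_value(value, decimals):
    """Format a number for display without touching the stored value."""
    return f"{value:.{decimals}f}"

stored = 3.14159                      # value held in the dataset
print(display_value(stored, 2))       # what a cell with Decimals = 2 would show
print(stored)                         # the stored value is unchanged
```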

5. Labels
A brief but descriptive definition or display name for the variable. You may also put
here the actual question if you’re using questionnaires. When defined, a variable's
label will appear in the output in place of its name.

6. Values
For coded categorical variables, the value label(s) that should be associated with each
category abbreviation. It is useful primarily for categorical (i.e., nominal or ordinal)
variables, especially if they have been recorded as codes (e.g., 1, 2, 3). It is strongly
suggested that you give each value a label so that you (and anyone looking at your
data or results) understand what each value represents.
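Value labels work like a lookup table from codes to descriptions. The sketch below shows the idea in Python with made-up codes and labels; `apply_value_labels` is a hypothetical helper.

```python
# Hypothetical coding scheme for a five-point agreement question.
value_labels = {
    1: "Strongly disagree",
    2: "Disagree",
    3: "Neutral",
    4: "Agree",
    5: "Strongly agree",
}

def apply_value_labels(codes, labels):
    """Replace each code with its label; codes without a label are shown as-is."""
    return [labels.get(code, str(code)) for code in codes]

responses = [4, 5, 3, 4, 1]
print(apply_value_labels(responses, value_labels))
```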

7. Missing
User-defined data values (or ranges of values) that should be treated as missing. Note that
this property does not alter or eliminate SPSS's default missing value code for numeric
variables (".").

This column merely allows the user to specify up to three unique missing value codes
for the given variable, or to specify a range of numbers to treat as missing, plus one
additional unique missing value code.
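The two options described for the Missing property (up to three discrete codes, or a range plus one discrete code) can be sketched in Python as follows. The function names and the sample codes 98 and 99 are assumptions for illustration, and `None` stands in for the default missing value (".").

```python
def is_user_missing(value, codes=(), value_range=None):
    """True if value is one of the user-defined missing codes or falls in the range."""
    max_codes = 1 if value_range is not None else 3
    if len(codes) > max_codes:
        raise ValueError(f"at most {max_codes} discrete missing code(s) allowed here")
    if value in codes:
        return True
    if value_range is not None:
        low, high = value_range
        return low <= value <= high
    return False

def mean_of_valid(values, codes=(), value_range=None):
    """Average only the values that are neither default-missing nor user-missing."""
    valid = [v for v in values
             if v is not None and not is_user_missing(v, codes, value_range)]
    return sum(valid) / len(valid) if valid else None

ages = [25, 99, 31, None, 98, 40]          # 98 and 99 are hypothetical missing codes
print(mean_of_valid(ages, codes=(98, 99)))
```

Without the missing codes, 98 and 99 would be averaged in as if they were real ages.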

8. Columns
The width of each column in the Data View spreadsheet. Note that this is not the
same as the number of digits displayed for each value.

This simply refers to the width of the actual column in the spreadsheet.

9. Align
The alignment of content in the cells of the SPSS Data View spreadsheet. Options
include left-justified, right-justified, or center-justified.

10. Measure
The level of measurement for the variable (e.g., nominal, ordinal, or scale). It is vital
that you correctly define each variable's measurement level. This setting affects
everything from graphs to internal algorithms for statistical analysis. Incorrectly
specifying measurement level can have unintended and potentially disastrous effects
on your results.
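Why the measurement level matters can be seen in which summary makes sense for a variable. The Python sketch below follows the common textbook convention (mode for nominal, median for ordinal, mean for scale); it is an illustration under that assumption, not a description of SPSS's internal algorithms, and `summarize` is a hypothetical name.

```python
from statistics import mean, median, mode

def summarize(values, measure):
    """Pick a typical-value summary that suits the measurement level."""
    if measure == "nominal":
        return mode(values)      # most frequent category
    if measure == "ordinal":
        return median(values)    # middle rank
    if measure == "scale":
        return mean(values)      # arithmetic average
    raise ValueError("measure must be 'nominal', 'ordinal', or 'scale'")

region_codes = [1, 2, 2, 3]      # hypothetical codes for regions (nominal)
print(summarize(region_codes, "nominal"))
print(summarize([10, 20, 30], "scale"))
```

Averaging the region codes would run without error but would mean nothing, which is the kind of mistake a wrong Measure setting invites.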

11. Role
The role that a variable will play in your analyses (i.e., independent variable,
dependent variable, both independent and dependent).
11.1 Input: The variable will be used as a predictor (independent variable). This
is the default assignment for variables.
11.2 Target: The variable will be used as an outcome (dependent variable).
11.3 Both: The variable will be used as both a predictor and an outcome
(independent and dependent variable).
11.4 None: The variable has no role assignment.
11.5 Partition: The variable will partition the data into separate samples.
11.6 Split: Used with IBM® SPSS® Modeler (not IBM® SPSS® Statistics).

Important variable properties



2.7 Output Window


[Figure: The Output window, with callouts labeled “Output Outline” and “Output items”]

Output Outline composition:


Log – records the syntax/command that was run
[Method] – the method used in the analysis (e.g., T-test)
Title – contains the title header of the output items
Notes – contains the notes (if any)
[Results 1] – displays a result of the method used
[Results 2] – displays a further result of the method used

[Figure: Sample output with callouts labeled “Log”, “Title”, “Result 1”, “Result 2”, and “Result 3”]

2.8 Opening/Importing Data



Opening a data file
There are several ways to open a data file in SPSS:
Method 1. Double-click the file to open it directly, and choose SPSS when prompted.
Method 2. Open SPSS, then drag and drop the SPSS file onto the SPSS Data Editor
window.
Method 3. Use the Menu Bar (under the File menu). This is most useful when opening data
that are not in SPSS format (Excel, database, text, etc.).
Method 4. Use syntax/commands.
To open a file using the File menu, go to:

File > Open > Data

Supported file types:


SPSS, Excel
Lotus, dBase (Database)
SAS, Stata, Text files

To open a file using commands/syntax, use the “GET DATA” or “GET” command.

NOTE: Make sure you know the file type (Excel, Stata, SAS, text, etc.), the file location,
and the sheet name (for Excel files).

2.9 SPSS File Formats



Three (3) main file types of SPSS

.sav – SPSS data file, containing data/datasets

.sps – SPSS syntax file, containing a series of SPSS commands that can be executed

.spv – SPSS output file, containing the output generated by running analyses in SPSS
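When sorting through a project folder, the three extensions above tell you what each file holds. This is a small Python sketch of that lookup; the function name and the file names are hypothetical.

```python
from pathlib import Path

# The three main SPSS file types listed above.
SPSS_FILE_TYPES = {
    ".sav": "data file",
    ".sps": "syntax file",
    ".spv": "output file",
}

def describe_spss_file(filename):
    """Name the kind of SPSS file based on its extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    return SPSS_FILE_TYPES.get(suffix, "not one of the three main SPSS file types")

for name in ["survey.sav", "cleaning.sps", "results.spv", "notes.txt"]:
    print(name, "->", describe_spss_file(name))
```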

2.10 Installing SPSS



Minimum system requirements


- Intel or AMD processor running at 1 gigahertz (GHz) or higher.

- 1 gigabyte (GB) of RAM or more

- 800 megabytes (MB) of available hard-disk space.

If you install more than one help language, each additional language requires 60-70 MB
of disk space.
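The disk-space requirement above is simple arithmetic: 800 MB, plus 60-70 MB for each help language beyond the first. A short Python sketch of that calculation follows; the function name is hypothetical.

```python
BASE_MB = 800                   # minimum hard-disk space from the slide
EXTRA_LANGUAGE_MB = (60, 70)    # low and high estimate per additional help language

def disk_space_range_mb(help_languages):
    """Return (low, high) disk space in MB for the given number of help languages."""
    extra = max(help_languages - 1, 0)   # only languages beyond the first add space
    low, high = EXTRA_LANGUAGE_MB
    return (BASE_MB + extra * low, BASE_MB + extra * high)

print(disk_space_range_mb(1))
print(disk_space_range_mb(3))
```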

Double-click (or right-click and choose “Open” or “Install”) the IBM SPSS Statistics xx
file. Make sure to select the file whose type is Windows Installer.

Click “Install” to start the installation of the software on the computer.

Once done, a window will pop up showing that the installation was successful.

Software licensing will then follow. The licensing steps will depend on the
license that you key in.

References:
Kent State University Libraries. (2017, May 15). SPSS tutorials. Retrieved November 17, 2020, from
https://libguides.library.kent.edu/SPSS/
Masuadi, E., Mohamud, M., Almutairi, M., Alsunaidi, A., Alswayed, A. K., & Aldhafeeri, O. F. (2021). Trends in the
Usage of Statistical Software and Their Associated Study Designs in Health Sciences Research: A Bibliometric
Analysis. Cureus, 13(1), e12639. https://doi.org/10.7759/cureus.12639
Nie, N. H. (1975). SPSS: Statistical package for the social sciences. New York: McGraw-Hill.
Ozgur, C., Dou, M., Li, Y., & Rogers, G. (2017). Selection of statistical software for solving big data
problems for teaching. Journal of Modern Applied Statistical Methods, 16.
http://www.spss.com.hk/corpinfo/history.htm
http://www.unige.ch/ses/sococ/cl/bib/qual/spss.history.html?
THANK YOU!
