
Principal Components and Factor Analysis with SPSS

Biostatistics Consulting Center

蔡培癸

Download the Instructional Documents

• Point your browser to http://core.ecu.edu/psyc/wuenschk/SPSS/SPS.
• Click on Principal Components Analysis.
• Save, Desktop, Save.
• Do the same for Factor Analysis.

When to Use PCA

• You have a set of p continuous variables.
• You want to repackage their variance into m components.
• You will usually want m to be < p, but not always.

Components and Variables

• Each component is a weighted linear combination of the variables:

  Ci = Wi1X1 + Wi2X2 + … + WipXp

• Each variable is a weighted linear combination of the components:

  Xj = A1jC1 + A2jC2 + … + AmjCm
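These two relations are easy to see numerically. A minimal numpy sketch on toy data (my own illustration, not part of the SPSS workflow): the weight vectors Wi are the eigenvectors of the correlation matrix, and the resulting components are uncorrelated, with variances equal to the eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))                # 100 cases, p = 3 variables (toy data)
Z = (X - X.mean(axis=0)) / X.std(axis=0)     # standardize each variable

R = np.corrcoef(Z, rowvar=False)             # p x p correlation matrix
eigvals, W = np.linalg.eigh(R)               # columns of W are the weight vectors
order = np.argsort(eigvals)[::-1]            # sort components by variance captured
eigvals, W = eigvals[order], W[:, order]

# Each component score is a weighted linear combination of the variables:
# C_i = W_1i*Z_1 + W_2i*Z_2 + ... + W_pi*Z_p
C = Z @ W
# The components are orthogonal (uncorrelated), and the variance of each
# component equals its eigenvalue; the eigenvalues sum to p.
```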

Factors and Variables

• In Factor Analysis, we exclude from the solution any variance that is unique, not shared by the variables:

  Xj = A1jF1 + A2jF2 + … + AmjFm + Uj

• Uj is the unique variance for Xj.

Goals of PCA and FA

• Data reduction.
• Discover and summarize the pattern of intercorrelations among variables.
• Test theory about the latent variables underlying a set of measurement variables.
• Construct a test instrument.
• There are many other uses of PCA and FA.

Data Reduction

• Ossenkopp and Mazmanian (Physiology and Behavior, 34: 935-941).
• 19 behavioral and physiological variables.
• A single criterion variable: physiological response to four hours of cold-restraint.
• Extracted five factors.
• Used multiple regression to develop a model for predicting the criterion from the five factors.

Exploratory Factor Analysis

• Want to discover the pattern of intercorrelations among variables.
• Wilt et al., 2005 (thesis).
• Variables are items on the SOIS at ECU.
• Found two factors: one evaluative, one on the difficulty of the course.
• Compared FTF students to DE students, on structure and means.

Confirmatory Factor Analysis

• Have a theory regarding the factor structure for a set of variables.
• Want to confirm that the theory describes the observed intercorrelations well.
• Thurstone: Intelligence consists of seven independent factors rather than one global factor.

Construct Test Instrument

• Write a large set of items designed to test the constructs of interest.
• Administer the survey to a sample of persons from the target population.
• Use FA to help select those items that will be used to measure each of the constructs of interest.
• Use Cronbach alpha to check the reliability of the resulting scales.
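Cronbach's alpha itself is a one-line formula, α = k/(k−1) · (1 − Σ var(itemᵢ) / var(total)). A minimal sketch of the computation (the simulated items are hypothetical, only there to exercise the function):

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for an (n_cases, k_items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of the scale total
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Toy check: four items sharing a common true score should be internally
# consistent, while pure-noise items should not (hypothetical data).
rng = np.random.default_rng(1)
true_score = rng.normal(size=500)
items = true_score[:, None] + 0.5 * rng.normal(size=(500, 4))
alpha = cronbach_alpha(items)
```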

An Unusual Use of PCA

• Poulson, Braithwaite, Brondino, and Wuensch (1997, Journal of Social Behavior and Personality, 12, 743-758).
• Simulated jury trial; a seemingly insane defendant killed a man.
• Criterion variable = recommended verdict:
  – Guilty
  – Guilty But Mentally Ill
  – Not Guilty By Reason of Insanity
• Predictor variables = jurors' scores on 8 scales.
• Discriminant function analysis.
• Problem with multicollinearity.
• Used PCA to extract eight orthogonal components.
• Predicted recommended verdict from these 8 components.
• Transformed results back to the original scales.

A Simple, Contrived Example

• Consumers rate the importance of seven characteristics of beer:
  – low Cost
  – high Size of bottle
  – high Alcohol content
  – Reputation of brand
  – Color
  – Aroma
  – Taste
• Download FACTBEER.SAV from http://core.ecu.edu/psyc/wuenschk/SPSS/SPS.

• Analyze, Data Reduction, Factor.
• Scoot the beer variables into the box.
• Click Descriptives and then check Initial Solution, Coefficients, and KMO and Bartlett's Test of Sphericity. Click Continue.
• Click Extraction and then select Principal Components, Correlation Matrix, Unrotated Factor Solution, Scree Plot, and Eigenvalues Over 1. Click Continue.
• Click Rotation. Select Varimax and Rotated Solution. Click Continue.
• Click Options. Select Exclude Cases Listwise and Sorted By Size. Click Continue.
• Click OK, and SPSS completes the Principal Components Analysis.
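The extraction step that SPSS performs here can be sketched outside SPSS as well. A minimal numpy version (the `data` matrix below is hypothetical stand-in data, not FACTBEER.SAV): compute the correlation matrix, take its eigenvalues, and apply the eigenvalue-over-1 criterion.

```python
import numpy as np

# Hypothetical stand-in for the FACTBEER.SAV variables: 220 raters x 7 items.
rng = np.random.default_rng(42)
data = rng.normal(size=(220, 7))

R = np.corrcoef(data, rowvar=False)      # analyze the correlation matrix
eigvals = np.linalg.eigvalsh(R)[::-1]    # eigenvalues, largest first

# The total standardized variance equals p, so eigenvalue / p is the
# proportion of variance captured by each component.
prop = eigvals / len(eigvals)
retain = eigvals >= 1                    # "Eigenvalues Over 1" criterion
```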

Checking for Unique Variables

• Check the correlation matrix.
• If there are any variables not well correlated with some others, you might as well delete them.
• Bartlett's test of sphericity tests the null hypothesis that the matrix is an identity matrix, but it does not help identify individual variables that are not well correlated with others.
• For each variable, check R² between it and the remaining variables.
• Look at partial correlations: variables with large partial correlations share variance with one another but not with the remaining variables. This is problematic.
• Kaiser's MSA will tell you, for each variable, how much of this problem exists.
• The smaller the MSA, the greater the problem.
• An MSA of .9 is marvelous; .5, miserable.
• Use SAS to get the partial correlations and individual MSAs.
• SPSS gives only an overall MSA, which is of no use in identifying problematic variables.
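The per-variable MSAs are not hard to compute by hand, since the partial correlations fall out of the inverse of the correlation matrix. A sketch (the function name `msa` is my own; this mirrors the per-variable values SAS reports):

```python
import numpy as np

def msa(R):
    """Kaiser's Measure of Sampling Adequacy, overall and per variable.

    Partial correlations come from the inverse of the correlation matrix:
    p_jk = -Rinv_jk / sqrt(Rinv_jj * Rinv_kk).
    MSA_j = sum_k r_jk^2 / (sum_k r_jk^2 + sum_k p_jk^2), k != j.
    """
    R = np.asarray(R, dtype=float)
    Rinv = np.linalg.inv(R)
    d = np.sqrt(np.diag(Rinv))
    P = -Rinv / np.outer(d, d)           # partial correlation matrix
    np.fill_diagonal(P, 0.0)
    R0 = R - np.eye(len(R))              # zero-diagonal copy of R
    r2, p2 = (R0 ** 2).sum(axis=0), (P ** 2).sum(axis=0)
    return r2.sum() / (r2.sum() + p2.sum()), r2 / (r2 + p2)
```

The smaller the partial correlations are relative to the zero-order correlations, the closer each MSA is to 1.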

KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy: .665
Bartlett's Test of Sphericity: Approx. Chi-Square = 1637.9, df = 21, Sig. = .000

Extracting Principal Components

• From p variables we can extract p components.
• Each of the p eigenvalues represents the amount of standardized variance that has been captured by one component.
• The first component accounts for the largest possible amount of variance.
• The second captures as much as possible of what is left over, and so on.
• Each is orthogonal to the others.
• Each variable has standardized variance = 1.
• The total standardized variance in the p variables = p.
• The sum of the m = p eigenvalues = p.
• All of the variance is extracted.
• For each component, the proportion of variance extracted = eigenvalue / p.

• For our beer data, here are the eigenvalues and proportions of variance for the seven components:

Initial Eigenvalues
Component   Total    % of Variance   Cumulative %
1           3.313    47.327           47.327
2           2.616    37.369           84.696
3            .575     8.209           92.905
4            .240     3.427           96.332
5            .134     1.921           98.252
6           9.E-02    1.221           99.473
7           4.E-02     .527          100.000
Extraction Method: Principal Component Analysis.

How Many Components to Retain

• From p variables we can extract p components.
• We probably want fewer than p.
• Simple rule: keep as many as have eigenvalues ≥ 1.
• A component with eigenvalue < 1 captured less than one variable's worth of variance.
• Visual aid: use a scree plot.
• Scree is the rubble at the base of a cliff.
• For our beer data:

Scree Plot
[Figure: scree plot of eigenvalue (y-axis, 0.0 to 3.5) against component number (x-axis, 1 to 7).]

• Only the first two components have eigenvalues greater than 1.
• Big drop in eigenvalue between component 2 and component 3.
• Components 3-7 are scree.
• Try a 2 component solution.
• Should also look at the solutions with one fewer and with one more component.
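Both retention rules can be applied mechanically to the eigenvalues in the table above. A small check (reading the last two tabled values, 9.E-02 and 4.E-02, as roughly .09 and .04):

```python
import numpy as np

# Eigenvalues from the beer analysis (Initial Eigenvalues table)
eig = np.array([3.313, 2.616, .575, .240, .134, .090, .040])

n_kaiser = int((eig >= 1).sum())   # Kaiser rule: count eigenvalues >= 1
drops = eig[:-1] - eig[1:]         # successive drops, for the scree logic
# Kaiser rule keeps 2 components, and the largest drop falls between
# components 2 and 3 -- the scree starts at component 3.
```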

Loadings, Unrotated and Rotated

• Loading matrix = factor pattern matrix = component matrix.
• Each loading is the Pearson r between one variable and one component.
• Since the components are orthogonal, each loading is also a β weight for predicting X from the components.
• Here are the unrotated loadings for our 2 component solution:

Component Matrix(a)

            Component 1   Component 2
COLOR          .760          -.576
AROMA          .736          -.614
REPUTAT       -.735          -.071
TASTE          .710          -.646
COST           .550           .734
ALCOHOL        .632           .699
SIZE           .667           .675

Extraction Method: Principal Component Analysis.
a. 2 components extracted.

• All variables load well on the first component: economy and quality vs. reputation.
• The second component is more interesting: economy versus quality.
• Rotate these axes so that the two dimensions pass more nearly through the two major clusters (COST, SIZE, ALCOHOL and COLOR, AROMA, TASTE).
• The number of degrees by which I rotate the axes is the angle PSI. For these data, rotating the axes -40.63 degrees has the desired effect.
• Component 1 = quality versus reputation.
• Component 2 = economy (or cheap drunk) versus reputation.
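The -40.63 degree rotation can be checked numerically. A minimal numpy sketch (the loadings are the unrotated values from the Component Matrix; the sign convention for PSI is chosen to reproduce SPSS's rotated output):

```python
import numpy as np

# Unrotated loadings from the Component Matrix
# (rows: COLOR, AROMA, REPUTAT, TASTE, COST, ALCOHOL, SIZE)
L = np.array([[ .760, -.576],
              [ .736, -.614],
              [-.735, -.071],
              [ .710, -.646],
              [ .550,  .734],
              [ .632,  .699],
              [ .667,  .675]])

psi = np.radians(-40.63)                    # rotate the axes by -40.63 degrees
T = np.array([[np.cos(psi), -np.sin(psi)],  # plane rotation matrix
              [np.sin(psi),  np.cos(psi)]])
L_rot = L @ T                               # loadings on the rotated axes

# L_rot reproduces the Rotated Component Matrix (to rounding), and the
# rotation leaves each variable's communality (row sum of squares) unchanged.
```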

Rotated Component Matrix(a)

            Component 1   Component 2
TASTE          .960          -.028
AROMA          .958          1.E-02
COLOR          .952          6.E-02
SIZE          7.E-02          .947
ALCOHOL       2.E-02          .942
COST          -.061           .916
REPUTAT       -.512          -.533

Extraction Method: Principal Component Analysis.
Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 3 iterations.

Number of Components in the Rotated Solution

• Try extracting one fewer component; try one more component.
• Which produces the more sensible solution?
• Error = difference between the obtained structure and the true structure.
• Overextraction (too many components) produces less error than underextraction.
• If there is only one true factor and no unique variables, you can get "factor splitting."
• In this case, the first unrotated factor ≅ the true factor.
• But rotation splits the factor, producing an imaginary second factor and corrupting the first.
• Can avoid this problem by including a garbage variable that will be removed prior to the final solution.

Explained Variance

• Square the loadings and then sum them across variables.
• This gives, for each component, the amount of variance explained.
• Prior to rotation, these sums are the eigenvalues.
• Here are the SSLs for our data, after rotation:

Total Variance Explained

Rotation Sums of Squared Loadings
Component   Total   % of Variance   Cumulative %
1           3.017   43.101          43.101
2           2.912   41.595          84.696
Extraction Method: Principal Component Analysis.

• After rotation the two components together account for (3.02 + 2.91) / 7 = 85% of the total variance.
• If the last component has a small SSL, one should consider dropping it.
• If SSL = 1, the component has extracted one variable's worth of variance.
• If only one variable loads well on a component, the component is not well defined.
• If only two load well, it may be reliable, if the two variables are highly correlated with one another but not with the other variables.

Naming Components

• For each component, look at how it is correlated with the variables.
• Try to name the construct represented by that factor.
• If you cannot, perhaps you should try a different solution.
• I have named our components "aesthetic quality" and "cheap drunk."

Communalities

• For each variable, sum the squared loadings across components.
• This gives you the R² for predicting the variable from the components,
• which is the proportion of the variable's variance which has been extracted by the components.
• Here are the communalities for our beer data. "Initial" is with all 7 components; "Extraction" is for our 2 component solution.

Communalities

           Initial   Extraction
COST        1.000      .842
SIZE        1.000      .901
ALCOHOL     1.000      .889
REPUTAT     1.000      .546
COLOR       1.000      .910
AROMA       1.000      .918
TASTE       1.000      .922

Extraction Method: Principal Component Analysis.
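Both the communalities and the rotated SSLs can be recomputed directly from the Rotated Component Matrix. A numpy check (reading the E-02 entries of that table as approximately .01, .06, .07, and .02):

```python
import numpy as np

# Rotated loadings (rows: TASTE, AROMA, COLOR, SIZE, ALCOHOL, COST, REPUTAT)
A = np.array([[ .960, -.028],
              [ .958,  .010],
              [ .952,  .060],
              [ .070,  .947],
              [ .020,  .942],
              [-.061,  .916],
              [-.512, -.533]])

communalities = (A ** 2).sum(axis=1)  # per variable: R^2 from the 2 components
ssl = (A ** 2).sum(axis=0)            # per component: sum of squared loadings
# communalities match the Extraction column of the Communalities table,
# and ssl matches the rotation sums of squared loadings, 3.017 and 2.912.
```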

Orthogonal Rotations

• Varimax: minimize the complexity of the components by making the large loadings larger and the small loadings smaller within each component.
• Quartimax: makes large loadings larger and small loadings smaller within each variable.
• Equamax: a compromise between these two.
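None of these criteria is mysterious; varimax, for example, fits in a few lines. A sketch of the standard SVD-based varimax iteration (my own implementation, without the Kaiser row-normalization SPSS applies, so results can differ slightly from SPSS output):

```python
import numpy as np

def varimax(L, max_iter=100, tol=1e-8):
    """Varimax-rotate a loading matrix L (variables x components):
    find the orthogonal T maximizing the variance of the squared
    loadings within each column of L @ T (Kaiser, 1958)."""
    p, k = L.shape
    T = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ T
        u, s, vt = np.linalg.svd(
            L.T @ (Lr ** 3 - Lr @ np.diag((Lr ** 2).sum(axis=0)) / p))
        T = u @ vt                      # best orthogonal update
        d_old, d = d, s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break                       # converged
    return L @ T

# Applied to the unrotated beer loadings from the Component Matrix:
L = np.array([[ .760, -.576], [ .736, -.614], [-.735, -.071],
              [ .710, -.646], [ .550,  .734], [ .632,  .699],
              [ .667,  .675]])
L_rot = varimax(L)
# The rotation is orthogonal, so each variable's communality is unchanged,
# and the varimax criterion cannot decrease.
```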

Oblique Rotations

• Axes drawn between the two clusters in the upper right quadrant would not be perpendicular.
• We may better fit the data with axes that are not perpendicular, but at the cost of having components that are correlated with one another.
• More on this later.
