MV - Factor

Attribution Non-Commercial (BY-NC)

Faculty of Science University of Colombo

Session 3 Factor Analysis

Factor Analysis

Analysis of Interdependence: for data reduction and the discovery of underlying themes in the data

The term factor analysis was first introduced by Thurstone (1931). Factor analysis tries to simplify attitudinal data by providing an alternative way of looking at it.

What are the main underlying themes in the data? Which perceptions are related?

FA is based on analysing the correlation matrix of the attributes and aims to identify questions that measure what respondents see as similar or related concepts. Essentially, factor analysis is applied as a data reduction or structure detection method.

Illustration

- Can a set of 30 imagery statements for the shampoo category be simplified without any loss of information?
- There seem to be as many as 42 purchase decision criteria for my category. Can you help summarize these criteria?
- What would Customer Service Orientation consist of? Which variables? Can the variables be grouped into themes / dimensions?
- I want to deploy an objectively tested, valid scale for my Customer Satisfaction studies. My team has developed a huge battery of statements. Can you help?

Factor Analysis

- Investigates interrelationships among variables.
- Variable reduction exercise: reduces the variables to a smaller set of factors with minimal loss of information.
- Used to define or discover themes or underlying (latent) dimensions of a large set of attributes / variables.
- Often an intermediate step to some other procedure; for example, factors are used as independent variables in Multiple Regression.
- Interdependence technique: no variable is designated dependent or independent.
- All variables should be metric (interval).
- Large samples are preferred.
- The worth of the solution often depends on the intuitive interpretability of the factors rather than on statistical rules.

To understand the principle behind factor analysis, consider two variables from a study measuring people's satisfaction with their lives:

1. How satisfied are they with their hobbies?
2. How intensely are they pursuing their hobbies?

One can summarize the correlation between the two variables in a scatter plot. A regression line can be fitted that represents the best summary of the linear relationship between the two variables.

[Figure: Simple Regression. Scatter plot with fitted regression line; x-axis: Satisfaction with Hobbies; y-axis: Intensity of pursuing Hobby; 7-point scales. Ajay Macaden]

If we could define a variable representing the regression line, that variable would capture the essence of the two variables.

In a sense, we have reduced two variables to one factor.

Combining two correlated variables into one factor illustrates the basic idea of Factor Analysis or Principal Component Analysis (PCA). If we extend the two-variable example to multiple variables, the computations become more involved, but the basic principle of representing two or more variables by a single factor remains the same.
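The two-variable reduction can be sketched numerically. The ratings below are invented for illustration, and the use of Python with NumPy is an assumption of this sketch rather than anything prescribed by the slides. The first principal component of the correlation matrix plays the role of the single summary variable.

```python
import numpy as np

# Hypothetical 7-point ratings for eight respondents (invented for illustration).
satisfaction = np.array([2, 3, 3, 4, 5, 5, 6, 7], dtype=float)
intensity = np.array([1, 3, 2, 4, 4, 6, 6, 7], dtype=float)

# Standardize so both variables have mean 0 and variance 1.
X = np.column_stack([satisfaction, intensity])
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Correlation matrix of the two variables.
R = np.corrcoef(Z, rowvar=False)
r = R[0, 1]

# eigh returns eigenvalues in ascending order; reverse to put the largest first.
eigenvalues, eigenvectors = np.linalg.eigh(R)
eigenvalues = eigenvalues[::-1]
eigenvectors = eigenvectors[:, ::-1]

# Scores on the first component: one new variable summarizing both originals.
factor_scores = Z @ eigenvectors[:, 0]

share_explained = eigenvalues[0] / eigenvalues.sum()
print(f"correlation = {r:.3f}")
print(f"first component explains {share_explained:.1%} of total variance")
```

For two standardized variables with positive correlation r, the first eigenvalue equals 1 + r, so the stronger the correlation, the more of the total variance the single factor captures.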

Orthogonal Factors

After we have found the line on which the variance is maximal, there remains variability around this line. We continue and define another line that maximizes the remaining variability. In this manner consecutive factors are extracted. Because each factor is defined to maximize the variability that is not captured by the preceding factors, consecutive factors are independent of each other.

Put another way, consecutive factors are uncorrelated, or ORTHOGONAL, to each other.
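Orthogonality can be checked directly. The following is a minimal sketch on simulated data (the data, the seed and the variable names are all hypothetical): components are extracted from the correlation matrix, and the scores on different components turn out to be uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: 200 respondents, 4 variables driven by 2 underlying themes.
base = rng.normal(size=(200, 2))
X = np.column_stack([
    base[:, 0] + 0.3 * rng.normal(size=200),
    base[:, 0] + 0.3 * rng.normal(size=200),
    base[:, 1] + 0.3 * rng.normal(size=200),
    base[:, 1] + 0.3 * rng.normal(size=200),
])
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Extract components from the correlation matrix, largest eigenvalue first.
R = np.corrcoef(Z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Component scores for every respondent, and their correlation matrix.
scores = Z @ eigenvectors
score_corr = np.corrcoef(scores, rowvar=False)

# Largest correlation between two different components (zero up to rounding).
off_diagonal = score_corr - np.diag(np.diag(score_corr))
max_off = np.abs(off_diagonal).max()
print("largest correlation between different components:", max_off)
```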

Customers were asked to rate bus travel on a number of attributes on a 10-point scale:
1 = Does not describe bus travel at all
10 = Totally describes bus travel

Attributes: Relaxed, Friendly, Nervous, Tolerate it, Easy, Interesting, Uncertain, Waste of time

Which statements did they rate similarly, i.e. which statements are correlated? These point to common themes in the data.

[Table: Correlations among Q1 Relaxed, Q2 Friendly, Q3 Nervous, Q4 Tolerate it, Q5 Easy, Q6 Interesting, Q7 Uncertain and Q8 Waste of time. Values not reproduced.]

Component matrix

Statement            Component 1   Component 2
Q2 Friendly             0.823
Q1 Relaxed              0.803
Q6 Interesting          0.732
Q5 Easy                 0.725
Q4 Tolerate it          0.456
Q7 Uncertain                          0.767
Q3 Nervous                            0.697
Q8 Waste of time                      0.691

Smaller cross-loadings also shown, rows not recoverable: -0.186, -0.265, 0.253, -0.144.

The first four statements load mainly on the first factor: positive about bus travel.
The other four load on the second factor: negative about bus travel.
"Tolerate it" loads on both.

Component matrix, main loadings only

Statement            Component 1   Component 2
Q2 Friendly             0.82
Q1 Relaxed              0.80
Q6 Interesting          0.73
Q5 Easy                 0.72
Q4 Tolerate it          0.45
Q7 Uncertain                          0.77
Q3 Nervous                            0.70
Q8 Waste of time                      0.70

The first four statements load mainly on the first factor: positive about bus travel.
The other four load on the second factor: negative about bus travel.
"Tolerate it" loads on both.

Principal components analysis (PCA):

The most common form of factor analysis, PCA seeks a linear combination of variables such that the maximum variance is extracted from the variables. It then removes this variance and seeks a second linear combination which explains the maximum proportion of the remaining variance, and so on. This is called the principal axis method and results in orthogonal (uncorrelated) factors. PCA analyzes total (common and unique) variance.

[Table: Correlations, grouped. Row and column order: Q1 Relaxed, Q2 Friendly, Q5 Easy, Q6 Interesting, Q4 Tolerate it, Q3 Nervous, Q7 Uncertain, Q8 Waste of time. Values not reproduced.]

Total Variance Explained

            Initial Eigenvalues            Extraction Sums of Squared Loadings   Rotation Sums of Squared Loadings
Component   Total   % of Var.   Cum. %     Total   % of Var.   Cum. %            Total   % of Var.   Cum. %
 1          5.459   49.628       49.628    5.459   49.628      49.628            2.894   26.312      26.312
 2          1.249   11.359       60.986    1.249   11.359      60.986            2.634   23.948      50.260
 3           .900    8.179       69.165     .900    8.179      69.165            2.080   18.905      69.165
 4           .830    7.546       76.711
 5           .631    5.736       82.448
 6           .478    4.348       86.795
 7           .431    3.917       90.713
 8           .353    3.208       93.921
 9           .295    2.682       96.603
10           .204    1.850       98.453
11           .170    1.547      100.000

Extraction Method: Principal Component Analysis.

How much of the total variation in the data is explained by the factors? The factors should explain at least 2/3 of the variance. In this data, the first three factors explain 69% of the variance.
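The percentage columns of the Total Variance Explained table follow from the eigenvalues alone: each eigenvalue is divided by the number of variables (11 here). A minimal sketch, assuming Python with NumPy and using the eigenvalues printed in the table:

```python
import numpy as np

# Initial eigenvalues for the 11 components, as printed in the
# Total Variance Explained table.
eigenvalues = np.array([5.459, 1.249, 0.900, 0.830, 0.631, 0.478,
                        0.431, 0.353, 0.295, 0.204, 0.170])

# With standardized variables the total variance equals the number of variables.
n_variables = len(eigenvalues)
pct_variance = 100 * eigenvalues / n_variables
cumulative_pct = np.cumsum(pct_variance)

for k, (ev, pct, cum) in enumerate(
        zip(eigenvalues, pct_variance, cumulative_pct), start=1):
    print(f"{k:>2}  {ev:6.3f}  {pct:7.3f}  {cum:8.3f}")
```

Small differences in the last decimal place against the printed table are expected, because the eigenvalues are themselves rounded to three decimals.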

Rotated Component Matrix(a)

      Component
   1      2      3     Statement
 .865   .257  -.006    Offers value-for-money products and services
 .836   .101   .192    Offers wide range of products and services to suit different needs
 .741   .197   .432    Progressive and provides innovative insurance solutions
 .657   .326   .267    Has expertise in providing insurance solutions
 .251   .849   .086    A reputable insurance provider/company
 .187   .809   .208    An insurance company I can trust
 .425   .593   .283    Global insurance company
 .074   .575   .458    An insurance company with financial strength
 .172   .086   .821    One of the insurance companies that I would first recommend to my customers
 .200   .342   .689    Has strong working relationships with its distributors/intermediaries
 .334   .481   .543    Established local insurance company

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.

Review factor loadings to decipher the factors. The factor loadings are the correlations between the factor and the attribute.
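The statement that loadings are correlations can be verified numerically. The sketch below uses simulated data and unrotated principal components (both are assumptions of the sketch, as is the use of NumPy): the loading of a variable on a component, computed as the eigenvector entry times the square root of the eigenvalue, matches the correlation between that variable and the component scores.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated ratings: 300 respondents, 5 variables sharing one common theme.
common = rng.normal(size=(300, 1))
X = common + rng.normal(size=(300, 5))
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

R = np.corrcoef(Z, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(R)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Unrotated loadings: each eigenvector scaled by the square root of its eigenvalue.
loadings = eigenvectors * np.sqrt(eigenvalues)

# Direct check: correlate every variable with the first component's scores.
scores = Z @ eigenvectors
direct = np.array([np.corrcoef(Z[:, j], scores[:, 0])[0, 1]
                   for j in range(Z.shape[1])])

print(np.round(loadings[:, 0], 3))
print(np.round(direct, 3))
```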

A three-factor solution is selected for these data:

1. Practical solutions
2. Reputation
3. Distribution/how well established

Key Concepts

Eigenvalue

Eigenvalues are also called characteristic roots. The eigenvalue for a given factor measures the variance in all the variables which is accounted for by that factor. The ratio of eigenvalues is the ratio of explanatory importance of the factors with respect to the variables. If a factor has a low eigenvalue, then it is contributing little to the explanation of variances in the variables and may be ignored as redundant with more important factors.

Variance Explained

To get the percent of variance in all the variables accounted for by each factor, sum the squared factor loadings for that factor (column) and divide by the number of variables. (Note that the number of variables equals the sum of their variances, as the variance of a standardized variable is 1.) For an unrotated solution this is the same as dividing the factor's eigenvalue by the number of variables.
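As a worked check of this rule, the sketch below (Python with NumPy, an assumption of the sketch) sums the squared loadings of the rotated insurance-company matrix column by column. The results agree, within rounding of the printed loadings, with the Rotation Sums of Squared Loadings reported earlier (2.894, 2.634, 2.080).

```python
import numpy as np

# Rotated loadings (11 statements x 3 components) from the
# Rotated Component Matrix shown earlier in these slides.
loadings = np.array([
    [0.865, 0.257, -0.006],
    [0.836, 0.101, 0.192],
    [0.741, 0.197, 0.432],
    [0.657, 0.326, 0.267],
    [0.251, 0.849, 0.086],
    [0.187, 0.809, 0.208],
    [0.425, 0.593, 0.283],
    [0.074, 0.575, 0.458],
    [0.172, 0.086, 0.821],
    [0.200, 0.342, 0.689],
    [0.334, 0.481, 0.543],
])

n_variables = loadings.shape[0]

# Sum of squared loadings per factor (column), then percent of total variance.
ssl = (loadings ** 2).sum(axis=0)
pct = 100 * ssl / n_variables

print("sums of squared loadings:", np.round(ssl, 2))
print("percent of variance:     ", np.round(pct, 1))
print("cumulative percent:      ", np.round(pct.sum(), 1))
```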

Types of Rotation

Varimax rotation is an orthogonal rotation of the factor axes to maximize the variance of the squared loadings of a factor (column) on all the variables (rows) in a factor matrix, which has the effect of differentiating the original variables by extracted factor. Each factor will tend to have either large or small loadings of any particular variable. A varimax solution yields results which make it as easy as possible to identify each variable with a single factor. This is the most common rotation option.
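A minimal sketch of a varimax rotation, assuming Python with NumPy. It implements raw varimax through a widely used SVD-based iteration and omits the Kaiser normalization that the output shown earlier applies, so it illustrates the idea rather than reproducing any particular package. The function names and the example loadings are hypothetical.

```python
import numpy as np

def varimax(loadings, max_iter=500, tol=1e-10):
    """Raw varimax rotation (no Kaiser normalization). Returns (rotated, T)."""
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    T = np.eye(k)          # rotation matrix, orthogonal at every step
    objective = 0.0
    for _ in range(max_iter):
        rotated = L @ T
        # Target matrix derived from the gradient of the varimax criterion.
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / p
        U, s, Vt = np.linalg.svd(L.T @ target)
        T = U @ Vt         # orthogonal factor of the SVD
        if s.sum() < objective * (1 + tol):
            break          # no meaningful improvement any more
        objective = s.sum()
    return L @ T, T

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return (np.asarray(loadings, dtype=float) ** 2).var(axis=0).sum()

# Hypothetical example: a clean two-factor pattern, deliberately rotated by
# 30 degrees so that every variable loads on both axes.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.9, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
angle = np.deg2rad(30)
mix = np.array([[np.cos(angle), np.sin(angle)],
                [-np.sin(angle), np.cos(angle)]])
unrotated = simple @ mix

rotated, T = varimax(unrotated)
print(np.round(rotated, 2))
print("criterion before:", varimax_criterion(unrotated))
print("criterion after: ", varimax_criterion(rotated))
```

Because the rotation is orthogonal, each variable's communality (the row sum of its squared loadings) is unchanged; only the way the variance is shared among the factors changes.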

Promax rotation is an alternative non-orthogonal rotation method which is computationally faster than the direct oblimin method and is therefore sometimes used for very large datasets.

There are two criteria for choosing the number of factors.

The Kaiser Criterion

We retain only factors with an eigenvalue greater than 1. In essence this is like saying that, unless a factor extracts at least as much as the equivalent of one original variable, we drop it.

The Scree Test

A graphical method is the scree test. We plot the eigenvalues in a simple line plot and look for the place where the smooth decrease of eigenvalues appears to level off to the right of the plot. To the right of this point, presumably, one finds only "factorial scree"; scree is the geological term for the debris which collects on the lower part of a rocky slope.

The scree plot helps you to determine the optimal number of components. The eigenvalue of each component in the initial solution is plotted. Generally, the components on the steep slope are extracted.
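Both criteria can be applied to the eigenvalues of the insurance example from the Total Variance Explained table. This is a minimal sketch assuming Python with NumPy; the drops between consecutive eigenvalues are used as a rough numeric stand-in for reading the scree plot by eye.

```python
import numpy as np

# Initial eigenvalues from the Total Variance Explained table.
eigenvalues = np.array([5.459, 1.249, 0.900, 0.830, 0.631, 0.478,
                        0.431, 0.353, 0.295, 0.204, 0.170])

# Kaiser criterion: keep components whose eigenvalue exceeds 1.
n_kaiser = int((eigenvalues > 1).sum())

# Scree reading: how much each eigenvalue drops relative to the previous one.
# Large drops mark the steep slope; small, even drops mark the scree.
drops = -np.diff(eigenvalues)

print("Kaiser criterion retains", n_kaiser, "components")
print("drops between consecutive eigenvalues:", np.round(drops, 3))
```

On these eigenvalues the Kaiser criterion retains two components, whereas the example in these slides settles on three factors (the third eigenvalue is .900 and three factors reach 69% of the variance), which shows that the rules are guides rather than mechanical cut-offs.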

[Figure: Scree Plot. Eigenvalue (y-axis) plotted against Component Number (x-axis).]

Statistics to look at

KMO (Kaiser-Meyer-Olkin measure): should be more than 0.5

Tells whether the partial correlations between variables are small or large. It ranges from 0 to 1 and should be close to 1. Below 0.5 implies factor analysis won't be useful.

Bartlett's test of sphericity

Tells whether the variables are correlated or not. The significance level should be very small, say below 0.05, and NOT above 0.10.
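Both statistics can be computed from the correlation matrix. The sketch below uses the standard textbook formulas and assumes Python with NumPy; the function names, the 3 x 3 correlation matrix and the sample size are hypothetical. It returns Bartlett's chi-square statistic and degrees of freedom; turning these into a significance level requires a chi-square table or a statistics library.

```python
import numpy as np

def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure from a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    inv = np.linalg.inv(R)
    # Partial correlations (each pair, controlling for all other variables)
    # come from the inverse of the correlation matrix.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off = ~np.eye(R.shape[0], dtype=bool)      # mask for off-diagonal cells
    r2 = (R[off] ** 2).sum()
    p2 = (partial[off] ** 2).sum()
    return r2 / (r2 + p2)

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for Bartlett's test."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)           # log of the determinant of R
    statistic = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) // 2
    return statistic, dof

# Hypothetical example: three variables, all pairwise correlations 0.6,
# measured on 200 respondents.
R = np.array([[1.0, 0.6, 0.6],
              [0.6, 1.0, 0.6],
              [0.6, 0.6, 1.0]])
n_respondents = 200

kmo_value = kmo(R)
statistic, dof = bartlett_sphericity(R, n_respondents)
print("KMO:", round(kmo_value, 3))
print("Bartlett chi-square:", round(statistic, 2), "with", dof, "degrees of freedom")
# The 5% critical value of chi-square with 3 degrees of freedom is 7.815,
# so a statistic far above it indicates the variables are correlated.
```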

Choosing the number of factors is an art as much as a science.

The usual practice is to run several alternative analyses. The collaborative judgment of researcher and analyst is important in generating a solution that provides a plausible explanation and interpretation of the factors.

One must achieve a balance between, on the one hand, having enough factors to explain the variation in the original data satisfactorily and, on the other, not having so many factors that little or no data reduction has been achieved. Look for at least 65-70% of variance explained with scale data, but 50% or more with binary data.

How big a sample is needed?

The larger the sample size, the more accurately we can estimate the correlations between questions and the more repeatable the analysis will be. A sample of 400 or more should provide a stable factor analysis. The minimum sample size is about 200.

Preferably use interval data, as the correlations are better estimated. (A 5- or 7-point Likert agree/disagree scale is actually ordinal data but is treated as interval.) Binary (yes/no) variables often have lower correlations.

Question

Factor - respondent vs response ?

Factor vs Cluster ?

- Summarises large amounts of data
- Easily identifies patterns that can be hard to find
- By basing factors on data patterns, the analysis rests on actual results, not on preconceptions or questionnaire issues
- Used in conjunction with MLR (multiple linear regression)

But....

- Not all variability in the data is usually accounted for in factor analysis
- Factors can be hard to interpret, as they represent many measures
- Factors depend on the data, and can differ for different sets of data

What it does

Identifies underlying families of parameters that are highly correlated, and each family represents a different factor. Helps reduce data from a large number of parameters to a small number of factors. Produces a set of independent variables to be used for further analysis.

Examples of application

- What are the main characteristics based on which consumers form brand images in their mindsets?
- What are the main service aspects retailers consider when evaluating overall satisfaction with the service provided by a supplier?
- Which 10 or 15 attributes should I finalize to be measured from a list of 30 attributes?

Requirements

- Type of scales: interval.
- Free association data (ordinal type) could be treated as binary interval data.
- Some binary nominal scales (involving opposites) could be treated as interval.
- Exclude variables with very low variance, if any.
- If a pair of variables has a very high correlation, keep one and exclude the other.

- Factor loading: the correlation between a factor and a standardized variable. A higher loading means the variable carries more weight in the factor.
- Use Promax rotation to get correlated factors, for exploratory understanding.
- Use Varimax rotation when uncorrelated factors are required, for use in regression and cluster analysis.
- Remove variables which load on multiple factors for cleaner solutions, especially for regression.
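The two screening requirements (dropping near-constant variables and one of each highly correlated pair) can be sketched as a small helper. Everything here is an assumption of the sketch: the function name, the thresholds of 0.1 for variance and 0.9 for correlation, and the simulated data. The slides do not give numeric cut-offs.

```python
import numpy as np

def screen_variables(X, names, min_variance=0.1, max_abs_corr=0.9):
    """Return the names of variables that pass both screening rules."""
    X = np.asarray(X, dtype=float)
    variances = X.var(axis=0, ddof=1)
    # Rule 1: drop variables with very low variance.
    keep = [j for j in range(X.shape[1]) if variances[j] >= min_variance]
    if len(keep) < 2:
        return [names[j] for j in keep]
    R = np.corrcoef(X[:, keep], rowvar=False)
    # Rule 2: walk through the variables in order and keep one only if it is
    # not too highly correlated with any variable already kept.
    selected = []
    for pos, j in enumerate(keep):
        if all(abs(R[pos, keep.index(i)]) <= max_abs_corr for i in selected):
            selected.append(j)
    return [names[j] for j in selected]

# Simulated example: q2 nearly duplicates q1, and q3 is almost constant.
rng = np.random.default_rng(2)
q1 = rng.normal(size=200)
q2 = q1 + 0.05 * rng.normal(size=200)
q3 = 4 + 0.01 * rng.normal(size=200)
q4 = rng.normal(size=200)
X = np.column_stack([q1, q2, q3, q4])

kept = screen_variables(X, ["q1", "q2", "q3", "q4"])
print(kept)
```

In the example the first variable of the correlated pair is kept simply because it comes first; in practice the choice between the two would be made on substantive grounds.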

Guidelines

Number of factors, based on:
- Total variance explained above 60%
- Eigenvalue above 1
- Number of variables divided by 3 or 4

KMO measure (0-1): should be close to 1. Below 0.5 implies factor analysis won't be useful.

Bartlett's test: the significance level should be very small, say below 0.05, and NOT above 0.10.

Analyze > Data Reduction > Factor

- Extraction > Method > Principal components (> Scree plot)
- Rotation > Method > Varimax
- Eigenvalues over 1, or number of factors = ?
- Descriptives > KMO and Bartlett's test of sphericity
- Scores > Save as variables > Method: Regression
- Options > Missing values > Exclude cases pairwise
