CFA AND SEM.pdf

Model fit refers to how well our proposed model (in this case, the model of the factor structure) accounts for the correlations between variables in the dataset. If we are accounting for all the major correlations inherent in the dataset (with regard to the variables in our model), then we will have good fit. If not, there is a significant discrepancy between the correlations proposed and the correlations observed, and thus we have poor model fit: our proposed model does not fit the observed or estimated model (i.e., the correlations in the dataset).

Metrics

There are specific measures that can be calculated to determine goodness of fit. The metrics that ought to be reported are listed below, along with their acceptable thresholds. Goodness of fit is inversely related to sample size and the number of variables in the model, so the thresholds below are only a guideline. For more contextualized thresholds, see Table 12-4 in Hair et al. (2010, p. 654).

Modification indices

Modification indices offer suggested remedies to discrepancies between the proposed and estimated model. In a CFA, there is not much we can do by way of adding regression lines to fix model fit, as all regression lines between latent and observed variables are already in place. Therefore, in a CFA, we look to the modification indices for the covariances. We cannot covary error terms with observed or latent variables, or with other error terms that are not part of the same factor. Thus, the only modification available to us is to covary error terms that are part of the same factor. The figure below illustrates this rule. In general, you want to address the largest modification indices before addressing more minor ones.
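The covariance rule above can be sketched in a few lines of Python. This is only an illustration: the function names, the error-term names (e1 to e5), the factor names, and the MI values are all hypothetical, and AMOS itself does not expose such a function.

```python
def can_covary(term_a, term_b, error_to_factor):
    """Return True only if both terms are error terms belonging to the same factor."""
    factor_a = error_to_factor.get(term_a)  # None if term_a is not an error term
    factor_b = error_to_factor.get(term_b)
    return term_a != term_b and factor_a is not None and factor_a == factor_b


def order_modifications(mod_indices, error_to_factor):
    """Keep only permissible covariances and sort them from largest MI to smallest."""
    allowed = [
        (pair, mi)
        for pair, mi in mod_indices.items()
        if can_covary(pair[0], pair[1], error_to_factor)
    ]
    return sorted(allowed, key=lambda item: item[1], reverse=True)


# Hypothetical measurement model: which factor each error term belongs to.
error_to_factor = {"e1": "Factor1", "e2": "Factor1", "e3": "Factor1",
                   "e4": "Factor2", "e5": "Factor2"}

# Hypothetical modification indices for suggested covariances.
mod_indices = {
    ("e1", "e2"): 24.6,       # same factor: allowed
    ("e1", "e4"): 31.2,       # different factors: not allowed
    ("e4", "e5"): 12.9,       # same factor: allowed
    ("e2", "Factor2"): 18.0,  # error term with a latent variable: not allowed
}

for pair, mi in order_modifications(mod_indices, error_to_factor):
    print(pair, mi)
```

The largest MI in this made-up output (e1 with e4) is skipped because the two error terms sit on different factors, which is exactly the case the rule excludes.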

Residuals

Residuals are much like modification indices; they point out where the discrepancies are between the proposed and estimated models. However, they also indicate whether or not those discrepancies are significant. A significant standardized residual is one with an absolute value greater than 2.58. Significant residuals significantly decrease your model fit. Fixing model fit per the residuals matrix is similar to fixing model fit per the modification indices; the same rules apply. For a more specific run-down of how to calculate and locate residuals, refer to the HOW TO CFA video tutorial.

Validity and reliability must be established when doing a CFA. If your factors do not demonstrate adequate validity and reliability, moving on to test a causal model will be useless - garbage in, garbage out! There are a few measures that are useful for establishing validity and reliability: Composite Reliability (CR), Average Variance Extracted (AVE), Maximum Shared Squared Variance (MSV), and Average Shared Squared Variance (ASV). The thresholds for these values are as follows:

Reliability

CR > 0.7

Convergent Validity

CR > AVE

AVE > 0.5

Discriminant Validity

MSV < AVE

ASV < AVE

If you have convergent validity issues, then your variables do not correlate well with each other

within their parent factor; i.e, the latent factor is not well explained by its observed variables. If

you have discriminant validity issues, then your variables correlate more highly with variables

outside their parent factor than with the variables within their parent factor; i.e., the latent factor

is better explained by some other variables (from a different factor), than by its own observed

variables.
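As a rough sketch of how these measures can be computed by hand, the Python below uses the commonly cited formulas based on standardized loadings: CR = (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances), AVE = mean of squared loadings, MSV = largest squared correlation with another factor, and ASV = mean squared correlation with the other factors. These formulas are not given in the text above and are an assumption here, as is the MSV < AVE check; the loadings, correlations, and factor names are made up for illustration.

```python
def composite_reliability(loadings):
    """CR from standardized loadings; error variance of each item is 1 - loading^2."""
    total = sum(loadings)
    error = sum(1 - l ** 2 for l in loadings)
    return total ** 2 / (total ** 2 + error)


def average_variance_extracted(loadings):
    """AVE is the mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def shared_variances(factor, correlations):
    """Return (MSV, ASV) from the factor's correlations with every other factor."""
    squared = [r ** 2 for pair, r in correlations.items() if factor in pair]
    return max(squared), sum(squared) / len(squared)


def validity_report(factor, loadings, correlations):
    """Apply the thresholds: CR > 0.7, CR > AVE, AVE > 0.5, MSV < AVE, ASV < AVE."""
    cr = composite_reliability(loadings)
    ave = average_variance_extracted(loadings)
    msv, asv = shared_variances(factor, correlations)
    return {
        "reliability": cr > 0.7,
        "convergent": cr > ave and ave > 0.5,
        "discriminant": msv < ave and asv < ave,
    }


# Hypothetical standardized loadings for F1 and hypothetical factor correlations.
f1_loadings = [0.80, 0.75, 0.70, 0.85]
factor_correlations = {("F1", "F2"): 0.45, ("F1", "F3"): 0.30, ("F2", "F3"): 0.50}

print(validity_report("F1", f1_loadings, factor_correlations))
```

With these made-up numbers F1 passes all three checks; lowering one loading to around 0.3 is a quick way to see the convergent validity check fail.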

If you need to cite these suggested thresholds, please use the following:

Hair, J., Black, W., Babin, B., and Anderson, R. (2010). Multivariate data analysis (7th ed.). Upper Saddle River, NJ, USA: Prentice-Hall, Inc.

Before creating composite variables for a path analysis, configural and metric invariance should be tested during the CFA to validate that the factor structure and loadings are sufficiently equivalent across groups. Otherwise, your composite variables will not be very useful (because they are not actually measuring the same underlying latent construct for both groups).

Configural

Configural invariance tests whether the factor structure represented in your CFA achieves adequate fit when both groups are tested together and freely (i.e., without any cross-group path constraints). To do this, simply build your measurement model as usual, create two groups in AMOS (e.g., male and female), and then split the data along gender. Next, attend to model fit as usual. If the resultant model achieves good fit, then you have configural invariance. If you don't pass the configural invariance test, then you may need to look at the modification indices to improve your model fit or to see how to restructure your CFA.

Metric

If we pass the test of configural invariance, then we need to test for metric invariance. To test for metric invariance, simply perform a chi-square difference test on the two groups just as you would for a structural model. The evaluation is the same as in the structural model invariance test: if you have a significant p-value for the chi-square difference test, then you have evidence of differences between groups. Otherwise, the groups are invariant and you may proceed to make your composites from this measurement model (but make sure you use the whole dataset when you create composites, instead of using the split dataset).
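The chi-square difference test described above can be worked through by hand once AMOS has reported the chi-square and degrees of freedom for the unconstrained and constrained models. The sketch below uses only the Python standard library; the function names, the 0.05 significance level, and the four input numbers are assumptions made for illustration, not output from any real model.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    # Closed-form starting point: df = 2 (even case) or df = 1 (odd case).
    if df % 2 == 0:
        q, a = math.exp(-h), 1.0
    else:
        q, a = math.erfc(math.sqrt(h)), 0.5
    # Recurrence Q(a + 1, h) = Q(a, h) + h^a * e^(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += math.exp(a * math.log(h) - h - math.lgamma(a + 1.0))
        a += 1.0
    return q


def chi_square_difference(chi2_unconstrained, df_unconstrained,
                          chi2_constrained, df_constrained, alpha=0.05):
    """Return (delta chi-square, delta df, p-value, invariant?)."""
    delta_chi2 = chi2_constrained - chi2_unconstrained
    delta_df = df_constrained - df_unconstrained
    if delta_df <= 0:
        raise ValueError("the constrained model must have more degrees of freedom")
    p_value = chi2_sf(delta_chi2, delta_df)
    # A non-significant difference means the groups are treated as invariant.
    return delta_chi2, delta_df, p_value, p_value >= alpha


# Hypothetical fit statistics for the two nested multi-group models.
delta_chi2, delta_df, p_value, invariant = chi_square_difference(
    chi2_unconstrained=250.3, df_unconstrained=120,
    chi2_constrained=262.1, df_constrained=128,
)
print(round(delta_chi2, 1), delta_df, round(p_value, 3), invariant)
```

In this made-up case the difference is not significant, so the two groups would be treated as invariant and composites could be built from the whole dataset.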

Contingency Plans

If you do not achieve invariant models, here are some appropriate approaches in the order I would attempt them.

1. Modification indices: Fit the model for each group using the unconstrained measurement model. You can toggle between groups when looking at modification indices. So, for example, for males there might be a high MI for the covariance between e1 and e2, but for females this might not be the case. Go ahead and add those covariances appropriately for both groups. When you add them to the model, AMOS adds them for both groups, even if you only needed them for one. If fitting the model this way does not solve your invariance issues, then you will need to look at differences in regression weights.

2. Regression weights: You need to figure out which item or items are causing the trouble (i.e., which ones do not measure the same across groups). The lack of invariance is most likely due to one of two things: the strength of the loading for one or more items differs significantly across groups, or an item or two load better on a factor other than their own for one or more groups. To address the first issue, just look at the standardized regression weights for each group to see if there are any major differences (just eyeball it). If you find a regression weight that is exceptionally different (for example, item2 on Factor 3 has a loading of 0.34 for males and 0.88 for females), then you may need to remove that item if possible. Retest and see if the invariance issues are solved. If not, try addressing the second issue (explained next).

3. Standardized Residual Covariances: To address the second issue, you need to analyze the standardized residual covariances.

Handling 2nd order factors in AMOS is not difficult, but it is tricky, and if you don't get it right, the model won't run. The pictures below offer a simple example of how you would model a 2nd order factor in a measurement model and in a structural model.

STRUCTURAL EQUATION MODELING

Structural equation modeling (SEM) grows out of and serves purposes similar to multiple regression, but in a more powerful way which takes into account the modeling of interactions, nonlinearities, correlated independents, measurement error, correlated error terms, multiple latent independents each measured by multiple indicators, and one or more latent dependents also each with multiple indicators. SEM may be used as a more powerful alternative to multiple regression, path analysis, factor analysis, time series analysis, and analysis of covariance. That is, these procedures may be seen as special cases of SEM, or, to put it another way, SEM is an extension of the general linear model (GLM) of which multiple regression is a part.

SEM is an umbrella concept for analyses such as mediation and moderation. This wiki page provides general instruction and guidance regarding how to write hypotheses for different types of SEMs, what to do with control variables, mediation, interaction, multi-group analyses, and model fit for structural models. Videos and slide presentations are provided in the subsections.

Hypotheses

Hypotheses are a keystone to causal theory. However, wording hypotheses is clearly a struggle for many researchers (just select at random any article from a good academic journal, and count the wording issues!). In this section we offer examples of how you might word different types of hypotheses. These examples are not exhaustive, but they are safe.

Direct effects

An increase in hours spent watching television will negatively affect weight loss.

Mediated effects

For mediated effects, be sure to indicate the direction of the mediation (positive or negative), the degree of the mediation (partial, full, or simply indirect), and the direction of the mediated relationship (positive or negative).

Exercise positively and partially mediates the positive relationship between diet and weight loss.

Television time positively and fully mediates the positive relationship between diet and weight loss.

Diet affects weight loss positively and indirectly through exercise.

Interaction effects

Exercise positively moderates the positive relationship between diet and weight loss.

Exercise amplifies the positive relationship between diet and weight loss.

TV time negatively moderates (dampens) the positive relationship between diet and weight loss.

Multi-group effects

Body Mass Index (BMI) moderates the relationship between exercise and weight loss, such that for those with a low BMI, the effect is negative (i.e., you gain weight - muscle mass), and for those with a high BMI, the effect is positive (i.e., exercising leads to weight loss).

Age moderates the relationship between exercise and weight loss, such that for age < 40, the positive effect is stronger than for age > 40.

Diet moderates the relationship between exercise and weight loss, such that for Western diets the effect is positive and weak, and for Eastern (Asian) diets the effect is positive and strong.

Handling controls

When including controls in hypotheses (yes, you should include them), simply add at the end of any hypothesis, "when controlling for [list control variables here]." For example:

Exercise positively moderates the positive relationship between diet and weight loss when controlling for TV time and age.

Diet has a positive effect on weight loss when controlling for TV time and age.

Supporting Hypotheses

Getting the wording right is only part of the battle, and is mostly useless if you cannot support your reasoning for WHY you think the relationships proposed in the hypotheses should exist. Simply saying "X has a positive effect on Y" is not sufficient to make a causal statement. You must then go on to explain the various reasons behind your hypothesized relationship. Take diet and weight loss for example. The hypothesis is, "Diet has a positive effect on weight loss". Weight is gained as we consume calories. Diet reduces the number of calories consumed. Therefore, the more we diet, the more weight we should lose.

Controls

Controls are potentially confounding variables that we need to account for, but that don't drive our theory. For example, in Dietz and Gortmaker 1985, the theory was that TV time had a negative effect on school performance. But there are many things that could affect school performance, possibly even more than the amount of time spent in front of the TV. So, in order to account for these other potentially confounding variables, the authors control for them. They are basically saying that regardless of IQ, time spent reading for pleasure, hours spent doing homework, or the amount of time parents spend reading to their child, an increase in TV time still significantly decreases school performance. These relationships are shown in the figure below.

As a cautionary note, you should nearly always include some controls; however, these control variables still count against your sample size calculations. So, the more controls you have, the higher your sample size needs to be. You also get a higher R-squared, but with increasingly smaller gains for each added control. Sometimes you may even find that adding a control drowns out all the effects of the IVs. In such a case you may need to run your tests without that control variable (but then you can only say that your IVs, though significant, account for only a small amount of the variance in the DV). With that in mind, you can't and shouldn't control for everything, and as always, your decision to include or exclude controls should be based on theory.

Handling controls in AMOS is easy, but messy (see the figure below). You simply treat them like the other exogenous variables (the ones that don't have arrows going into them), and draw regression paths from them to whichever endogenous variables they may logically affect. In this case, I have valShort, a potentially confounding variable, as a control on valLong, and I have LoyRepeat as a control on LoyLong. I've also covaried the controls with each other and with the other exogenous variables. When using controls in a moderated mediation analysis, go ahead and put the controls in at the very beginning.

When reporting the model, you do need to include the controls in all your tests and output, but you should consolidate them at the bottom where they can be out of the way. Also, just so you don't get any crazy ideas, you would not test for any mediation between a control and a dependent variable. However, you may report how the control affects a dependent variable differently based on a moderating variable. For example, valShort may have a stronger effect on valLong for males than for females. This is something that should be reported, but not necessarily focused on, as it is not likely a key part of your theory. Lastly, even if effects from controls are not significant, you do not need to trim them from your model (although there are other schools of thought on this issue).

Mediation

Concept

Mediation models are used to describe chains of causation. Mediation is often used to provide a more accurate explanation of the causal effect the antecedent has on the dependent variable. The mediator is usually the variable that is the missing link in a chain of causation. For example, intelligence leads to increased performance - but not in all cases, as not all intelligent people are high performers. Thus, some other variable is needed to explain the reason for the inconsistent relationship between IV and DV. This other variable is called a mediator. In this example, work effectiveness may be a good mediator. We would say that work effectiveness fully and positively mediates the relationship between intelligence and performance. Thus, the direct relationship between intelligence and performance is better explained through the mediator of work effectiveness. The logic is, even if you are intelligent, if you don't work smarter, then you won't perform well. However, intelligent people tend to work smarter (but not always).

Types

There are three main types of simple mediation: 1) partial, 2) full, and 3) indirect. Partial mediation means that both the direct and indirect effects from the IV to DV are significant. Full means that the direct effect drops out of significance when the mediator is added, and that the indirect effect is significant. Indirect means that the direct effect never was significant, but that the indirect effect is. The figure below illustrates these types of mediation. Please refer to the step by step guide listed above for determining significance of the mediation.
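The three definitions above amount to a small decision rule, sketched here in Python. The function name, the argument names, and the 0.05 significance level are assumptions for illustration; the p-values themselves would come from your own mediation test (for example, a bootstrapped indirect effect).

```python
def mediation_type(p_direct_without_mediator, p_direct_with_mediator,
                   p_indirect, alpha=0.05):
    """Classify simple mediation from three p-values.

    p_direct_without_mediator: IV -> DV effect before the mediator is added
    p_direct_with_mediator:    IV -> DV effect with the mediator in the model
    p_indirect:                IV -> mediator -> DV indirect effect
    """
    if p_indirect >= alpha:
        return "none"      # no significant indirect effect, so no mediation
    if p_direct_with_mediator < alpha:
        return "partial"   # direct and indirect effects are both significant
    if p_direct_without_mediator < alpha:
        return "full"      # direct effect dropped out once the mediator was added
    return "indirect"      # direct effect was never significant


# Hypothetical p-values for three different models.
print(mediation_type(0.001, 0.020, 0.004))  # partial
print(mediation_type(0.001, 0.310, 0.004))  # full
print(mediation_type(0.240, 0.310, 0.004))  # indirect
```

One case the three definitions do not name is a direct effect that only becomes significant after the mediator is added; this sketch labels it "partial", which is a simplification.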

Interaction

Concept

In factorial designs, interaction effects are the joint effects of two predictor variables in addition to the individual main effects. This is another form of moderation (along with multi-grouping); i.e., the X to Y relationship changes form (gets stronger, gets weaker, or changes sign) depending on the value of another explanatory variable (the moderator). So, for example:

you lose 1 pound of weight for every 500 calories you cut back from your regular diet

but when you exercise while dieting, then you lose 2 pounds for every 500 calories you cut back from your regular diet, in addition to the 1 pound you lose for exercising for one hour; thus in total, you lose three pounds

So, the multiplicative effect of exercising while dieting is greater than the additive effects of doing one or the other. Here is another simple example:

Chocolate is yummy

Cheese is yummy

but combining chocolate and cheese is yucky!
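The diet and exercise arithmetic above corresponds to a regression with a product term. The Python sketch below is only an illustration: the function name and the coefficient names are hypothetical, and the coefficients of 1 pound each are read off the example rather than estimated from data.

```python
def weight_loss(diet_units, exercise_hours,
                b_diet=1.0, b_exercise=1.0, b_interaction=1.0):
    """Pounds lost: two main effects plus a diet x exercise interaction term.

    diet_units:     number of 500-calorie reductions from the regular diet
    exercise_hours: hours of exercise
    """
    return (b_diet * diet_units
            + b_exercise * exercise_hours
            + b_interaction * diet_units * exercise_hours)


diet_only = weight_loss(1, 0)      # 1 pound from dieting alone
exercise_only = weight_loss(0, 1)  # 1 pound from exercising alone
both = weight_loss(1, 1)           # 3 pounds when the two are combined

# The combined effect exceeds the sum of the separate effects by the interaction term.
print(diet_only, exercise_only, both, both - (diet_only + exercise_only))
```

Setting b_interaction to a negative value reproduces the chocolate and cheese case, where two positive main effects combine into something worse than either alone.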

The following figure is an example of a simple interaction model.

Types

Interactions enable more precise explanation of causal effects by providing a method for explaining not only how X affects Y, but also under what circumstances the effect of X changes depending on the moderating variable Z. Interpreting interactions is somewhat tricky. Interactions should be plotted (as demonstrated in the HOW TO video). Once plotted, the interpretation can be made using the following four examples (in the figures below) as a guide.

Model fit again

You already did model fit in your CFA, but you need to do it again in your structural model in order to demonstrate sufficient exploration of alternative models. The method is the same: look at modification indices, residuals, and standard fit measures like CFI, RMSEA, etc. The one thing that should be noted here in particular, however, is the logic that should determine how you apply the modification indices to error terms.

If the correlated variables are not logically causally correlated, but merely statistically correlated, then you may covary the error terms in order to account for the systematic statistical correlations without implying a causal relationship.

o e.g., burnout from customers is highly correlated with burnout from management

o We expect these to have similar values (residuals) because they are logically similar and have similar structures, but they do not necessarily have any causal ties.

If the correlated variables are logically causally correlated, then simply add a regression line.

o e.g., burnout from customers is highly correlated with satisfaction with customers

o We expect burnC to predict satC, so not accounting for it is negligent

Lastly, remember, you don't need to create the BEST fit, just good fit. If a BEST fit model (i.e., one in which all modification indices are addressed) isn't logical, or does not fit with your theory, you may need to simply settle for a model that has worse (yet sufficient) fit, and then explain why you did not choose the better fitting model.

Multi-group

Multi-group moderation is a special form of moderation in which a dataset is split along values of a categorical variable (such as gender), and then a given model is tested with each set of data. Using the gender example, the model is tested for males and females separately. The use of multi-group moderation is to determine if relationships hypothesized in a model will differ based on the value of the moderator (e.g., gender). Take the diet and weight loss hypothesis for example. A multi-group moderation model would answer the question: does dieting affect weight loss differently for males than for females? In the videos above, you will learn how to set up a multi-group model in AMOS, and test it using chi-square differences, and using critical ratios. Really, using critical ratios takes about one minute after the model is set up, and it involves no room for human error, whereas using the chi-square method can take upwards of 30 minutes and it involves a lot of room for human error. So, I recommend the easy method!
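To give a feel for what a critical ratio for a difference between groups measures, here is a minimal Python sketch. It assumes the two groups are independent samples, so the estimates are uncorrelated and the ratio reduces to z = (b1 - b2) / sqrt(se1^2 + se2^2); AMOS computes its own critical ratios from the model, so treat this as an approximation for intuition only. The function name and the regression weights and standard errors are hypothetical.

```python
import math


def critical_ratio_for_difference(b1, se1, b2, se2):
    """z statistic and two-tailed p-value for the difference between two
    regression weights estimated in independent groups."""
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal p-value
    return z, p_value


# Hypothetical unstandardized weights for diet -> weight loss in each group.
z, p = critical_ratio_for_difference(b1=0.34, se1=0.10,   # males
                                     b2=0.88, se2=0.12)   # females
print(round(z, 2), p < 0.05)
```

An absolute z above 1.96 corresponds to a difference that is significant at the 0.05 level, which is the usual reading of a critical ratio.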

Many of the examples in the videos so far have taught concepts using a set of composite variables (instead of latent factors with observed items). Many will want to utilize the full power of SEM by building true structural models (with latent factors). This is not a difficult thing. Simply remove the covariance arrows from your measurement model (after CFA), then draw single-headed arrows from IVs to DVs. Make sure you put error terms on the DVs, then run it. It's that easy. Refer to the video for a demonstration.

Creating Composites from Latent Factors

If you would like to create composite variables (as used in many of the videos) from latent factors, it is an easy thing to do. However, you must remember two very important caveats:

You are not allowed to have any missing values in the data used. These will need to be imputed beforehand in SPSS or Excel (I have two tools for this in my Stats Tools Package - one for imputing, and one for simply removing the entire row that has missing data).

Latent factor names must not have any spaces or hard returns in them. They must be single continuous strings (FactorOne or Factor_One instead of Factor One).

After those two caveats are addressed, you can simply go to the Analyze menu and select Data Imputation. Select Regression Imputation, and then click on the Impute button. This will create a new SPSS dataset with the same name as the current dataset, except that it will be followed by an "_C". This can be found in the same folder as your current dataset.
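Both caveats can be checked before opening AMOS. The Python sketch below is a hypothetical pre-check, not part of AMOS or the Stats Tools Package: the function names are made up, the data are assumed to be a list of dictionaries (one per respondent), and the choice to treat None, empty strings, and NaN as missing is also an assumption.

```python
import math


def invalid_factor_names(names):
    """Return the names that are empty or contain spaces, tabs, or hard returns."""
    return [n for n in names if not n or any(ch.isspace() for ch in n)]


def is_missing(value):
    """Treat None, empty strings, and NaN as missing values."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)


def drop_incomplete_rows(rows):
    """Listwise deletion: keep only rows with no missing values."""
    return [row for row in rows if not any(is_missing(v) for v in row.values())]


# Hypothetical factor names and survey responses.
print(invalid_factor_names(["FactorOne", "Factor_One", "Factor One"]))

rows = [
    {"item1": 4, "item2": 5, "item3": 3},
    {"item1": 2, "item2": None, "item3": 4},   # missing item2: dropped
    {"item1": 5, "item2": 1, "item3": float("nan")},  # NaN item3: dropped
]
print(drop_incomplete_rows(rows))
```

Dropping rows is the simpler of the two options mentioned above; imputing the missing values instead keeps the sample size intact.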
