
LOWER BOUNDS ON SAMPLE SIZE IN STRUCTURAL EQUATION MODELING

Published in Electronic Commerce Research and Applications, forthcoming Dec 2010, PII: S1567-4223(10)00054-2, DOI: 10.1016/j.elerap.2010.07.003 (with software downloadable from Elsevier)

J. Christopher Westland
Professor, Information & Decision Sciences
University of Illinois, Chicago
601 S. Morgan Street, Chicago, IL 60607-7124
(312) 860-0587 email: westland@uic.edu

JULY 2010

ABSTRACT


Computationally intensive structural equation modeling (SEM) approaches have been in development over much of the 20th century, initiated by the seminal work of Sewall Wright. To this day, sample size requirements remain a vexing question in SEM-based studies. Complexities which increase information demands in structural model estimation increase with the number of potential combinations of latent variables, while the information supplied for estimation increases with the number of measured parameters times the number of observations in the sample; both are nonlinear. This alone would imply that requisite sample size is not a linear function solely of indicator count, even though such heuristics are widely invoked in justifying SEM sample size. This paper develops two lower bounds on sample size in SEM, the first as a function of the ratio of indicator variables to latent variables, and the second as a function of minimum effect, power and significance. The algorithm is applied to a meta-study of a set of research published in five of the top MIS journals. The study shows a systematic bias towards choosing sample sizes that are significantly too small. Actual sample sizes averaged only 50% of the minimum needed to draw the conclusions the studies claimed. Overall, 80% of the research articles in the meta-study drew conclusions from insufficient samples. Lacking accurate sample size information, researchers are inclined to economize on sample collection with inadequate samples that hurt the credibility of research conclusions. Guidelines are provided for applying the algorithms developed in this study, and companion software encapsulating the paper's formulae is made available for download. (261 words)

Keywords: Structural equation modeling, SEM, Partial least squares, PLS, LISREL, AMOS, sample size, Gini correlation, common factor bias, rule of 10

1. INTRODUCTION

The past two decades have seen a remarkable acceleration of interest in structural equation modeling (SEM) methods in management research, including partial least squares (PLS) and implementations of Jöreskog's SEM algorithms (LISREL, AMOS, EQS). The breadth of application of SEM methods has been expanding, with SEM increasingly applied to exploratory, confirmatory and predictive analysis with a variety of ad hoc topics and models. SEM is particularly useful in the social sciences where many if not most key concepts are not directly observable. Because many key concepts in the social sciences are inherently latent, questions of construct validity and methodological soundness take on a particular urgency.

To this day, methodologies for assessing suitable sample size requirements remain a vexing question in SEM-based studies. The number of degrees of freedom consuming information in structural model estimation increases with the number of potential combinations of latent variables, while the information supplied in estimating increases with the number of measured parameters (i.e., indicators) times the number of observations (i.e., the sample size); both are nonlinear in model parameters. This should imply that requisite sample size is not a linear function solely of indicator count, even though such heuristics are widely invoked in justifying SEM sample size. Monte Carlo simulation in this field has lent support to the nonlinearity of sample size requirements, though research to date has not yielded a sample size formula suitable for SEM. This paper proposes a set of necessary conditions (thus lower bounds) for SEM sample adequacy.

The exposition proceeds as follows. Section 2 describes the historical context, commenting on how particular research objectives and computational limitations resulted in our current SEM toolsets. Section 3 summarizes the prior literature on sample adequacy results from Monte Carlo simulations. Section 4 develops an algorithm for computing the minimum sample size needed to detect a minimum effect at given power and significance levels in the structural equation model. Section 5 applies these results to research articles whose conclusions rest on confirmatory SEM analyses, and assesses whether the sample sizes used are adequate.

2. PRIOR LITERATURE

SEM evolved in three different streams: (1) systems of equation regression methods developed mainly at the Cowles Commission; (2) iterative maximum likelihood algorithms for path analysis developed mainly at the University of Uppsala; and (3) iterative least squares fit algorithms for path analysis also developed at the University of Uppsala. Figure 1 provides a chronology of the pivotal developments in latent variable statistics in terms of method (pre-computer, computer-intensive and SEM) and objectives (exploratory/prediction or confirmation).

INSERT FIGURE 1: DEVELOPMENT OF STRUCTURAL EQUATION MODEL ESTIMATION

Both LISREL and PLS were conceived as iterative computer algorithms, with an emphasis from the start on creating an accessible graphical and data entry interface and extension of Wright's (1921) path analysis. Early Cowles Commission work on simultaneous equations estimation centered on Koopman and Hood's (1953) algorithms from the economics of transportation and optimal routing, with maximum likelihood estimation, and closed form algebraic calculations, as iterative solution search techniques were limited in the days before computers. Anderson and Rubin (1949, 1950) developed the limited information maximum likelihood estimator for the parameters of a single structural equation, which indirectly included the two-stage least squares estimator and its asymptotic distribution (Anderson 2005; Farebrother 1999). Two-stage least squares was originally proposed as a method of estimating the parameters of a single structural equation in a system of linear simultaneous equations, being introduced by Theil (1953a, 1953b, 1961) and more or less independently by Basmann (1957) and Sargan (1958). Anderson's limited information maximum likelihood estimation was eventually implemented in a computer search algorithm, where it competed with other iterative SEM algorithms. Of these, two-stage least squares was by far the most widely used method in the 1960s and the early 1970s.

LISREL and PLS path modeling approaches were championed at Cowles mainly by Nobelist Trygve Haavelmo (1943). Unfortunately, underlying assumptions of LISREL and PLS were challenged by economists such as Freedman (1987), who objected that their "failure to distinguish among causal assumptions, statistical implications, and policy claims has been one of the main reasons for the suspicion and confusion surrounding quantitative methods in the social sciences" (see also Wold's (1987) response). Haavelmo's path analysis never gained a large following among U.S. econometricians, but was successful in influencing a generation of Haavelmo's fellow Scandinavian statisticians, including Hermann Wold, Karl Jöreskog, and Claes Fornell. Fornell introduced LISREL and PLS techniques to many of his Michigan colleagues through influential papers in accounting (Fornell and Larcker 1981), and information systems (Davis, et al. 1989). Dhrymes (1971; Dhrymes, et al. 1974) provided evidence that PLS estimates asymptotically approached those of two-stage least squares with exactly identified equations. This point is more of academic importance than practical, because most empirical studies overidentify. But in one sense, all of the limited information methods (OLS excluded) yield similar results.

3. SAMPLE SIZE AND THE RATIO OF INDICATORS TO LATENT VARIABLES

Structural equation modeling in MIS has taken a casual attitude towards choice of sample size. Since the early 1990s, MIS researchers have alluded to an ad hoc rule of thumb requiring the choosing of 10 observations per indicator in setting a lower bound for the adequacy of sample sizes. Justifications for this rule of 10 appear in several frequently cited publications (Barclay, et al. 1995; Chin 1998; Chin and Newsted 1999; Kahai and Cooper 2003) though none of these researchers refers to the original articulation of the rule by Nunnally (1967), who suggested (without providing supporting evidence) that in SEM estimation "a good rule is to have at least ten times as many subjects as variables."

Within the MIS field, Goodhue, et al. (2006, 2007) studied the rule of 10 using Monte Carlo simulation to compare sample sizes of 40, 90, 150, and 200, along with varying effect sizes (large, medium, small and no effect) to determine the adequacy of this rule for a given significance and power of tests. They concluded that: "In fact, for simple [SEM] models with normally distributed data and relatively reliable measures, none of the three techniques have adequate power to detect small or medium effects at small sample sizes. These findings run counter to extant suggestions in MIS literature" (Goodhue, et al. 2006, p. 202b). This finding is not completely unexpected, as similar SEM rules of thumb have been investigated since Nunnally's (1967) proposal. The debate has evolved significantly since his 1967 publication.

The rule of 10 couches the sample size question in terms of the ratio of observations (sample points) to free parameters; for example, Bollen (1989) stated that "though I know of no hard and fast rule, a useful suggestion is to have at least several cases per free parameter" and Bentler (1989) suggested a 5:1 ratio of sample size to number of free parameters. But is this the right question? Typically their parameters were considered to be indicator variables in the model, but unlike the early path analysis, structural equation models today are typically estimated in their entirety, and the number of unique entries in the covariance matrix is $p(p+1)/2$ when $p$ is the number of indicators. It would be reasonable to assume that the sample size is proportional to $p(p+1)/2$ rather than $p$. Unfortunately, Monte Carlo studies conducted in the 1980s and 1990s showed that the problem is somewhat more subtle and complex than that, and sample size and estimator performance are generally uncorrelated with either $p(p+1)/2$ or $p$.

Difficulties arise because the $p$ indicator variables are used to estimate the $k$ latent variables (the unobserved variables of interest) in the SEM, and even though there may be $p(p+1)/2$ free parameters, these are not individually the focus of SEM estimation. Rather, free parameters are clustered around a much smaller set of latent variables which are the focus of the estimation (or alternatively, the correlations between these unobserved latent variables are the focus of estimation). Tanaka (1987) argued that sample size should be dependent on the number of estimated parameters (the latent variables and their correlations) rather than on the total number of indicators; a view mirrored in other discussions of minimum sample sizes (Browne and Cudeck 1989, 1993; Geweke and Singleton 1980; Gerbing and Anderson 1985). Velicer and Fava (1987, 1989, 1994) went further, after reviewing a variety of such recommendations in the literature, concluding that there was no support for rules positing a minimum sample size as a function of indicators. They showed that for a given sample size, convergence to proper solutions and goodness of fit were favorably influenced by: (1) a greater number of indicators per latent variable; and (2) a greater saturation (higher factor loadings).

Marsh and Bailey (1991) concluded that the ratio of indicators to latent variables rather than just the number of indicators, as suggested by the rule of 10, is a substantially better basis on which to calculate sample size, reiterating conclusions reached by Boomsma (1982) who suggested using a ratio $r$ of indicators to latent variables. Information input to the SEM estimation increases both with more indicators per latent variable, as well as with more sample observations. A series of studies (Ding, et al. 1995) found that the probability of rejecting true models at a significance level of 5% was close to 5% for $r = 2$ (where $r$ is the ratio of indicators to latent variables) but rose steadily as $r$ increased; for $r = 6$, rejection rates were 39% for sample size of 50; 22% for sample size of 100; 12% for sample size of 200; and 6% for sample size of 400.

Boomsma's (1982) simulations suggested that a ratio $r$ of indicators to latent variables of $r = 4$ would require a sample size of at least 100 for adequate analysis; and for $r = 2$ would require a sample size of at least 400. Marsh, et al. (1988, 1996, 1998) ran 35,000 Monte Carlo simulations on LISREL CFA analysis, yielding data that suggested that: $r = 3$ would require a sample size of at least 200; $r = 2$ would require a sample size of at least 400; $r = 12$ would require a sample size of at least 50. Consolidation and summarization of these results suggest sample sizes:

$$n \ge 50r^2 - 450r + 1100$$

where $r$ is the ratio of indicators to latent variables. Furthermore, Marsh, et al. (1996) recommend 6 to 10 indicators per latent variable, assuming 25-50% of the initial choices add no explanatory power, which they found to often be the case in their studies. They note that this is a substantially larger ratio than found in most SEM studies, which tend to limit themselves to 3-4 indicators per latent variable. It is possible that a sample size rule of ten observations per indicator may indeed bias researchers towards selecting smaller numbers of indicators per latent variable in order to control the cost of a study or the length of a survey instrument.
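
The consolidated bound can be evaluated directly. The sketch below is a minimal Python illustration, assuming the reading $n \ge 50r^2 - 450r + 1100$ of the formula above, with $r$ the number of indicators divided by the number of latent variables. The function names `ratio_lower_bound` and `rule_of_10` are hypothetical and are not taken from the paper's companion software.

```python
import math


def ratio_lower_bound(num_indicators: int, num_latents: int) -> int:
    """Lower bound on sample size from the indicator-to-latent ratio r.

    Assumed form: n >= 50*r**2 - 450*r + 1100, rounded up to a whole case.
    """
    if num_indicators <= 0 or num_latents <= 0:
        raise ValueError("counts must be positive")
    r = num_indicators / num_latents
    return math.ceil(50 * r ** 2 - 450 * r + 1100)


def rule_of_10(num_indicators: int) -> int:
    """The heuristic being criticized: ten observations per indicator."""
    return 10 * num_indicators


if __name__ == "__main__":
    # (indicators, latent variables) pairs giving r = 2, 3 and 4
    for indicators, latents in [(8, 4), (9, 3), (12, 3)]:
        print(indicators, latents,
              ratio_lower_bound(indicators, latents),
              rule_of_10(indicators))
```

Under this reading the quadratic reproduces the cited simulation values at $r = 2$, $3$ and $4$ (400, 200 and 100), and the bound falls as the ratio rises toward 4, whereas the rule of 10 grows linearly with the indicator count. The quadratic turns upward again beyond $r = 4.5$, so it does not reproduce the $r = 12$ result and is best read within the range of ratios it was fitted to.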

4. SAMPLE SIZE WITH PAIRED LATENT VARIABLES

This section develops an algorithm for computing the lower bound on sample size required to confirm or reject the existence of a minimum effect in an SEM at given significance and power levels. Where SEM studies are directed towards hypothesis testing for complex models, with some level of significance $\alpha$ and power $1 - \beta$, calculating the power requires first specifying the effect size $\delta$ you want to detect. Funding agencies, ethics boards and research review panels frequently request that a researcher perform a power analysis; the argument is that if a study is inadequately powered, there is no point in completing the research. Additionally, in the framework of SEM the assessment of power is affected by the variable information contained in social science data. Table 1 summarizes the notation used.

INSERT TABLE 1: NOTATION USED IN THE PAPER

DECONSTRUCTION

This research asks "What is the lower bound on sample size $n$ for confirmatory testing of SEM as a function of these design parameters?" We want to detect a minimum correlation (effect) $\delta$ in estimating $k$ latent (unobserved) variables, at significance and power levels $\alpha$, $1 - \beta$. In other words, devise an algorithm $n = f(k, \delta \mid \alpha, \beta)$.

In this section, we will adopt the standard targets for our required Type I and II errors under Neyman-Pearson hypothesis testing of $\alpha = .05$ and $\beta = .20$; but these requirements can be relaxed for a more general solution. Structural equation models are characterized here as a collection of pairs of canonically correlated latent variables, and adhere to the standard normality assumption on indicator variables. This leads naturally to a deconstruction of the SEM into an overlapping set of bivariate normal distributions. Make the assumption that an arbitrarily selected pair of latent variables, call them $X$ and $Y$, are bivariate normal with density function:

$$\phi(x, y \mid \mu_X, \mu_Y, \sigma_X, \sigma_Y, \rho)$$

and covariance structure

$$\Sigma = \begin{pmatrix} \sigma_X^2 & \rho\,\sigma_X\sigma_Y \\ \rho\,\sigma_X\sigma_Y & \sigma_Y^2 \end{pmatrix}$$

COMBINATORICS OF HYPOTHESIS TESTS ON LINKS, AND SIGNIFICANCE LEVEL

It is typical in the literature to predicate an SEM analysis with the caveat that one needs to make strong arguments for the complex models constructed from the unobserved, latent constructs tested with the particular SEM, in order to support the particular links that are included in the model. This is usually interpreted to mean that each proposed (and tested) link in the SEM needs to be supported with references to prior research, anecdotal evidence and so forth. This may simply mean the wholesale import of a preexisting model (e.g., the Technology Acceptance Model) based on the success of that model in other contexts, but not specifically building on the particular effects under investigation. Unfortunately, it is uncommon to see any discussion of the particular links (causal or otherwise) or combinations of links that are excluded (either implicitly or explicitly) from the SEM model. Ideally, there should also be similarly strong arguments made for the inapplicability of omitted links or omitted combinations of links.

We can formalize these observations by letting $m$ be the number of potential links between latent variables. Extend the individual link minimum sample size to a minimum sample size for the entire SEM, building up from pairs of latent variables by determining the number of possible combinations of the $m$ pairs, each with an effect that needs detection. Each effect can be dichotomized:

0:

1:

Our problem is to compute the number of distinct structural equation models that can exist in terms of the 0, 1 values of their links using combinatorial analysis.

INSERT FIGURE 2: AN EXAMPLE OF A STRUCTURAL EQUATION MODEL WITH SIX LATENT VARIABLES AND FIVE CORRELATIONS

INSERT FIGURE 3: THE SEM EXAMPLE IN FIGURE 2 WITH ALL POSSIBLE PAIRED LINKS SHOWN

Then each combination of $\{0, 1\}$ values for links, which our tests of the SEM on the whole require us to discriminate amongst, provides us a set of $2^m$ binary numbers (see Figures 2 and 3), where $m$ is the number of potential links, each representing a unique combination of latent variables. The unique model hypothesized in any particular study will be some model (binary number) which is exactly one out of the possible $2^m$ ways of connecting these latent variables; testing must discriminate this path from the possible $2^m - 1$ other paths which collectively define the alternative hypothesis.
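
The counting argument can be made concrete with a short Python sketch. It assumes, as the caption of Figure 3 suggests, that the potential links are all unordered pairs of latent variables; the function names and the latent variable labels are illustrative only.

```python
from itertools import combinations, product


def potential_links(latent_names):
    """All unordered pairs of latent variables (the assumed set of potential links)."""
    return list(combinations(latent_names, 2))


def count_models(num_links):
    """Number of distinct 0/1 assignments over the links."""
    return 2 ** num_links


def enumerate_models(links):
    """Yield every model as a dict mapping each link to 0 (absent) or 1 (present)."""
    for bits in product((0, 1), repeat=len(links)):
        yield dict(zip(links, bits))


if __name__ == "__main__":
    links = potential_links(["A", "B", "C", "D", "E", "F"])
    total = count_models(len(links))
    # number of links, number of candidate models, number of alternatives
    print(len(links), total, total - 1)
```

With six latent variables there are 15 potential links and therefore $2^{15} = 32{,}768$ candidate models; the hypothesized model is one of these and the remaining 32,767 make up the alternative.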

For hypothesis testing with a significance of $\alpha$ (which we have by default set to .05) on each link, it is necessary to correct for effective significance level $\alpha^*$ in differentiating one possible model from all other hypothesized structural equation models that are possible. The Šidák correction is a commonly used alternative for the Bonferroni correction where an experimenter is testing a set of $h$ hypotheses with a dataset controlling the familywise error rate. In the context of the current research the Šidák correction provides the most accurate results. For the following analysis, a Šidák correction gives $\alpha^* = 1 - (1 - \alpha)^{1/h}$ where the power of the test can be held at $1 - \beta = .8$ over the entire SEM with no modification.
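
For comparison, the two corrections can be computed side by side. This is a generic sketch of the standard Šidák and Bonferroni formulas for a family of independent tests; the argument `num_tests` is a placeholder for whatever count of hypotheses the analysis requires and is not a value taken from the paper.

```python
def sidak_alpha(alpha: float, num_tests: int) -> float:
    """Per-test significance level under the Sidak correction."""
    if not 0.0 < alpha < 1.0 or num_tests < 1:
        raise ValueError("need 0 < alpha < 1 and num_tests >= 1")
    return 1.0 - (1.0 - alpha) ** (1.0 / num_tests)


def bonferroni_alpha(alpha: float, num_tests: int) -> float:
    """Per-test significance level under the Bonferroni correction."""
    if not 0.0 < alpha < 1.0 or num_tests < 1:
        raise ValueError("need 0 < alpha < 1 and num_tests >= 1")
    return alpha / num_tests


def familywise_error(per_test_alpha: float, num_tests: int) -> float:
    """Chance of at least one false rejection across independent tests."""
    return 1.0 - (1.0 - per_test_alpha) ** num_tests


if __name__ == "__main__":
    for m in (1, 5, 15):
        print(m, sidak_alpha(0.05, m), bonferroni_alpha(0.05, m))
```

For independent tests the Šidák level recovers the target familywise rate exactly, while the Bonferroni level is slightly stricter, which is one way to read the remark that Šidák is the more accurate of the two.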

MINIMUM EFFECT SIZE

Minimum effect, in the context of structural equation models, is the smallest correlation between latent variables that we wish to be able to detect with our sample and model. Small effects are more difficult to detect than large effects as they require more information to be collected. Information may be added to the analysis by collecting more sample observations, by adding parameters, and by constructing a better model.

INSERT FIGURE 4: SIGNIFICANCE AND POWER FOR THE MINIMUM EFFECT THAT NEEDS TO BE DETECTED

Sample size for hypothesis testing is typically determined from a critical value (see Figure 4) that defines the boundary between the rejection (set by $\alpha$) and non-rejection (set by $\beta$) regions. The minimum sample size that can differentiate between $H_0$ and $H_1$ occurs where the critical value is exactly the same under the null and alternative hypotheses. The approach to computing sample size here is analogous to standard univariate calculations (Cochran 1977; Kish 1955; Lohr 1999; Snedecor and Cochran 1989; Westland and Seeto 2007) but using a formulation for variance customized to this problem.

In the context of structural equation models, canonical correlation between latent variables should be seen simply as correlation, the "canonical" qualifier referring to the particulars of its calculation in SEM since the latent variables are unobserved, and thus cannot be directly measured. Correlation is interpreted as the strength of statistical relationship between two random variables obeying a joint probability distribution (Kendall and Gibbons 1990) like a bivariate normal. Several methods exist to compute correlation: Pearson's product moment correlation coefficient (Fisher 1921, 1990), Spearman's rho and Kendall's tau (Kendall and Gibbons 1990) are perhaps the most widely used (Mari and Kotz 2001). Besides these three classical correlation coefficients, various estimators based on M-estimation (Shevlyakov and Vilchevski 2002) and order statistics (Schechtman and Yitzhaki 1987) have been proposed in the literature. Strengths and weaknesses of various correlation coefficients must be considered in decision making. The Pearson coefficient, which utilizes all the information contained in the variates, is optimal when measuring the correlation between bivariate normal variables (Stuart and Ord 1991). However, it can perform poorly when the data is attenuated by nonlinear transformations. The two rank correlation coefficients, Spearman's rho and Kendall's tau, are not as efficient as the Pearson correlation under the bivariate normal model; nevertheless they are invariant under increasing monotone transformations, and thus are often considered robust alternatives to the Pearson coefficient when the data deviates from the bivariate normal model. Despite their robustness and stability in non-normal cases, the M-estimator based correlation coefficients suffer great losses (up to 63% according to Xu, et al. 2010) of asymptotic relative efficiency to the Pearson coefficient for normal samples, and such heavy loss of efficiency might not be compensated by their robustness in practice. Schechtman and Yitzhaki (1987) proposed a correlation coefficient based on order statistics for the bivariate distribution which they call Gini correlation (because it is related to Gini's mean difference in a way that is similar to the relationship between Pearson correlation coefficient and the variance).

INSERT FIGURE 5: BIVARIATE NORMAL SCATTERPLOTS FOR $X$ AND $Y$

As a measure of such strength, correlation should be large and positive if there is a high probability that large or small values of one variable occur (respectively) in conjunction with large or small values of another; and it should be large and negative if the direction is reversed (Gibbons and Chakraborti 1992). Figure 5 provides a rug plot of bivariate normal scatterplots generated by the R mvtnorm package that provide a visual description of the clustering and behavior of particular values of correlation $\rho$ between the latent variables.
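
Scatterplot data of this kind can be generated without the R package. The following Python sketch, using only the standard library, draws standard bivariate normal pairs with a chosen correlation by mixing two independent normal draws. It is an illustrative stand-in and not the code used to produce Figure 5; the function names are hypothetical.

```python
import math
import random


def sample_bivariate_normal(n, rho, seed=0):
    """Draw n pairs (x, y) from a standard bivariate normal with correlation rho."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        # y = rho*z1 + sqrt(1 - rho^2)*z2 has unit variance and corr(x, y) = rho
        pairs.append((z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    return pairs


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    for rho in (0.1, 0.24, 0.37):
        print(rho, round(pearson(sample_bivariate_normal(20000, rho, seed=1)), 3))
```

Plotting the pairs for several values of `rho` gives a quick visual sense of how diffuse a cloud with a modest correlation still looks.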

We will use a standard definition of minimum effect size to be detected: the strength of the relationship between two variables in a statistical population as measured by the correlation $\rho$ for paired latent variables, following conventions articulated in Wilkinson (1999); Nakagawa, et al. (2007) and Brand, et al. (2008). Where we are assessing completed research, we can substitute for $\delta$ the smallest correlation (effect size) on all of the links between latent variables in the SEM. Cohen (1988, 1992) provides the following guidelines for the social sciences: small effect size, $|\rho|$ = 0.1 to 0.23; medium, $|\rho|$ = 0.24 to 0.36; large, $|\rho|$ = 0.37 or larger. Figure 5 gives us a feel for Cohen's recommendations: $|\rho|$ = 0.37 still has a great deal of dispersion, and we might find it difficult to visually determine correlation merely by looking at a scatterplot where the variables on the two axes have correlation $|\rho|$ = 0.37.
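
These cutoffs are simple to encode. The sketch below assumes the lower limit of each band (0.10, 0.24, 0.37) acts as the threshold and labels anything under 0.10 as below the small band; both choices are assumptions made for illustration, since the guidelines as quoted leave the gaps between bands unspecified.

```python
def cohen_label(rho: float) -> str:
    """Classify a correlation by the effect-size bands quoted above.

    Thresholds are the lower edges of each band (an assumption).
    """
    if not -1.0 <= rho <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    magnitude = abs(rho)
    if magnitude >= 0.37:
        return "large"
    if magnitude >= 0.24:
        return "medium"
    if magnitude >= 0.10:
        return "small"
    return "below small"


if __name__ == "__main__":
    for value in (0.05, 0.1, -0.3, 0.37):
        print(value, cohen_label(value))
```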

ESTIMATOR FOR CORRELATION IN A BIVARIATE NORMAL DISTRIBUTION

Let $(X_i, Y_i)$, $i = 1, 2, \ldots, n$, be a random sample of independent and identically distributed (i.i.d.) data pairs of size $n$ from the bivariate normal population of $(X, Y)$ with continuous joint cumulative distribution function. Let $X_{(1:n)} \le X_{(2:n)} \le \cdots \le X_{(n:n)}$ be the order statistics (where the first subscript is the rank, and the second the sample size) of the $X$ sample values; let $Y_{(1:n)} \le Y_{(2:n)} \le \cdots \le Y_{(n:n)}$ be the order statistics of the $Y$ sample values; and let $X_{[i:n]}$ be the $X$ sample value associated with the $Y_{(i:n)}$ sample value in the sample pairs $(X_i, Y_i)$. $X_{[i:n]}$ is called the concomitant of the $i$th order statistic (Balakrishnan and Rao 1998). Reversing the roles of $X$ and $Y$, we can also obtain the associated $Y_{[i:n]}$.
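Extending the work of Schechtman and Yitzhaki (1987), Xu, et al. (2010) show that the two Gini correlations with respect to $(X, Y)$ are

$$r_G(X,Y) = \frac{\dfrac{1}{n(n-1)}\sum_{i=1}^{n}(2i-1-n)\,X_{[i:n]}}{\dfrac{1}{n(n-1)}\sum_{i=1}^{n}(2i-1-n)\,X_{(i:n)}}$$

and

$$r_G(Y,X) = \frac{\dfrac{1}{n(n-1)}\sum_{i=1}^{n}(2i-1-n)\,Y_{[i:n]}}{\dfrac{1}{n(n-1)}\sum_{i=1}^{n}(2i-1-n)\,Y_{(i:n)}}$$

where the subscript $(i:n)$ denotes the $i$th order statistic and $[i:n]$ its concomitant. In general $r_G(\cdot,\cdot)$ is not symmetric; that is, $r_G(X,Y) \ne r_G(Y,X)$. Such asymmetry violates the axioms of correlation measurement (Gibbons and Chakraborti 1992; Mari and Kotz 2001) which is assumed in SEM estimation. Xu, et al. (2010) provide a symmetrical estimator (which we use here) obtained from their linear combination:

$$r_g(X,Y) = \tfrac{1}{2}\left[\,r_G(X,Y) + r_G(Y,X)\,\right]$$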
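
The estimator is straightforward to compute from paired data. The Python sketch below assumes the rank-weighted form with weights $2i - 1 - n$ applied to concomitants in the numerator and to order statistics in the denominator (the common factor $1/(n(n-1))$ cancels), and it assumes the symmetric version is the equal-weight average of the two one-sided values. The function names are hypothetical, and ties are broken by sort order.

```python
def gini_correlation(x, y):
    """One-sided Gini correlation r_G(X, Y) from paired samples x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must be paired samples of length >= 2")
    n = len(x)
    weights = [2 * i - 1 - n for i in range(1, n + 1)]
    # Concomitants: x-values listed in the order that sorts their paired y-values.
    concomitants = [xi for _, xi in sorted(zip(y, x), key=lambda pair: pair[0])]
    order_stats = sorted(x)
    numerator = sum(w * v for w, v in zip(weights, concomitants))
    denominator = sum(w * v for w, v in zip(weights, order_stats))
    if denominator == 0:
        raise ValueError("x has no variation")
    return numerator / denominator


def symmetric_gini(x, y):
    """Equal-weight average of the two one-sided Gini correlations (assumed form)."""
    return 0.5 * (gini_correlation(x, y) + gini_correlation(y, x))


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [1.0, 3.0, 2.0, 5.0, 4.0]
    print(gini_correlation(xs, ys), gini_correlation(ys, xs), symmetric_gini(xs, ys))
```

Because the numerator depends on the second variable only through its ranks, applying a strictly increasing transformation to `y` leaves `gini_correlation(x, y)` unchanged, which is a quick way to check an implementation against the invariance property.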

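Gini correlation $r_G$ possesses the following general properties (Schechtman and Yitzhaki 1987):

1) $r_G \in [-1, 1]$

2) $r_G(X,Y) = r_G(Y,X) = 1\ (-1)$ if $Y$ is a monotone increasing (decreasing) function of $X$

3) $r_G(X,Y)$ is asymptotically unbiased and the expectations of $r_G(X,Y)$ and $r_G(Y,X)$ are zero when $X$ is independent of $Y$

4) $r_G(X,Y) = -r_G(-X,Y) = -r_G(X,-Y) = r_G(-X,-Y)$ for both $r_G(X,Y)$ and $r_G(Y,X)$

5) $r_G(X,Y)$ is invariant under all strictly monotone transformations of $Y$

6) $r_G(X,Y)$ is scale and shift invariant with respect to both $X$ and $Y$

7) $\sqrt{n}\,(r_G - \rho_G) \xrightarrow{d} N(0, \sigma^2)$, with $\rho_G$ the population value; i.e., converges in distribution to a normal distribution with mean zero and variance $\sigma^2$ (This is from Schechtman and Yitzhaki (1987) applying methods developed by Hoeffding (1948))

8) The Spearman rho measure of correlation is a special case of $r_G(X,Y)$; Xu, et al. (2010).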

Xu, et al. (2010) showed that Gini correlations are asymptotically normal with the following mean and variance:

INSERT EQUATION: ASYMPTOTIC MEAN AND VARIANCE OF THE GINI CORRELATION (XU, ET AL. 2010)

Xu, et al. (2010) used Monte Carlo simulations to verify these asymptotic results (using asymptotic relative efficiency and root mean square error performance metrics), showing that they are applicable for data of even relatively small sample sizes (down to around 30 sample points). Their simulations confirmed and extended He and Nagaraja's (2009) Monte Carlo simulations exploring the behavior of nine distinct correlation estimators of the bivariate normal correlation coefficient, including the Gini estimator $r_g$, the sample correlation for the bivariate normal, and estimators based on order statistics. The estimator $r_g$ was found generally to reduce bias and improve efficiency as well as or better than other correlation estimators in the study. Xu, et al. (2010) also compared $r_g$ with three other closely related correlation coefficients: (1) classical Pearson's product moment correlation coefficient, (2) Spearman's rho, and (3) order statistics correlation coefficients. Gini correlation bridges the gap between the order statistics correlation coefficient and Spearman's rho, and its estimators are more mathematically tractable than Spearman's rho, whose variance involves complex elliptic integrals that cannot be expressed in elementary functions. Their efficiency analysis showed that the loss of efficiency of estimator $r_g$ is between 4.5% and 11.3%, much less than that of Spearman's rho, which ranges from 8.8% to 30.5%.

CALCULATION OF SAMPLE SIZE ON A SINGLE LINK

1
convergenceimpliesthatfortheremainingterms gotozerofasterthan ; 0

Construct a hypothesis test to just detect the minimum effect size $\delta$:

: 0

The one-sample, two-sided formulation (see Figure 4) that reconciles the null and alternative hypothesis tests for the Gini estimator $r_g$ is

Xu,etal.(2010)showthat | | 0 quickly:for 30 fromabivariatenormalpopulationtheyshowthatwecanassume | | 0. Similarly,for 30 wecanassume

that areadequateapproximationsfor intheformula. Evenundertheveryweakassumptionsoftheruleof10asampleof 30 impliesamodelofat


mostthreevariablessignificantlysimplerthanthemajorityofpublishedmodels. Rearrangingtoplacealltermswith onthelefthandside:

Thustowithinlittle andusingtheformulafor

1 1 2 1 1
, 1 2
1 6 1 4

Wewanttorestatethisassomefunctionthatcalculatessamplesize , . Solvefor bysimplifyingintermsof:


Then arethesolutionsforthequadraticequationthatrestates , 0:

2 2
6 6 0

Orintermsof , , , andtakingthelargestroot

1
4 2 2
2 6 6 6

5. META-STUDY AND DISCUSSION

This research constructed two necessary conditions for sample adequacy:

1. Section 3 determined the sample size needed to compensate for the ratio of number of indicator variables to latent variables (summarized from Monte Carlo simulations that have appeared in the literature); and

2. Section 4 determined the sample size required to assure the existence or non-existence of a minimum effect (correlation) on each possible pair of latent variables in the SEM (determined analytically).

Of course, neither of these conditions is sufficient to assure sample adequacy for a particular choice of design parameters because there are so many other factors that can affect estimation and sample size: multicollinearity, appropriateness of datasets, and so forth. Additionally, the information contained in the sample and indicator variables must be adequate to compensate for variations in particular SEM estimation methodologies. For example, partial least squares (PLS) approaches generate parameter estimates that lack consistency. Dhrymes (1970); Schneeweiß (1990, 1991, 1993); Thomas, et al. (2005); and Fhr (1989) all demonstrate that the IV/2SLS techniques converge to the same estimators, but are more robust. Jöreskog (1967, 1970; Jöreskog and Sörbom 1996) suggests that departures from normal distribution for the indicators will demand larger samples, and that non-normal indicators require samples one, two or three orders of magnitude larger, depending on distribution.

From a practical viewpoint, sample size questions can take three forms:

1. A priori: will ask what sample size will be sufficient given the researcher's prior beliefs on what the minimum effect is that the tests will need to detect.

2. Ex posteriori: will ask what sample size should have been taken in order to detect the minimum effect that the researcher actually detected in an existing (either sufficient or insufficient) test. If the ex posteriori measured effect is smaller than the researcher's prior beliefs about the minimum effect (in 1.) then sample size needs to be increased commensurately.

3. Sequential test optimal stopping: is couched in a sequential test optimal stopping context, where the sample size is incremented until it is considered sufficient to stop testing.

In this section, we report on an ex posteriori meta-study that applies the algorithms developed in this paper to a specific body of SEM research studies published in five core journals in MIS and e-Commerce (ISR, MISQ, Management Science, Decision Sciences and JMIS) between 1989 (the date of the seminal study by Davis, et al. 1989) and 2007. We assumed that the link with the smallest effect actually observed in these studies determines $\delta$; this is a conservative assumption, because the researchers would have been very likely to hold a bias in actually wanting to detect even smaller effects than those actually observed, but the model and data would have only had sufficient resolution to capture the minimum effect observed.

Additionally, many of the studies listed in Appendix A analyzed Likert scale data that is not distributed normally; nevertheless, the assumption of normality of data is a common one in SEM studies, even where the data is clearly not normal, for example where survey data returns discrete Likert scale data censored at 0, and derives from a mass function which is likely to be skewed. Because estimator behavior is best understood for normal data, we can assume that, in these non-normal data studies, our lower bound on sample size needs a non-normality risk premium for sample adequacy; departures from a normal weight matrix in LISREL suggest that this may be two to three orders of magnitude larger than the sample size required for normal data.

Sample sizes actually used in drawing conclusions in the study were compared with our computed lower bound, and a difference taken as a percentage (the far right-hand column of Appendix A). Histograms of sample adequacy show a significant systematic bias towards too small a sample size in the papers surveyed. In the meta-study, the average sample was 770% too small; with the removal of three outliers, this dropped to 400% too small (Figures 6 and 7). Actual sample sizes in these 74 research articles were on average only 50% of the minimum needed to draw the conclusions the studies claimed; median sample size was 38% of the minimum required, reflecting a substantial negative skewing in the undersampling, and standard deviation was 29%. Overall, 80% of the research articles in this meta-study drew conclusions from samples that were smaller than the lower bounds on sample size computed here. Because each additional observation increases the cost of the study in time, effort and monetary terms, an inclination to economize on data collection is understandable. The conclusion that seems most appropriate from our meta-study is that MIS researchers have been given inadequate guidance, and have not been well served by existing sample size heuristics. Lacking the sample size information they need, researchers may be inclined to skimp on sample collection. Unfortunately, when samples are too large, the studies are more costly than they need to be in drawing particular conclusions; when samples are too small, the credibility of their conclusions is weakened.
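
The two summary statistics quoted here (a percentage "too small" and actual size as a share of the minimum) can be read as two views of the same comparison. The Python sketch below shows one plausible reading. The exact definition used for the Appendix A column is not stated in this section, so both formulas are assumptions, the function names are hypothetical, and the numbers in the example are made up for illustration.

```python
def combined_lower_bound(*bounds):
    """Both conditions are necessary, so the binding bound is the larger one."""
    if not bounds:
        raise ValueError("at least one bound is required")
    return max(bounds)


def share_of_minimum(actual_n, lower_bound):
    """Actual sample size as a fraction of the computed lower bound."""
    return actual_n / lower_bound


def percent_too_small(actual_n, lower_bound):
    """Shortfall relative to the actual sample, in percent (assumed definition)."""
    return 100.0 * (lower_bound - actual_n) / actual_n


if __name__ == "__main__":
    # Hypothetical study: 100 cases collected, bounds of 200 and 150 computed.
    bound = combined_lower_bound(200, 150)
    print(bound, share_of_minimum(100, bound), percent_too_small(100, bound))
```

Under this reading a study at half the required size is 100% too small, and a very large average shortfall can coexist with an average share near 50% when a few studies fall far below their bound.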

INSERT FIGURE 6: PERCENT ERROR IN SAMPLE SIZE FOR 74 STUDIES IN THE ENTIRE META-STUDY (MEAN = 770, STANDARD DEVIATION = 25, SKEWNESS = 6.5, KURTOSIS = 47)

INSERT FIGURE 7: PERCENT ERROR IN SAMPLE SIZE FOR 74 STUDIES IN THE META-STUDY REMOVING OUTLIERS (MEAN = 400, STANDARD DEVIATION = 642, SKEWNESS = 2.5, KURTOSIS = 7.6)

We should not be surprised, given our review of the prior literature, that existing sample size heuristics are misleading researchers in this area. Numerous studies have concluded that linear heuristics like the rule of 10 are poor guides to fit and explanatory power of the model or adequacy of the sample size (Browne and Cudeck 1989, 1993; Geweke and Singleton 1980; Gerbing and Anderson 1985; Velicer and Fava 1987, 1989, 1994; Marsh and Bailey 1991; Boomsma 1982; Ding, et al. 1995).

As noted earlier, neither of the conditions developed here is sufficient to assure sample adequacy for a particular choice of design parameters because there are so many factors that can affect estimation and sample size in something as complex as a structural equation model. Consequently, the necessary sample size for accurate estimation will in most cases exceed the lower bound computed here. But review of actual sample sizes summarized in Figures 6 and 7 suggests that, at its most unambitious, this lower bound will insure against the very erratic under-sizing of samples that seems common in SEM analysis.


FutureresearchonsamplesizechoiceshouldbeconductedonlinesspecifictothevariousalgorithmsusedtoestimateSEMPLSsprincipalcomponentsanalysisalgorithms;LISRELand
AMOSsgradientsearchalgorithms;andsystemsofequationsregressionalgorithms. Indeed,seminalresearchineachoftheseareasalludedtothisdecadesago. Wold(1980,1981)went
evenfurtherinadvisingthatPLSismoresuitableforexploratorymodelspecificationsearchesratherthanhypothesestesting,andintroducedtheconceptofplausiblecausalityforthatvery
reason. ThusinPLS,thesamplesizequestionisprobablybothlessrelevantandlesscritical,becausehypothesistestingisbetterlefttoLISRELandsystemsofequationapproaches.

The problem in building the structural model completely on theory, without reference to the data, is that the latent constructs chosen by the researcher may be substantially different from those that would drop out of an exploratory factor analysis. Researchers have developed a test for this called Harman's one-factor test (Podsakoff and Organ 1986), commonly used to check for common factor bias in SEM (and often conducted a posteriori). Common factor bias appears because inherent clustering results from a particular distance measure used to position data points in n-dimensional space; for example, principal components analysis designs a distance measure to minimize the variance not explained by the main components (clusters). But SEM will impose prior beliefs on the data, in the form of the structure of latent variables. Thus data are assumed to cluster around the latent constructs; the factor loadings determine how this clustering occurs. SEM models are often constructed without reference to clustering in the underlying data given a particular distance measure; their construction is entirely theory driven, though this is not in itself a bad thing. Common factor bias reflects this divergence between the model and the data, and if it is too extreme, may indicate that the data are incomplete, or that the model is misspecified.
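
As an illustration of the kind of check a one-factor test performs, the following Python sketch computes the share of total variance captured by the first unrotated principal component of the indicator correlation matrix. The simulated loadings, the helper names (`first_factor_share`, `simulate_indicators`) and the 50% flagging threshold are illustrative assumptions; they are not values taken from this paper or from Podsakoff and Organ (1986).

```python
import numpy as np


def first_factor_share(data: np.ndarray) -> float:
    """Share of total variance captured by the first unrotated principal component."""
    corr = np.corrcoef(data, rowvar=False)
    eigenvalues = np.linalg.eigvalsh(corr)  # returned in ascending order
    return float(eigenvalues[-1] / eigenvalues.sum())


def simulate_indicators(loadings: np.ndarray, n_obs: int, rng) -> np.ndarray:
    """Simulate unit-variance indicators from independent standard Normal factors."""
    n_indicators, n_factors = loadings.shape
    factors = rng.standard_normal((n_obs, n_factors))
    unique_sd = np.sqrt(1.0 - (loadings ** 2).sum(axis=1))
    noise = rng.standard_normal((n_obs, n_indicators)) * unique_sd
    return factors @ loadings.T + noise


rng = np.random.default_rng(0)

# Hypothetical case 1: six indicators all driven by a single common factor.
one_factor = np.full((6, 1), 0.8)

# Hypothetical case 2: two independent factors with three indicators each.
two_factor = np.zeros((6, 2))
two_factor[:3, 0] = 0.8
two_factor[3:, 1] = 0.8

THRESHOLD = 0.5  # assumed rule of thumb: flag if one factor explains most of the variance

share_one = first_factor_share(simulate_indicators(one_factor, 1000, rng))
share_two = first_factor_share(simulate_indicators(two_factor, 1000, rng))

for label, share in [("one common factor", share_one), ("two factors", share_two)]:
    flagged = share > THRESHOLD
    print(f"{label}: first-factor share = {share:.2f}, flagged = {flagged}")
```

In this sketch a flagged data set is one whose indicators mostly carry information about a single dimension, which is the situation in which a multi-construct SEM would diverge from the data.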

Common factor bias can be avoided a priori through a pretest of the clustering of indicator data. Common factor bias occurs because procedures that should be a standard part of model specification are in practice left until after the data collection and confirmatory analysis. Jöreskog developed PRELIS for these sorts of pretests and model respecifications. If this clustering shows that the indicators are providing information on fewer variables than the researcher's latent SEM contains, this is an indication that more indicators need to be collected that will provide (1) additional information about the latent constructs that don't show up in the cluster analysis; and (2) additional information to split one exploratory factor into the two or more latent constructs the researcher needs to complete the hypothesized model. In exploratory factor analysis, the two tests that are most useful for this are the Kaiser (1960) criterion, which retains factors with eigenvalues greater than one (unless a factor extracts at least as much information as the equivalent of one original variable, we drop it), and the scree test proposed by Cattell (1966), which compares the difference between two successive eigenvalues and stops taking factors when this drops below a certain level. In either case, the suggested factors are not necessarily the latent factors that the researcher's theory would suggest; rather, they are the information that is actually provided in the data, this information being the main justification for the cost of data collection. So in practice, either test would set a maximum number of latent factors in the SEM if that SEM is to be explored with one's own particular data set.
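
The two retention rules can be sketched in a few lines of Python. This is a minimal illustration on simulated data, not the procedure implemented in PRELIS or in the paper's companion software: the loadings, the sample of 2000 observations, the scree threshold `min_drop = 0.5`, the hypothesized count of three latent variables and all function names are assumptions made for the example. The scree rule is a simplified numerical reading of the description above (keep taking factors while the drop to the next eigenvalue stays at or above a chosen level).

```python
import numpy as np


def sorted_eigenvalues(data: np.ndarray) -> np.ndarray:
    """Eigenvalues of the indicator correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.linalg.eigvalsh(corr)[::-1]


def kaiser_count(eigenvalues: np.ndarray) -> int:
    """Kaiser criterion: retain factors whose eigenvalue exceeds one."""
    return int(np.sum(eigenvalues > 1.0))


def scree_count(eigenvalues: np.ndarray, min_drop: float) -> int:
    """Simplified scree rule: stop at the first successive drop below min_drop."""
    drops = eigenvalues[:-1] - eigenvalues[1:]
    count = 0
    for drop in drops:
        if drop < min_drop:
            break
        count += 1
    return count


# Simulated pretest data: 7 indicators generated by 2 independent factors.
rng = np.random.default_rng(1)
n_obs = 2000
loadings = np.zeros((7, 2))
loadings[:4, 0] = 0.9  # four indicators on the first factor
loadings[4:, 1] = 0.8  # three indicators on the second factor
factors = rng.standard_normal((n_obs, 2))
unique_sd = np.sqrt(1.0 - (loadings ** 2).sum(axis=1))
data = factors @ loadings.T + rng.standard_normal((n_obs, 7)) * unique_sd

eigenvalues = sorted_eigenvalues(data)
n_kaiser = kaiser_count(eigenvalues)
n_scree = scree_count(eigenvalues, min_drop=0.5)

hypothesized_latent = 3  # hypothetical number of latent variables in the researcher's SEM
print("eigenvalues:", np.round(eigenvalues, 2))
print("Kaiser criterion retains", n_kaiser, "factors")
print("scree rule retains", n_scree, "factors")
if hypothesized_latent > max(n_kaiser, n_scree):
    print("The data support fewer factors than the hypothesized model; "
          "more indicators may need to be collected.")
```

Here the simulated indicators carry information on only two factors, so a model with three latent variables would exceed the maximum that either rule suggests for this data set.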

When SEM are built around valid real-world constructs (even if these are unobservable), the algorithms proposed in this paper impose only weak additional assumptions on the indicators and latent variables in order to compute sample sizes adequate for estimation. Our limited application to a window of IS and e-commerce publications has shown that concerns are warranted about existing SEM sample size calculations, and we need to remain suspicious of conclusions reached in studies based on inadequate sample sizes. Furthermore, a large number of studies in our sample devised their tests without first committing to the minimum effect size that they were trying to detect, or without indicating the proportion of nonresponse in surveys. It is clear that journal referees need to begin asking for survey response, minimum effect size and a justification of the sample size. By incorporating these suggestions, it is argued that the research community will enhance the credibility and applicability of their research, with a commensurate improved impact and influence in both industry and academe.

Note: I want to thank the reviewers and editors who persevered through several revisions of this paper, and helped nurture it to completion. Any remaining errors are fully my own responsibility.

Note on software: You may download at Elsevier's ECRA site a software package that computes the lower bounds developed in this paper. This software is written in Windows C# Forms to run on Windows platforms; in addition to a number of the packages in the R language, it was used to calculate the results in this paper.

REFERENCES

Anderson, T. Origins of the limited information maximum likelihood and two-stage least squares estimators. Journal of Econometrics 127, 2005, 1-16.

Anderson, T. and Rubin, H. Estimator of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 20, 1949, 46-63.

Anderson, T. and Rubin, H. The asymptotic properties of estimates of the parameters of a single equation in a complete system of stochastic equations. Annals of Mathematical Statistics 21, 1950, 570-82.

Balakrishnan, N. and C. R. Rao. Order Statistics: Applications, ser. Handbook of Statistics; v. 17. New York: Elsevier, 1998.

Barclay, D. W., Higgins, C., & Thompson, R. The partial least squares (PLS) approach to causal modeling: Personal computer adaptation and use as an illustration. Technology Studies, 2(2), 1995, 285-309.

Basmann, R. A generalized classical method of linear estimation of coefficients in a structural equation. Econometrica 25, 1957, 77-83.

Bentler, P. M. EQS, Structural Equations, Program Manual, Program Version 3.0. Los Angeles: BMDP Statistical Software, Inc., 1989, p. 6.

Bollen, K. A. Structural equations with latent variables. New York: Wiley, 1989, p. 268.

Boomsma, A. Robustness of LISREL against small sample sizes in factor analysis models, in K. G. Jöreskog and H. Wold (eds.), Systems under indirect observations: Causality, structure, prediction (part 1), 1982, pp. 149-173, Amsterdam: North-Holland.

Brand, A., Bradley, M. T., Best, L. A., and Stoica, G. Accuracy of effect size estimates from published psychological research. Perceptual and Motor Skills 106(2), 2008, 645-649.

Browne, M. W., and Cudeck, R. Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models, 1993, pp. 136-162, Newbury Park, CA: Sage.

Browne, M. W., and Cudeck, R. Single sample cross-validation indices for covariance structures. Multivariate Behavioral Research, 24, 1989, 445-455.

Cattell, R. B. Handbook of multivariate experimental psychology. 1966, Chicago: Rand McNally.

Chin, W. W. The partial least squares approach to structural equation modeling. In G. A. Marcoulides (Ed.), Modern Methods for Business Research (pp. 295-336). 1998, Mahwah, New Jersey: Lawrence Erlbaum Associates.

Chin, W. W., and Newsted, P. R. Structural Equation Modeling analysis with Small Samples Using Partial Least Squares. In Rick Hoyle (Ed.), Statistical Strategies for Small Sample Research, Sage Publications, 1999, pp. 307-341.

Cochran, W. G. Sampling Techniques, 3rd Edition. 1977, New York: Wiley.

Cohen, J. Statistical Power Analysis for the Behavioral Sciences (second ed.). 1988, Lawrence Erlbaum Associates.

Cohen, J. A power primer. Psychological Bulletin 112, 1992, 155-159.

Davis, F. D., Richard P. Bagozzi, Paul R. Warshaw. User acceptance of computer technology: a comparison of two theoretical models. Management Science, Volume 35, Issue 8, 1989, 982-1003.

Dhrymes, P. J., R. Berner, D. Cummins. A Comparison of Some Limited Information Estimators for Dynamic Simultaneous Equations Models with Autocorrelated Errors. Econometrica, Vol. 42, No. 2, 1974, pp. 311-332.

Dhrymes, P. Distributed Lags: problems of estimation and formulation. San Francisco: Holden-Day, 1971.

Dhrymes, P. J. Econometrics: Statistical Foundations and Applications. New York, Evanston and London: Harper & Row, 1970, p. 53.

Ding, L., Velicer, W. F. and Harlow, L. L. The effects of estimation methods, number of indicators per factor and improper solutions on structural equation modeling fit indices. Structural Equation Modeling, 2, 1995, 119-144.

Farebrother, R. Fitting Linear Relationships: A History of the Calculus of Observations 1750-1900. 1999, New York: Springer.

Fhr, K. Comparison of LISREL and PLS Estimation Methods in Latent Variable Models. Introducing Latent Variables into Econometric Models, Manuscript SFB 303, University of Bonn, Bonn, 1989.

Fisher, R. A. On the 'probable error' of a coefficient of correlation deduced from a small sample. Metron 1, 1921, 3-32.

Fisher, R. A. Statistical Methods, Experimental Design, and Scientific Inference. New York: Oxford Univ. Press, 1990.

Fornell, Claes and David F. Larcker. Evaluating Structural Equation Models with Unobservable Variables and Measurement Error. Journal of Marketing Research 18, 1981, 39-50.

Freedman, D. As others see us: A case study in path analysis (with discussion). Journal of Educational Statistics, 12, 1987, pages 101-223.

Gerbing, D. W., & Anderson, J. C. The effects of sampling error and model characteristics on parameter estimation for maximum likelihood confirmatory factor analysis. Multivariate Behavioral Research, 20, 1985, 255-271.

Gibbons, J. D. and S. Chakraborti. Nonparametric Statistical Inference, 3rd ed. New York: Marcel Dekker, 1992.

Goodhue, Dale, William Lewis, Ronald Thompson. Statistical Power in Analyzing Interaction Effects: Questioning the Advantage of PLS with Product Indicators (Research Note). Information Systems Research, Vol. 18, No. 2, 2007, pp. 211-227.

Goodhue, D., William Lewis, Ron Thompson. PLS, Small Sample Size, and Statistical Power in MIS Research. HICSS, vol. 8, pp. 202b, Proceedings of the 39th Annual Hawaii International Conference on System Sciences (HICSS'06), 2006.

Haavelmo, T. The Statistical Implications of a System of Simultaneous Equations. Econometrica 11, 1943, 1-12.

He, Q. and H. N. Nagaraja. Correlation Estimation Using Concomitants of Order Statistics from Bivariate Normal Samples. Communications in Statistics - Theory and Methods, Volume 38, Issue 12, January 2009, pages 2003-2015.

Hoeffding, W. A class of statistics with asymptotically normal distribution. Ann. Mathemat. Statist. 19, 1948, 293-325.

Jöreskog, K. G. Some contributions to maximum likelihood factor analysis. Psychometrika, 32(4), 1967, 443-482.

Jöreskog, K. G. and Sörbom, D. LISREL 8 User's Reference Guide. Chicago: Scientific Software International, 1996.

Jöreskog, K. G. A general method for analysis of covariance structures. Biometrika, 57, 1970, 239-251.

Kahai, S. S. and Cooper, R. B. Exploring the Core Concepts of Media Richness Theory: The Impact of Cue Multiplicity and Feedback Immediacy on Decision Quality. Journal of Management Information Systems, 20, 1, 2003, 263-299.

Kendall, M. and J. D. Gibbons. Rank Correlation Methods, 5th ed. New York: Oxford Univ. Press, 1990.

Kish, L. Survey Sampling. 1995, New York: Wiley.

Koopmans, T. and Hood, W. The estimation of simultaneous linear economic relationships. In Studies in Econometric Method, ed. W. Hood and T. Koopmans. Cowles Foundation Monograph 14. 1953, New Haven: Yale University Press.

Lohr, S. L. Sampling: Design and Analysis. 1999, Duxbury.

Mari, D. D. and S. Kotz. Correlation and Dependence. London, U.K.: Imperial College Press, 2001.

Marsh, H. W., Balla, J. R., & McDonald, R. P. Goodness-of-fit indexes in confirmatory factor analysis: The effect of sample size. Psychological Bulletin, 103, 1988, 391-410.

Marsh, H. W., Balla, J. R., & Hau, K. T. An evaluation of incremental fit indices: A clarification of mathematical and empirical properties. In G. A. Marcoulides & R. E. Schumacker (Eds.), Advanced structural equation modeling: Issues and techniques (pp. 315-353). 1996, Mahwah, NJ: Lawrence Erlbaum Associates, Inc.

Marsh, H. W., Hau, K. T., Balla, J. R., & Grayson, D. Is more ever too much? The number of indicators per factor in confirmatory factor analysis. Multivariate Behavioral Research, 33, 1998, 181-220.

Marsh, H. W. and M. Bailey. Confirmatory factor analyses of multitrait-multimethod data: A comparison of alternative models. Applied Psychological Measurement, Vol. 15, No. 1, 1991, 47-70.

Nakagawa, S. and Cuthill, I. C. Effect size, confidence interval and statistical significance: a practical guide for biologists. Biological Reviews Cambridge Philosophical Society 82, 2007, 591-605.

Nunnally, J. C. Psychometric Theory. New York: McGraw-Hill, 1967, p. 355.

Podsakoff, P. M. and D. W. Organ. Self-reports in organizational research: Problems and prospects. Journal of Management, 12, 1986, 531-544.

Sargan, J. Estimation of economic relationships using instrumental variables. Econometrica 67, 1958, 557-86.

Schechtman, E., Yitzhaki, S. A measure of association based on Gini's mean difference. Commun. Statist. Theor. Meth. 16, 1987, 207-231.

Schneeweiss, H. Models with Latent Variables: LISREL versus PLS, in: Contemporary Mathematics, Vol. 112 (1990), p. 33-40.

Schneeweiss, H. Models with Latent Variables: LISREL versus PLS, in: Statistica Neerlandica 45 (1991), p. 145-157.

Schneeweiss, H. Consistency at Large in Models with Latent Variables, in: K. Hagen, D. J. Bartholomew, M. Deistler, Statistical Modelling and Latent Variables, Elsevier (1993), p. 299-320.

Shevlyakov, G. L. and N. O. Vilchevski. Robustness in Data Analysis: Criteria and Methods, ser. Modern Probability and Statistics. Utrecht, The Netherlands: VSP, 2002.

Snedecor and Cochran. Statistical Methods, 8th ed. 1989, Ames: Iowa U. Press.

Tanaka, J. S. How big is big enough?: Sample size and goodness of fit in structural equation models with latent variables. Child Development, 58, 1987, 134-146.

Tanaka, J. S. Multifaceted conceptions of fit in structural equation models. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 10-39), 1993, Newbury Park, CA: Sage.

Theil, H. Repeated least squares applied to complete equation systems. 1953a, The Hague: Central Planning Bureau.

Theil, H. Estimation and simultaneous correlation in complete equation systems. 1953b, The Hague: Central Planning Bureau.

Theil, H. Economic Forecasts and Policy, 2nd edn. 1961, Amsterdam: North-Holland.

Thomas, D. R., Lu, I. R. R. & Cedzynski, M. Partial least squares: A critical review and a potential alternative. Proceedings of the Annual Conference of Administrative Sciences Association of Canada, Management Science Division, Toronto, 2005.

Velicer, W. F., and Fava, J. L. Effects of variable and subject sampling on factor pattern recovery. Psychological Methods, 3, 1998, 231-251.

Westland, J. C. and W. K. See-To. The Short-run Price-Performance Dynamics of Microcomputer Technologies. Research Policy, Volume 36, Issue 5, 2007, Pages 591-604.

Wilkinson, Leland; APA Task Force on Statistical Inference. Statistical methods in psychology journals: Guidelines and explanations. American Psychologist 54, 1999, 594-604.

Wold, H. The Fix-Point Approach to Interdependent Systems: Review and Current Outlook, in H. Wold (Ed.), The Fix-Point Approach to Interdependent Systems, 1981, Amsterdam: North-Holland, 1-35.

Wold, Herman. Response to D. A. Freedman. Journal of Educational Statistics, Vol. 12, No. 2, 1987, pp. 202-205.

Wright, S. Correlation and causation. Journal of Agricultural Research, 20, 1921, 557-585.

Xu, W., Y. S. Hung, M. Niranjan, and M. Shen. Asymptotic Mean and Variance of Gini Correlation for Bivariate Normal Samples. IEEE Trans. on Signal Processing, V. 58(2), 2010.


APPENDIX A: SAMPLE ADEQUACY IN A SET OF E-COMMERCE AND MIS SEM STUDIES

Studies | Latent Variables | Indicator Variables | Sample Points | Minimum Effect Observed | Šidák-corrected significance | Sample bound (section 4) | Sample bound (section 3) | Sample Size lower bound
Choudhury and Karahanna, MISQ, V.32(1) | 6 | 35 | 499 | 0.12 | 0.0034 | 1177 | 176 | 1177
Kanawattanachai and Yoo, MISQ, V.31(4) | 3 | 11 | 146 | 0.21 | 0.0170 | 264 | 122 | 264
Damien, et al., MISQ, V.31(3) | 6 | 43 | 31 | 0.08 | 0.0034 | 2694 | 443 | 2694
Webster and Ahuja, MISQ, V.30(3) | 6 | 38 | 207 | 0.17 | 0.0034 | 568 | 256 | 568
Tanriverdi, MISQ, V.30(1) | 10 | 16 | 356 | 0.11 | 0.0011 | 1661 | 508 | 1661
Awad and Krishnan, MISQ, V.30(1) | 8 | 24 | 532 | 0.28 | 0.0018 | 206 | 200 | 206
Van der Heijden and Hans, MISQ, V.28(4) | 4 | 15 | 1144 | 0.15 | 0.0085 | 628 | 116 | 628
Barua, et al., MISQ, V.28(4) | 12 | 45 | 1125 | 0.109 | 0.0008 | 1783 | 116 | 1783
Straub, et al., MISQ, V.27(1) | 8 | 34 | 213 | 0.1 | 0.0018 | 1886 | 91 | 1886
Susaria, et al., MISQ, V.27(1) | 8 | 32 | 256 | 0.067 | 0.0018 | 4253 | 100 | 4253
Tarafdar, et al., JMIS, V.24(1) | 10 | 21 | 256 | 0.08 | 0.0011 | 3181 | 376 | 3181
Fuller, et al., JMIS, V.23(3) | 5 | 22 | 318 | 0.3 | 0.0051 | 148 | 88 | 148
Kearns, et al., JMIS, V.23(3) | 9 | 44 | 269 | 0.11 | 0.0014 | 1610 | 95 | 1610
Wang, et al., JMIS, V.23(2) | 5 | 22 | 149 | 0.12 | 0.0051 | 1098 | 88 | 1098
Lopes, et al., JMIS, V.23(2) | 4 | 13 | 392 | 0.24 | 0.0085 | 226 | 166 | 226
Hess, et al., JMIS, V.22(3) | 8 | 36 | 233 | 0.28 | 0.0018 | 206 | 88 | 206
Johnson, et al., JMIS, V.22(2) | 7 | 16 | 202 | 0.19 | 0.0024 | 472 | 333 | 472
Bhatt, et al., JMIS, V.22(2) | 5 | 26 | 202 | 0.22 | 0.0051 | 303 | 112 | 303
Chang et al., JMIS, V.22(1) | 3 | 12 | 476 | 0.65 | 0.0170 | 9 | 100 | 100
Wallace, et al., Dec. Sci., V.35(2) | 6 | 27 | 507 | 0.13 | 0.0034 | 997 | 88 | 997
Rabinovich, et al., Dec. Sci., V.34(1) | 2 | 14 | 840 | 0.126 | 0.0500 | 587 | 400 | 587
Abdinnour-Helm, et al., Dec. Sci., V.36(2) | 5 | 12 | 176 | 0.1 | 0.0051 | 1596 | 308 | 1596
Escrig-Tena, et al., Dec. Sci., V.36(2) | 9 | 37 | 231 | 0.1 | 0.0014 | 1957 | 95 | 1957
Pullman and Gross, Dec. Sci., V.35(3) | 3 | 15 | 400 | 0.52 | 0.0170 | 24 | 100 | 100
Tu, et al., Dec. Sci., V.35(2) | 3 | 24 | 303 | 0.22 | 0.0170 | 238 | 700 | 700
Droge, et al., Dec. Sci., V.34(3) | 4 | 17 | 437 | 0.16 | 0.0085 | 548 | 91 | 548
Janz and Prasarnphanich, Dec. Sci., V.34(2) | 5 | 15 | 231 | 0.41 | 0.0051 | 65 | 200 | 200
Hong and Tam, ISR, V.17(2) | 5 | 27 | 1328 | 0.1 | 0.0051 | 1596 | 128 | 1596
Dinev and Hart, ISR, V.17(1) | 5 | 18 | 369 | 0.15 | 0.0051 | 690 | 128 | 690
Jarvenpaa, et al., ISR, V.15(3) | 6 | 30 | 136 | 0.28 | 0.0034 | 187 | 100 | 187
Pavlou and Gefen, ISR, V.15(1) | 10 | 33 | 274 | 0.1 | 0.0011 | 2020 | 160 | 2020
Bassellier, et al., ISR, V.14(4) | 7 | 38 | 404 | 0.2 | 0.0024 | 422 | 131 | 422
Choo, et al., MS, V.53(3) | 5 | 14 | 951 | 0.38 | 0.0051 | 81 | 232 | 232
Sabherwal, et al., MS, V.52(12) | 10 | 48 | 121 | 0.11 | 0.0011 | 1661 | 92 | 1661
Bagozzi and Dholakia, MS, V.52(7) | 14 | 24 | 402 | 0.17 | 0.0006 | 736 | 476 | 736
de Jong, et al., MS, V.51(11) | 5 | 29 | 60 | 0.037 | 0.0051 | 11884 | 172 | 11884
Kim and Malhotra, MS, V.51(5) | 4 | 8 | 189 | 0.14 | 0.0085 | 725 | 400 | 725
Vickery, et al., MS, V.50(8) | 4 | 14 | 113 | 0.212 | 0.0085 | 299 | 138 | 299
Balasubramanian, et al., MS, V.49(7) | 5 | 22 | 428 | 0.12 | 0.0051 | 1098 | 88 | 1098
Au, et al., MISQ, V.32(1) | 6 | 20 | 922 | 0.17 | 0.0034 | 568 | 156 | 568
Hsieh, et al., MISQ, V.32(1) | 12 | 31 | 451 | 0.21 | 0.0008 | 447 | 271 | 447
Nadkarni and Gupta, MISQ, V.31(3) | 6 | 30 | 452 | 0.1 | 0.0034 | 1711 | 100 | 1711
Liang, et al., MISQ, V.31(1) | 7 | 21 | 77 | 0.277 | 0.0024 | 202 | 200 | 202
Komiak and Benbasat, MISQ, V.30(4) | 7 | 16 | 100 | 0.12 | 0.0024 | 1242 | 333 | 1242
Bhattacherjee and Sanford, MISQ, V.30(4) | 7 | 24 | 81 | 0.26 | 0.0024 | 235 | 145 | 235
Karahanna, et al., MISQ, V.30(4) | 8 | 29 | 278 | 0.1 | 0.0018 | 1886 | 126 | 1886
Srite and Karahanna, MISQ, V.30(3) | 8 | 27 | 181 | 0.29 | 0.0018 | 190 | 151 | 190
Stewart and Gosain, MISQ, V.30(2) | 8 | 63 | 51 | 0.13 | 0.0018 | 1099 | 657 | 1099
Moores and Chang, MISQ, V.30(1) | 5 | 17 | 243 | 0.47 | 0.0051 | 43 | 148 | 148
Pavlou and Fygenson, MISQ, V.30(1) | 6 | 18 | 312 | 0.12 | 0.0034 | 1177 | 200 | 1177
Ahuja and Thatcher, MISQ, V.29(3) | 4 | 12 | 263 | 0.17 | 0.0085 | 482 | 200 | 482
Ko, et al., MISQ, V.29(1) | 5 | 14 | 96 | 0.134 | 0.0051 | 874 | 232 | 874
Bhattacherjee and Premkumar, MISQ, V.28(2) | 7 | 27 | 77 | 0.1 | 0.0024 | 1806 | 108 | 1806
Gemino, JMIS, V.24(3) | 7 | 51 | 223 | 0.154 | 0.0024 | 739 | 476 | 739
Son, et al., JMIS, V.24(1) | 11 | 38 | 625 | 0.16 | 0.0009 | 783 | 142 | 783
Klein, et al., Dec. Sci., V.38(4) | 5 | 30 | 91 | 0.2 | 0.0051 | 373 | 200 | 373
Wang and Wei, Dec. Sci., V.38(4) | 4 | 46 | 150 | 0.22 | 0.0085 | 275 | 2538 | 2538
Keil, et al., Dec. Sci., V.38(3) | 3 | 14 | 178 | 0.15 | 0.0170 | 543 | 89 | 543
Ettlie and Pavlou, Dec. Sci., V.37(2) | 4 | 31 | 72 | 0.15 | 0.0085 | 628 | 616 | 628
Looney, et al., Dec. Sci., V.37(2) | 5 | 31 | 414 | 0.153 | 0.0051 | 662 | 232 | 662
Brown and Chin, Dec. Sci., V.35(3) | 7 | 20 | 240 | 0.14 | 0.0024 | 902 | 222 | 902
Teigland and Wasko, Dec. Sci., V.34(2) | 9 | 19 | 83 | 0.2 | 0.0014 | 458 | 373 | 458
Saraf, et al., ISR, V.18(3) | 9 | 45 | 63 | 0.254 | 0.0014 | 268 | 100 | 268
Pavlou and Dimoka, ISR, V.17(4) | 4 | 10 | 1665 | 0.05 | 0.0085 | 5904 | 288 | 5904
Nicolaou and McKnight, ISR, V.17(4) | 6 | 32 | 69 | 0.247 | 0.0034 | 250 | 122 | 250
Pavlou and El Sawy, ISR, V.17(3) | 4 | 10 | 507 | 0.14 | 0.0085 | 725 | 288 | 725
Pavlou and Gefen, ISR, V.16(4) | 10 | 30 | 1031 | 0.14 | 0.0011 | 1009 | 200 | 1009
Wixom and Todd, ISR, V.16(1) | 7 | 17 | 465 | 0.1 | 0.0024 | 1806 | 302 | 1806
Zhu and Kraemer, ISR, V.16(1) | 11 | 34 | 624 | 0.04 | 0.0009 | 13213 | 187 | 13213
Malhotra, et al., ISR, V.15(4) | 6 | 18 | 449 | 0.12 | 0.0034 | 1177 | 200 | 1177
Karimi, et al., ISR, V.15(2) | 5 | 20 | 286 | 0.04 | 0.0051 | 10163 | 100 | 10163
Mun and Davis, ISR, V.14(2) | 7 | 54 | 95 | 0.18 | 0.0024 | 530 | 604 | 604
Venkatesh and Agarwal, MS, V.52(3) | 5 | 21 | 757 | 0.1 | 0.0051 | 1596 | 92 | 1596
Ahuja, et al., MS, V.49(1) | 4 | 5 | 1781 | 0.17 | 0.0085 | 482 | 616 | 616
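
Two of the tabulated columns can be reproduced, for the rows checked, with the short Python sketch below. It rests on two assumptions inferred from the tabulated values rather than stated in this appendix: that the Šidák correction spreads a familywise significance of 0.05 over the k(k-1)/2 possible pairs of k latent variables, and that the section 3 bound is a quadratic in the ratio r of indicators to latent variables. Both should be checked against sections 3 and 4 of the paper. The section 4 (minimum effect) bound is copied from the table, not recomputed, and all function names are hypothetical.

```python
import math


def sidak_alpha(latent: int, alpha: float = 0.05) -> float:
    """Per-comparison significance under the Sidak correction.

    Assumption: the comparisons are the latent*(latent-1)/2 pairs of latent variables.
    """
    if latent < 2:
        raise ValueError("need at least two latent variables")
    pairs = latent * (latent - 1) // 2
    return 1.0 - (1.0 - alpha) ** (1.0 / pairs)


def ratio_bound(indicators: int, latent: int) -> int:
    """Assumed section 3 bound: 50 r^2 - 450 r + 1100 with r = indicators / latent."""
    r = indicators / latent
    return math.floor(50 * r ** 2 - 450 * r + 1100 + 0.5)  # round half up


def lower_bound(effect_bound: int, indicators: int, latent: int) -> int:
    """Sample size lower bound: the larger of the two bounds."""
    return max(effect_bound, ratio_bound(indicators, latent))


# (study, latent, indicators, sample points, section 4 bound), copied from the table.
rows = [
    ("Choudhury and Karahanna", 6, 35, 499, 1177),
    ("Chang et al.", 3, 12, 476, 9),
    ("Wang and Wei", 4, 46, 150, 275),
]

for study, latent, indicators, sample, effect_bound in rows:
    bound = lower_bound(effect_bound, indicators, latent)
    print(f"{study}: alpha = {sidak_alpha(latent):.4f}, "
          f"ratio bound = {ratio_bound(indicators, latent)}, "
          f"lower bound = {bound}, sample adequate = {sample >= bound}")
```

Under these assumptions the three example rows give ratio bounds of 176, 100 and 2538 and lower bounds of 1177, 100 and 2538, matching the table; only the second study's sample (476) meets its lower bound.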


[Figure: a grid of estimation approaches, with the label "Model Specification Searches".
Row 1: Exploratory Factor Analysis, Lawley (1940); Factor Analysis (PCA) through iterated OLS, Wold (1966); PLS-SEM through iterated OLS, Wold (1978).
Row 2: Path Analysis, Wright (1921); Confirmatory Factor Analysis, Jöreskog (1969); LISREL-SEM, Jöreskog (1969).
Row 3: Systems of Linear Equations Estimation, Koopmans (1950); Instrumental Variables and 2SLS, Theil (1953); 3SLS and full-information regression SEM, Zellner (1962).]

FIGURE 1: DEVELOPMENT OF STRUCTURAL EQUATION MODEL ESTIMATION


FIGURE 2: AN EXAMPLE OF A STRUCTURAL EQUATION MODEL WITH SIX LATENT VARIABLES AND FIVE CORRELATIONS

FIGURE 3: THE SEM EXAMPLE IN FIGURE 2 WITH ALL POSSIBLE PAIRED LINKS SHOWN

FIGURE 4: SIGNIFICANCE AND POWER FOR THE MINIMUM EFFECT THAT NEEDS TO BE DETECTED


[Figure 5 panels: bivariate Normal scatterplots grouped under "Positive Correlation" and "Negative Correlation"; horizontal axis x[,1], vertical axis x[,2].]

Figure 5: Bivariate Normal Scatterplots for [...] with [...] 500

[Figure 6: histogram; vertical axis: frequency (0 to 50); horizontal axis from -20000 to 0.]

FIGURE 6: PERCENT ERROR IN SAMPLE SIZE FOR 74 STUDIES IN THE ENTIRE META STUDY (MEAN = 770, STANDARD DEVIATION = 25, SKEWNESS = 6.5, KURTOSIS = 47)

[Figure 7: histogram; vertical axis: frequency (0 to 40); horizontal axis from -2500 to 500.]

FIGURE 7: PERCENT ERROR IN SAMPLE SIZE FOR 74 STUDIES IN THE META STUDY, REMOVING OUTLIERS (MEAN = 400, STANDARD DEVIATION = 642, SKEWNESS = 2.5, KURTOSIS = 7.6)

Number of parameters (indicators) in the SEM
Number of latent variables in the SEM
Computed sample size lower bound
Bivariate Normal random latent variables (and their realization) in the SEM
Order statistics of the sample values; the first index is rank, and the second is sample size
Concomitant of the order statistic: the sample value associated with the corresponding order-statistic sample value in the sample pairs
Minimum effect size that our computed sample size can detect
Unknown correlation for a bivariate Normal random vector
Estimator of Gini correlation
Mean and standard deviation estimators for Gini correlation
Significance and power of test
The Šidák-corrected significance for discriminations between possible SEM link combinations at a given resolution
Rejection bound at the stated significance and nonrejection bound at the stated power; we substitute the quantile function (inverse cumulative Normal) for these in calculations

TABLE 2: NOTATION USED IN THE PAPER
