
What is Evaluation?

Evaluation is the process of examining a program or process to determine what's working, what's not, and why.
Evaluation determines the value of programs and acts as blueprints for judgment and improvement. (Rossett & Sheldon, 2001)

Types of Evaluations in Instructional Design
Evaluations are normally divided into two broad categories: formative and summative.

Formative
A formative evaluation (sometimes referred to as internal) is a method for judging the worth of a program while the program activities are forming (in progress). This part of the evaluation focuses on the process.
Thus, formative evaluations are basically done on the fly. They permit the designers, learners, and instructors to monitor how well the instructional goals and objectives are being met. Their main purpose is to catch deficiencies so that the proper learning interventions can take place, allowing the learners to master the required skills and knowledge.
"Formative evaluation is also useful in analyzing learning materials, student learning and achievements, and teacher effectiveness.... Formative evaluation is primarily a building process which accumulates a series of components of new materials, skills, and problems into an ultimate meaningful whole." - Wally Guyot (1978)

Summative
A summative evaluation (sometimes referred to as external) is a method of judging the worth of a program at the end of the program activities (summation). The focus is on the outcome.
"All assessments can be summative (i.e., have the potential to serve a summative function), but only some have the additional capability of serving formative functions." - Scriven (1967)
The various instruments used to collect the data are questionnaires, surveys, interviews, observations, and testing. The model or methodology used to gather the data should be a specified step-by-step procedure. It should be carefully designed and executed to ensure the data is accurate and valid.
Questionnaires are the least expensive procedure for external evaluations and can be used to collect large samples of graduate information. The questionnaires should be trialed (tested) before use to ensure the recipients understand their operation the way the designer intended. When designing questionnaires, keep in mind that the most important feature is the guidance given for their completion. All instructions should be clearly stated... let nothing be taken for granted.

History of the Two Evaluations
Scriven (1967) first suggested a distinction between formative evaluation and summative evaluation when describing two major functions of evaluation. Formative evaluation was intended to foster development and improvement within an ongoing activity (or person, product, program, etc.). Summative evaluation, in contrast, is used to assess whether the results of the object being evaluated (program, intervention, person, etc.) met the stated goals.
Scriven saw the need to distinguish the formative and summative roles of curriculum evaluation. While Scriven preferred summative evaluations (performing a final evaluation of the project or person), he did come to acknowledge Cronbach's merits of formative evaluation: part of the process of curriculum development, used to improve the course while it is still fluid (he believed it contributes more to the improvement of education than evaluation used to appraise a product).
Later, Misanchuk (1978) delivered a paper on the need to tighten up the definitions in order to get more accurate measurements. The point that seems to cause the greatest disagreement is the keeping of fluid movements or changes strictly in the pre-release versions (before the product hits the target population).
In Paul Saettler's (1990) history of instructional technology, he describes the two evaluations (pp. 430-431) in the context of how they were used in developing Sesame Street and The Electric Company by the Children's Television Workshop. CTW used formative evaluations for identifying and defining program designs that could provide reliable predictors of learning for particular learners. They later used summative evaluations to prove their efforts (to quite good effect I might add). While Saettler praises CTW for a significant landmark in the technology of instructional design, he warns that it is still tentative and should be seen more as a point of departure rather than a fixed formula.
Saettler defines the two types of evaluations as: 1) formative is used to refine goals and evolve strategies for achieving goals, while 2) summative is undertaken to test the validity of a theory or determine the impact of an educational practice so that future efforts may be improved or modified.
Thus, using Misanchuk's defining terms will normally achieve more accurate measurements; however, the cost is quite high as it is highly resource intensive, particularly with time, because of all the pre-work that has to be performed in the design phase: create, trial, redo, trial, redo, trial, redo, etc.; and all preferably without using the target population.
However, most organizations are demanding shorter design times. Thus the formative part is moved over to other methods, such as rapid prototyping and using testing and evaluation methods to improve as one moves on. This of course is not as accurate, but it is more appropriate to most organizations, as they are not really that interested in accurate measurements of the content but rather in the end product: skilled and knowledgeable workers.
Misanchuk's defining terms basically put all the water in a container for accurate measurements, while the typical organization estimates the volume of water running in a stream.
Thus if you are a vendor, researcher, or need highly accurate measurements, you will probably define the two evaluations in the same manner as Misanchuk. If you need to push the training/learning out faster and are not all that worried about highly accurate measurements, then you define it closer to how most organizations do and Saettler does with the CTW example.

Kirkpatrick's Four Level Evaluation Model


Perhaps the best known evaluation methodology for judging learning processes is Donald Kirkpatrick's Four Level Evaluation Model that was first published in a series of articles in 1959 in the Journal of American Society of Training Directors (now known as T+D Magazine). The series was later compiled and published as an article, Techniques for Evaluating Training Programs, in a book Kirkpatrick edited, Evaluating Training Programs (1975). However, it was not until his 1994 book was published, Evaluating Training Programs, that the four levels became popular. Nowadays, his four levels remain a cornerstone in the learning industry.
While most people refer to the four criteria for evaluating learning processes as levels, Kirkpatrick never used that term; he normally called them steps (Craig, 1996). In addition, he did not call it a model, but used words such as techniques for conducting the evaluation (Craig, 1996, p. 294).
The four steps of evaluation consist of:

Step 1: Reaction - How well did the learners like the learning process?

Step 2: Learning - What did they learn? (the extent to which the learners gain knowledge and skills)

Step 3: Behavior - What changes in job performance resulted from the learning process? (capability to perform the newly learned skills while on the job)

Step 4: Results - What are the tangible results of the learning process in terms of reduced cost, improved quality, increased production, efficiency, etc.?

Kirkpatrick's concept is quite important as it makes an excellent planning, evaluating, and troubleshooting tool, especially if we make some slight improvements as shown below.

Not Just For Training


While some mistakenly assume the four levels are only for training processes, the model can be used for other learning processes. For example, the Human Resource Development (HRD) profession is concerned with not only helping to develop formal learning, such as training, but other forms, such as informal learning, development, and education (Nadler, 1984). Their handbook, edited by one of the founders of HRD, Leonard Nadler (1984), uses Kirkpatrick's four levels as one of their main evaluation models.
Kirkpatrick himself wrote, "These objectives [referring to his article] will be related to in-house classroom programs, one of the most common forms of training. Many of the principles and procedures applies to all kinds of training activities, such as performance review, participation in outside programs, programmed instruction, and the reading of selected books" (Craig, 1996, p. 294).

Improving the Four Levels


Because of its age and with all the new technology advances, Kirkpatrick's model is often criticized for being too old and simple. Yet, almost five decades after its introduction, there has not been a viable option to replace it. And I believe the reason is that Kirkpatrick basically nailed it, but he did get a few things wrong:

Motivation, Not Reaction


When a learner goes through a learning process, such as an e-learning course, informal learning episode, or using a job performance aid, the learner has to make a decision as to whether he or she will pay attention to it. If the goal or task is judged as important and doable, then the learner is normally motivated to engage in it (Markus & Ruvolo, 1990). However, if the task is presented as low-relevance or there is a low probability of success, then a negative effect is generated and motivation for task engagement is low. In addition, research on Reaction evaluations generally shows that it is not a valid measurement for success (see the last section, Criticisms).

This differs from Kirkpatrick (1996), who wrote that reaction was how well the learners liked a particular learning process. However, the less relevant the learning package is to a learner, the more effort has to be put into the design and presentation of the learning package. That is, if it is not relevant to the learner, then the learning package has to hook the learner through slick design, humor, games, etc. This is not to say that design, humor, or games are unimportant; however, their use in a learning package should be to promote or aid the learning process rather than just make it fun. And if a learning package is built of sound purpose and design, then it should support the learners in bridging a performance gap. Hence, they should be motivated to learn; if not, something dreadfully went wrong during the planning and design processes! If you find yourself having to hook the learners through slick design, then you probably need to reevaluate the purpose of your learning processes.

Performance, Not Behavior


As Gilbert noted (1998), performance is a better objective than behavior because performance has two aspects: behavior being the means and its consequence being the end... and it is the end we are mostly concerned with.

Flipping it into a Better Model


The model is upside down as it places the two most important items last (results and behavior), which basically imprints the importance of order in most people's heads. Thus by flipping it upside down and adding the above changes we get:

Result - What impact (outcome or result) will improve our business?

Performance - What do the employees have to perform in order to create the desired impact?

Learning - What knowledge, skills, and resources do they need in order to perform? (courses or classrooms are the LAST answer, see Selecting the Instructional Setting)

Motivation - What do they need to perceive in order to learn and perform? (Do they see a need for the desired performance?)

This makes it both a planning and evaluation tool which can be used as a troubleshooting heuristic (Chyung, 2008).

Compare criterion and norm-referenced tests for MR students


What are the advantages and disadvantages of norm-referenced and criterion-referenced assessments and formative and summative evaluations?
"When the cook tastes the soup, that's formative; when the guests taste the soup, that's summative."
Norm-referenced and criterion-referenced assessments MUST be used in both types of evaluation. Criterion-referenced refers to how our student measures up to some standard set by an outside source.
For example, a criterion would be to be able to jump a certain height, or to read a certain set of words, which is probably not true in the case of a mentally retarded (MR) student. Maybe one criterion would be that the student be able to name his/her colors by a certain time.
Norm-referenced tests are tests which compare the student being tested with all other students. Norm-referenced tests are used to classify students, to place them. MR students by definition do not test as well as other students of the same age.

The advantage of a norm-referenced test is that it shows us how our student is doing relative to other students across the country. A disadvantage is that these tests are standardized and do not show small increments of gain. They are good for placement at the beginning and then again four or six months later, or at the end of the year. This will show growth over that period of time.
Norm-referenced (also called standardized) tests, along with informal observational evaluation, are useful for showing student growth over time. They aren't to be used for grading, though they can be one element in a total grade. One must remember we can't expect great growth, if any, over short periods of time, particularly as shown on a norm-referenced test.
The definition of retarded is "slowed." That means that the growth of our students is slowed, but in most cases, for many things and for most students, not stopped. As a matter of fact, it is unlikely even for a "normal" population of students to show much growth after a semester's time. These tests are not intended for measuring small increments of gain.
Criterion-referenced tests are nice because we can see just what our student accomplished. So now, after three months, s/he can recognize 35 more words, or maybe 65 more words than s/he could before. The student can name all the colors. The student is now putting away toys where they belong where before s/he either would not or could not.
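
The contrast described above can be made concrete with a small sketch. It is only an illustration, not a real scoring procedure: the function names, the norm-group scores, and the color list are all made up for the example. The first function reports where a score falls relative to a comparison group (norm-referenced); the second reports exactly which items on a fixed standard the student has now mastered (criterion-referenced).

```python
from bisect import bisect_left, bisect_right


def percentile_rank(score, norm_scores):
    """Norm-referenced view: percent of the norm group scoring below
    this score, counting tied scores as half. Illustrative only."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)
    ties = bisect_right(ordered, score) - below
    return 100.0 * (below + 0.5 * ties) / len(ordered)


def criterion_progress(mastered_before, mastered_after, criterion):
    """Criterion-referenced view: compare the student's own mastery
    against a fixed list of required items. Illustrative only."""
    required = set(criterion)
    before = required & set(mastered_before)
    after = required & set(mastered_after)
    return {
        "newly_mastered": sorted(after - before),
        "mastered": len(after),
        "required": len(required),
        "criterion_met": after == required,
    }


# Hypothetical norm group and a hypothetical color-naming criterion.
norm_group = [40, 55, 60, 62, 70, 75, 80, 85, 90, 95]
colors = ["red", "blue", "green", "yellow"]

# A two-point gain does not move the student's standing in the norm group...
print(percentile_rank(42, norm_group), percentile_rank(44, norm_group))

# ...while the criterion view shows exactly what was gained.
print(criterion_progress(["red"], ["red", "blue", "green"], colors))
```

In this made-up data the small raw-score gain leaves the percentile rank unchanged, which matches the point that norm-referenced tests are not intended for measuring small increments of gain.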

What is Authentic Assessment?


Definitions
What Does Authentic Assessment Look Like?
How is Authentic Assessment Similar to/Different from Traditional Assessment?

Traditional Assessment

Authentic Assessment

Authentic Assessment Complements Traditional Assessment

Defining Attributes of Authentic and Traditional Assessment

Teaching to the Test

Alternative Names for Authentic Assessment

Definitions
"A form of assessment in which students are asked to perform real-world tasks that demonstrate meaningful application of essential knowledge and skills" - Jon Mueller
"...Engaging and worthy problems or questions of importance, in which students must use knowledge to fashion performances effectively and creatively. The tasks are either replicas of or analogous to the kinds of problems faced by adult citizens and consumers or professionals in the field." - Grant Wiggins (Wiggins, 1993, p. 229).
"Performance assessments call upon the examinee to demonstrate specific skills and competencies, that is, to apply the skills and knowledge they have mastered." - Richard J. Stiggins (Stiggins, 1987, p. 34).

What does Authentic Assessment look like?


An authentic assessment usually includes a task for students to perform and a rubric by which their performance on the task will be evaluated.

How is Authentic Assessment similar to/different from Traditional Assessment?
The following comparison is somewhat simplistic, but I hope it illuminates the different assumptions of the two approaches to assessment.
Traditional Assessment
By "traditional assessment" (TA) I am referring to the forced-choice measures of multiple-choice tests, fill-in-the-blanks, true-false, matching and the like that have been and remain so common in education. Students typically select an answer or recall information to complete the assessment. These tests may be standardized or teacher-created. They may be administered locally or statewide, or internationally.
Behind traditional and authentic assessments is a belief that the primary mission of schools is to help develop productive citizens. That is the essence of most mission statements I have read. From this common beginning, the two perspectives on assessment diverge. Essentially, TA is grounded in educational philosophy that adopts the following reasoning and practice:
1. A school's mission is to develop productive citizens.
2. To be a productive citizen an individual must possess a certain body of knowledge and skills.
3. Therefore, schools must teach this body of knowledge and skills.
4. To determine if it is successful, the school must then test students to see if they acquired the knowledge and skills.
In the TA model, the curriculum drives assessment. "The" body of knowledge is determined first. That knowledge becomes the curriculum that is delivered. Subsequently, the assessments are developed and administered to determine if acquisition of the curriculum occurred.
Authentic Assessment
In contrast, authentic assessment (AA) springs from the following reasoning and practice:
1. A school's mission is to develop productive citizens.
2. To be a productive citizen, an individual must be capable of performing meaningful tasks in the real world.
3. Therefore, schools must help students become proficient at performing the tasks they will encounter when they graduate.
4. To determine if it is successful, the school must then ask students to perform meaningful tasks that replicate real-world challenges to see if students are capable of doing so.
Thus, in AA, assessment drives the curriculum. That is, teachers first determine the tasks that students will perform to demonstrate their mastery, and then a curriculum is developed that will enable students to perform those tasks well, which would include the acquisition of essential knowledge and skills. This has been referred to as planning backwards (e.g., McDonald, 1992).

If I were a golf instructor and I taught the skills required to perform well, I would not assess my students' performance by giving them a multiple-choice test. I would put them out on the golf course and ask them to perform. Although this is obvious with athletic skills, it is also true for academic subjects. We can teach students how to do math, do history and do science, not just know them. Then, to assess what our students have learned, we can ask students to perform tasks that "replicate the challenges" faced by those using mathematics, doing history or conducting scientific investigation.

Authentic Assessment Complements Traditional Assessment


But a teacher does not have to choose between AA and TA. It is likely that some mix of the two will best meet your needs. To use a silly example, if I had to choose a chauffeur from between someone who passed the driving portion of the driver's license test but failed the written portion or someone who failed the driving portion and passed the written portion, I would choose the driver who most directly demonstrated the ability to drive, that is, the one who passed the driving portion of the test. However, I would prefer a driver who passed both portions. I would feel more comfortable knowing that my chauffeur had a good knowledge base about driving (which might best be assessed in a traditional manner) and was able to apply that knowledge in a real context (which could be demonstrated through an authentic assessment).

Defining Attributes of Traditional and Authentic Assessment


Another way that AA is commonly distinguished from TA is in terms of its defining attributes. Of course, TAs as well as AAs vary considerably in the forms they take. But, typically, along the continuums of attributes listed below, TAs fall more towards the left end of each continuum and AAs fall more towards the right end.

Traditional               Authentic
Selecting a Response      Performing a Task
Contrived                 Real-life
Recall/Recognition        Construction/Application
Teacher-structured        Student-structured
Indirect Evidence         Direct Evidence

Let me clarify the attributes by elaborating on each in the context of traditional and authentic assessments:
Selecting a Response to Performing a Task: On traditional assessments, students are typically given several choices (e.g., a, b, c or d; true or false; which of these match with those) and asked to select the right answer. In contrast, authentic assessments ask students to demonstrate understanding by performing a more complex task usually representative of more meaningful application.
Contrived to Real-life: It is not very often in life outside of school that we are asked to select from four alternatives to indicate our proficiency at something. Tests offer these contrived means of assessment to increase the number of times you can be asked to demonstrate proficiency in a short period of time. More commonly in life, as in authentic assessments, we are asked to demonstrate proficiency by doing something.
Recall/Recognition of Knowledge to Construction/Application of Knowledge: Well-designed traditional assessments (i.e., tests and quizzes) can effectively determine whether or not students have acquired a body of knowledge. Thus, as mentioned above, tests can serve as a nice complement to authentic assessments in a teacher's assessment portfolio. Furthermore, we are often asked to recall or recognize facts and ideas and propositions in life, so tests are somewhat authentic in that sense. However, the demonstration of recall and recognition on tests is typically much less revealing about what we really know and can do than when we are asked to construct a product or performance out of facts, ideas and propositions. Authentic assessments often ask students to analyze, synthesize and apply what they have learned in a substantial manner, and students create new meaning in the process as well.
Teacher-structured to Student-structured: When completing a traditional assessment, what a student can and will demonstrate has been carefully structured by the person(s) who developed the test. A student's attention will understandably be focused on and limited to what is on the test. In contrast, authentic assessments allow more student choice and construction in determining what is presented as evidence of proficiency. Even when students cannot choose their own topics or formats, there are usually multiple acceptable routes towards constructing a product or performance. Obviously, assessments more carefully controlled by the teachers offer advantages and disadvantages. Similarly, more student-structured tasks have strengths and weaknesses that must be considered when choosing and designing an assessment.
Indirect Evidence to Direct Evidence: Even if a multiple-choice question asks a student to analyze or apply facts to a new situation rather than just recall the facts, and the student selects the correct answer, what do you now know about that student? Did that student get lucky and pick the right answer? What thinking led the student to pick that answer? We really do not know. At best, we can make some inferences about what that student might know and might be able to do with that knowledge. The evidence is very indirect, particularly for claims of meaningful application in complex, real-world situations. Authentic assessments, on the other hand, offer more direct evidence of application and construction of knowledge. As in the golf example above, putting a golf student on the golf course to play provides much more direct evidence of proficiency than giving the student a written test. Can a student effectively critique the arguments someone else has presented (an important skill often required in the real world)? Asking a student to write a critique should provide more direct evidence of that skill than asking the student a series of multiple-choice, analytical questions about a passage, although both assessments may be useful.
Teaching to the Test
These two different approaches to assessment also offer different advice about teaching to the test. Under the TA model, teachers have been discouraged from teaching to the test. That is because a test usually assesses a sample of students' knowledge and understanding and assumes that students' performance on the sample is representative of their knowledge of all the relevant material. If teachers focus primarily on the sample to be tested during instruction, then good performance on that sample does not necessarily reflect knowledge of all the material. So, teachers hide the test so that the sample is not known beforehand, and teachers are admonished not to teach to the test.
With AA, teachers are encouraged to teach to the test. Students need to learn how to perform well on meaningful tasks. To aid students in that process, it is helpful to show them models of good (and not so good) performance. Furthermore, the student benefits from seeing the task rubric ahead of time as well. Is this "cheating"? Will students then just be able to mimic the work of others without truly understanding what they are doing? Authentic assessments typically do not lend themselves to mimicry. There is not one correct answer to copy. So, by knowing what good performance looks like, and by knowing what specific characteristics make up good performance, students can better develop the skills and understanding necessary to perform well on these tasks. (For further discussion of teaching to the test, see Bushweller.)

Alternative Names for Authentic Assessment


You can also learn something about what AA is by looking at the other common names for this form of assessment. For example, AA is sometimes referred to as

Performance Assessment (or Performance-based) -- so-called because students are asked to perform meaningful tasks. This is the other most common term for this type of assessment. Some educators distinguish performance assessment from AA by defining performance assessment as performance-based as Stiggins has above but with no reference to the authentic nature of the task (e.g., Meyer, 1992). For these educators, authentic assessments are performance assessments using real-world or authentic tasks or contexts. Since we should not typically ask students to perform work that is not authentic in nature, I choose to treat these two terms synonymously.

Alternative Assessment -- so-called because AA is an alternative to traditional assessments.

Direct Assessment -- so-called because AA provides more direct evidence of meaningful application of knowledge and skills. If a student does well on a multiple-choice test we might infer indirectly that the student could apply that knowledge in real-world contexts, but we would be more comfortable making that inference from a direct demonstration of that application such as in the golfing example above.

Types of Tests

Norm-Referenced
Standardized tests compare students' performance to that of a norming or sample group who are in the same grade or are of the same age. Students' performance is communicated in percentile ranks, grade-equivalent scores, normal-curve equivalents, scaled scores, or stanine scores.
Examples: Iowa Tests; SAT; DRP; ACT

Criterion-Referenced
A student's performance is measured against a standard. One form of criterion-referenced assessment is the benchmark, a description of a key task that students are expected to perform.
Examples: DIBELS; Chapter tests; Driver's License Test; FCAT (Florida Comprehensive Assessment Test)

Survey
Survey tests typically provide an overview of general comprehension and word knowledge.
Examples: Interest surveys; KWL; Learning Styles Inventory

Diagnostic Tools
Diagnostic tests assess a number of areas in greater depth.
Examples: Woodcock-Johnson; BRI; "The Fox in the Box"

Formal Tests
Formal tests may be standardized. They are designed to be given according to a standard set of circumstances, they have time limits, and they have sets of directions which are to be followed exactly.
Examples: SAT; FCAT; ACT

Informal Tests
Informal tests generally do not have a set of standard directions. They have a great deal of flexibility in how they are administered. They are constructed by teachers and have unknown validity and reliability.
Examples: Review games; Quizzes

Static (Summative) Tests
Measure what the student has learned.
Examples: End-of-chapter tests; Final examinations; Standardized state tests

Dynamic (Formative) Tests
Measure the students' grasp of material that is currently being taught. Can also measure readiness. Formative tests help guide and inform instruction and learning.
Examples: Quizzes; Homework; Portfolios

Lawrence Kohlberg's stages of moral development constitute an adaptation of a psychological theory originally conceived by the Swiss psychologist Jean Piaget. Kohlberg began work on this topic while a psychology graduate student at the University of Chicago[1] in 1958, and expanded and developed this theory throughout his life.
The theory holds that moral reasoning, the basis for ethical behavior, has six identifiable developmental stages, each more adequate at responding to moral dilemmas than its predecessor.[2] Kohlberg followed the development of moral judgment far beyond the ages studied earlier by Piaget,[3] who also claimed that logic and morality develop through constructive stages.[2] Expanding on Piaget's work, Kohlberg determined that the process of moral development was principally concerned with justice, and that it continued throughout the individual's lifetime,[4] a notion that spawned dialogue on the philosophical implications of such research.[5][6]
The six stages of moral development are grouped into three levels: pre-conventional morality, conventional morality, and post-conventional morality.

Stages
Kohlberg's six stages can be more generally grouped into three levels of two stages each: pre-conventional, conventional and post-conventional.[7][8][9] Following Piaget's constructivist requirements for a stage model, as described in his theory of cognitive development, it is extremely rare to regress in stages, that is, to lose the use of higher stage abilities.[14][15] Stages cannot be skipped; each provides a new and necessary perspective, more comprehensive and differentiated than its predecessors but integrated with them.[14][15]

Level 1 (Pre-Conventional)
1. Obedience and punishment orientation
(How can I avoid punishment?)
2. Self-interest orientation
(What's in it for me?)
(Paying for a benefit)
Level 2 (Conventional)
3. Interpersonal accord and conformity
(Social norms)
(The good boy/girl attitude)
4. Authority and social-order maintaining orientation
(Law and order morality)
Level 3 (Post-Conventional)
5. Social contract orientation
6. Universal ethical principles
(Principled conscience)
The understanding gained in each stage is retained in later stages, but may be regarded by those in later stages as simplistic, lacking sufficient attention to detail.

Pre-conventional
The pre-conventional level of moral reasoning is especially common in children, although adults can also exhibit this level of reasoning. Reasoners at this level judge the morality of an action by its direct consequences. The pre-conventional level consists of the first and second stages of moral development, and is solely concerned with the self in an egocentric manner. A child with pre-conventional morality has not yet adopted or internalized society's conventions regarding what is right or wrong, but instead focuses largely on external consequences that certain actions may bring.[7][8][9]
In Stage one (obedience and punishment driven), individuals focus on the direct consequences of their actions on themselves. For example, an action is perceived as morally wrong because the perpetrator is punished. "The last time I did that I got spanked so I will not do it again." The worse the punishment for the act is, the more "bad" the act is perceived to be.[16] This can give rise to an inference that even innocent victims are guilty in proportion to their suffering. It is "egocentric," lacking recognition that others' points of view are different from one's own.[17] There is "deference to superior power or prestige."[17]
An example of obedience and punishment driven morality would be a child refusing to do something because it is wrong and because the consequences could result in punishment. For example, a child's classmate tries to dare the child into playing hooky from school. The child would apply obedience and punishment driven morality by refusing to play hooky because he would get punished. Another example of obedience and punishment driven morality is when a child refuses to cheat on a test because the child would get punished.
Stage two (self-interest driven) expresses the "what's in it for me" position, in which right behavior is defined by whatever the individual believes to be in their best interest but understood in a narrow way which does not consider one's reputation or relationships to groups of people. Stage two reasoning shows a limited interest in the needs of others, but only to a point where it might further the individual's own interests. As a result, concern for others is not based on loyalty or intrinsic respect, but rather a "You scratch my back, and I'll scratch yours" mentality.[2] The lack of a societal perspective in the pre-conventional level is quite different from the social contract (stage five), as all actions have the purpose of serving the individual's own needs or interests. For the stage two theorist, the world's perspective is often seen as moral relativism.

An example of self-interest driven morality is when a child is asked by his parents to do a chore. The child asks "what's in it for me?" The parents would offer the child an incentive by giving the child an allowance to pay them for their chores. The child is motivated to do chores for self-interest. Another example of self-interest driven morality is when a child does their homework in exchange for better grades and rewards from their parents.

Conventional
The conventional level of moral reasoning is typical of adolescents and adults. To reason in a conventional way is to judge the morality of actions by comparing them to society's views and expectations. The conventional level consists of the third and fourth stages of moral development. Conventional morality is characterized by an acceptance of society's conventions concerning right and wrong. At this level an individual obeys rules and follows society's norms even when there are no consequences for obedience or disobedience. Adherence to rules and conventions is somewhat rigid, however, and a rule's appropriateness or fairness is seldom questioned.[7][8][9]
In Stage three (good intentions as determined by social consensus), the self enters society by conforming to social standards. Individuals are receptive to approval or disapproval from others as it reflects society's views. They try to be a "good boy" or "good girl" to live up to these expectations,[2] having learned that being regarded as good benefits the self. Stage three reasoning may judge the morality of an action by evaluating its consequences in terms of a person's relationships, which now begin to include things like respect, gratitude and the "golden rule". "I want to be liked and thought well of; apparently, not being naughty makes people like me." Conforming to the rules for one's social role is not yet fully understood. The intentions of actors play a more significant role in reasoning at this stage; one may feel more forgiving if one thinks, "they mean well..."[2]
In Stage four (authority and social order obedience driven), it is important to obey laws, dictums and social conventions because of their importance in maintaining a functioning society. Moral reasoning in stage four is thus beyond the need for individual approval exhibited in stage three. A central ideal or ideals often prescribe what is right and wrong. If one person violates a law, perhaps everyone would; thus there is an obligation and a duty to uphold laws and rules. When someone does violate a law, it is morally wrong; culpability is thus a significant factor in this stage as it separates the bad domains from the good ones. Most active members of society remain at stage four, where morality is still predominantly dictated by an outside force.[2]

Post-Conventional
The post-conventional level, also known as the principled level, is marked by a growing realization that
individuals are separate entities from society, and that the individual's own perspective may take
precedence over society's view; individuals may disobey rules inconsistent with their own principles.
Post-conventional moralists live by their own ethical principles, principles that typically include such basic
human rights as life, liberty, and justice. People who exhibit post-conventional morality view rules as
useful but changeable mechanisms; ideally, rules can maintain the general social order and protect human
rights. Rules are not absolute dictates that must be obeyed without question. Because post-conventional
individuals elevate their own moral evaluation of a situation over social conventions, their behavior,
especially at stage six, can be confused with that of those at the pre-conventional level.

Some theorists have speculated that many people may never reach this level of abstract moral reasoning.[7][8][9]

In Stage five (social contract driven), the world is viewed as holding different opinions, rights and values.
Such perspectives should be mutually respected as unique to each person or community. Laws are regarded
as social contracts rather than rigid edicts. Those that do not promote the general welfare should be
changed when necessary to meet "the greatest good for the greatest number of people."[8] This is achieved
through majority decision and inevitable compromise. Democratic government is ostensibly based on stage
five reasoning.
In Stage six (universal ethical principles driven), moral reasoning is based on abstract reasoning using
universal ethical principles. Laws are valid only insofar as they are grounded in justice, and a commitment
to justice carries with it an obligation to disobey unjust laws. Legal rights are unnecessary, as social
contracts are not essential for deontic moral action. Decisions are not reached hypothetically in a
conditional way but rather categorically in an absolute way, as in the philosophy of Immanuel Kant.[18] This
involves an individual imagining what they would do in another's shoes, if they believed what that other
person imagines to be true.[19] The resulting consensus is the action taken. In this way action is never a
means but always an end in itself; the individual acts because it is right, and not because it avoids
punishment, is in their best interest, expected, legal, or previously agreed upon. Although Kohlberg insisted
that stage six exists, he found it difficult to identify individuals who consistently operated at that level.[15]
Montessori education is an educational approach developed by Italian physician and educator Maria
Montessori and characterized by an emphasis on independence, freedom within limits, and respect for a
child's natural psychological, physical, and social development. Although a range of practices exists under
the name "Montessori", the Association Montessori Internationale (AMI) and the American Montessori
Society (AMS) cite these elements as essential:[2][3]

Mixed-age classrooms, with classrooms for children ages 2 or 3 to 6 years old by far the most common

Student choice of activity from within a prescribed range of options

Uninterrupted blocks of work time, ideally three hours

A constructivist or "discovery" model, where students learn concepts from working with materials,
rather than by direct instruction

Specialized educational materials developed by Montessori and her collaborators

Freedom of movement within the classroom

A trained Montessori teacher

Montessori education is fundamentally a model of human
development, and an educational approach based on that model. The
model has two basic principles. First, children and developing adults
engage in psychological self-construction by means of interaction with
their environments. Second, children, especially under the age of six,
have an innate path of psychological development. Based on her
observations, Montessori believed that children at liberty to choose
and act freely within an environment prepared according to her model
would act spontaneously for optimal development.

Understanding by Design, or UbD, is a tool utilized for educational planning focused on "teaching for
understanding" advocated by Jay McTighe and Grant Wiggins in their Understanding by Design (1998),
published by the Association for Supervision and Curriculum Development.[1][2] The emphasis of UbD is on
"backward design", the practice of looking at the outcomes in order to design curriculum units,
performance assessments, and classroom instruction.[3]
"Understanding by Design" and "UbD" are registered trademarks of the Association for Supervision and
Curriculum Development ("ASCD"). According to Wiggins, "The potential of UbD for curricular
improvement has struck a chord in American education. Over 250,000 educators own the book. Over
30,000 Handbooks are in use. More than 150 University education classes use the book as a text."[1] As
defined by Wiggins and McTighe, Understanding by Design is a "framework for designing curriculum
units, performance assessments, and instruction that lead your students to deep understanding of the content
you teach."[4] UbD expands on "six facets of understanding", which include students being able to explain,
interpret, apply, have perspective, empathize, and have self-knowledge about a given topic.[5]
Understanding by Design relies on what Wiggins and McTighe call "backward design" (also known as
"backwards planning"). Teachers, according to UbD proponents, traditionally start curriculum planning
with activities and textbooks instead of identifying classroom learning goals and planning towards that
goal. In backward design, the teacher starts with classroom outcomes and then plans the curriculum,
choosing activities and materials that help determine student ability and foster student learning.[6]
The backward design approach is developed in three stages. Stage 1 starts with educators identifying the
desired results for their students by establishing the overall goal of the lessons using content standards,
Common Core or state standards. In addition, UbD's stage 1 defines "Students will understand that..." and
lists essential questions that will guide the learner to understanding. Stage 1 also focuses on identifying
"what students will know" and, most importantly, "what students will be able to do".

Difficulty Index - Teachers produce a difficulty index for a test item by
calculating the proportion of students in class who got an item correct.
(The name of this index is counter-intuitive, as one actually gets a
measure of how easy the item is, not the difficulty of the item.) The
larger the proportion, the more students who have learned the content
measured by the item.

C. Item Analysis

After you create your objective assessment items and give your test, how can
you be sure that the items are appropriate, not too difficult and not too easy?
How will you know if the test effectively differentiates between students who
do well on the overall test and those who do not? An item analysis is a
valuable, yet relatively easy, procedure that teachers can use to answer both of
these questions.
To determine the difficulty level of test items, a measure called the Difficulty
Index is used. This measure asks teachers to calculate the proportion of students
who answered the test item accurately. By looking at each alternative (for
multiple choice), we can also find out if there are answer choices that should be
replaced. For example, let's say you gave a multiple choice quiz and there were
four answer choices (A, B, C, and D). The following table illustrates how many
students selected each answer choice for Question #1 and #2.

Question    A      B      C      D
#1          0      (correct answer chosen by 24*; remaining counts not shown)
#2          12*    13     (remaining counts not shown)

*Denotes correct answer.
For Question #1, we can see that A was not a very good distractor: no one
selected that answer. We can also compute the difficulty of the item by dividing
the number of students who chose the correct answer (24) by the number of
total students (30). Using this formula, the difficulty of Question #1 (referred to
as p) is equal to 24/30 or .80. A rough "rule of thumb" is that if the item
difficulty is more than .75, it is an easy item; if the difficulty is below .25, it is a
difficult item. Given these parameters, this item could be regarded as moderately
easy: most students (80%) got it correct. In contrast, Question #2 is much
more difficult (12/30 = .40). In fact, on Question #2, more students selected an
incorrect answer (B) than selected the correct answer (A). This item should be
carefully analyzed to ensure that B is an appropriate distractor.
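
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names and the default cut-offs (.75 and .25, taken from the rule of thumb in the text) are assumptions, not part of any standard library.

```python
def difficulty_index(num_correct: int, num_students: int) -> float:
    """Proportion of students who answered the item correctly (p)."""
    if num_students <= 0:
        raise ValueError("num_students must be positive")
    return num_correct / num_students


def classify_item(p: float, easy_above: float = 0.75, hard_below: float = 0.25) -> str:
    """Apply the rough rule of thumb: p > .75 is easy, p < .25 is difficult."""
    if p > easy_above:
        return "easy"
    if p < hard_below:
        return "difficult"
    return "moderate"


p1 = difficulty_index(24, 30)  # Question #1: 24 of 30 students correct
p2 = difficulty_index(12, 30)  # Question #2: 12 of 30 students correct
print(f"Question #1: p = {p1:.2f} ({classify_item(p1)})")
print(f"Question #2: p = {p2:.2f} ({classify_item(p2)})")
```

Note that a larger p means an easier item, which is why the index is sometimes described as measuring easiness rather than difficulty.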

Another measure, the Discrimination Index, refers to how well an assessment
differentiates between high and low scorers. In other words, you should be able
to expect that the high-performing students would select the correct answer for
each question more often than the low-performing students. If this is true, then
the assessment is said to have a positive discrimination index (between 0 and 1),
indicating that students who received a high total score chose the correct
answer for a specific item more often than the students who had a lower overall
score. If, however, you find that more of the low-performing students got a
specific item correct, then the item has a negative discrimination index
(between -1 and 0). Let's look at an example.

Table 2 displays the results of ten questions on a quiz. Note that the students are
arranged with the top overall scorers at the top of the table.

Student    Total Score (%)
Asif       90
Sam        90
Jill       80
Charlie    80
Sonya      70
Ruben      60
Clay       60
Kelley     50
Justin     50
Tonya      40

(The per-question columns of Table 2 are not reproduced here.)
"1" indicates the answer was correct; "0" indicates it was incorrect.
Follow these steps to determine the Difficulty Index and the Discrimination
Index.

1. After the students are arranged with the highest overall scores at the top,
   count the number of students in the upper and lower group who got each item
   correct. For Question #1, there were 4 students in the top half who got it
   correct, and 4 students in the bottom half.
2. Determine the Difficulty Index by dividing the number who got it correct by
   the total number of students. For Question #1, this would be 8/10 or p=.80.
3. Determine the Discrimination Index by subtracting the number of students in
   the lower group who got the item correct from the number of students in the
   upper group who got the item correct. Then, divide by the number of students
   in each group (in this case, there are five in each group). For Question #1,
   that means you would subtract 4 from 4, and divide by 5, which results in a
   Discrimination Index of 0.
4. The answers for Questions 1-3 are provided in Table 2.
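
The steps above can be sketched as a short Python function. The response list below is hypothetical: it is only chosen to match the counts quoted for Question #1 (4 correct in the top half, 4 correct in the bottom half), and the function name is an assumption for illustration.

```python
def item_stats(responses: list[int]) -> tuple[float, float]:
    """Return (difficulty index, discrimination index) for one item.

    `responses` holds 1 (correct) or 0 (incorrect) per student, ordered from
    the highest to the lowest total score. The list is split into an upper and
    a lower half; with an odd count the middle student is left out.
    """
    half = len(responses) // 2
    if half == 0:
        raise ValueError("need at least two students")
    upper_correct = sum(responses[:half])
    lower_correct = sum(responses[-half:])
    difficulty = sum(responses) / len(responses)
    discrimination = (upper_correct - lower_correct) / half
    return difficulty, discrimination


# Hypothetical answers to Question #1, top scorer first.
question_1 = [1, 1, 0, 1, 1, 1, 1, 1, 1, 0]
p, d = item_stats(question_1)
print(f"Difficulty Index p = {p:.2f}, Discrimination Index = {d:.2f}")
```

A Discrimination Index of 0 means the item does not separate the two groups at all, even though most students answered it correctly.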

A. Bloom's Taxonomy
Questions (items) on quizzes and exams can demand different levels of
thinking skills. For example, some questions might be simple
memorization of facts, and others might require the ability to
synthesize information from several sources to select or construct a
response. Benjamin Bloom created a hierarchy of cognitive skills
(called Bloom's taxonomy) that is often used to categorize the levels
of cognitive involvement (thinking skills) in educational settings. The
taxonomy provides a good structure to assist teachers in writing
objectives and assessments. It can be divided into two levels -- Level I
(the lower level) contains knowledge and comprehension;
Level II (the higher level) includes application, analysis, synthesis, and
evaluation (see the diagram below).

Figure 1. Bloom's Taxonomy.
Bloom's taxonomy is also used to guide the development of standardized assessments. For example, in
Florida, about 65% of the questions on the statewide reading test (FCAT) are designed to measure Level II
thinking skills (application, analysis, synthesis, and evaluation). To prepare students for these standardized
tests, classroom assessments must also demand both Level I and II thinking skills. Integrating higher-level
skills into instruction and assessment increases the likelihood that students will succeed on tests and
become better problem solvers.

Sometimes objective tests (such as multiple choice) are criticized because the questions emphasize only
lower-level thinking skills (such as knowledge and comprehension). However, it is possible to address
higher-level thinking skills via objective assessments by including items that focus on genuine
understanding: "how" and "why" questions. Multiple choice items that involve scenarios, case studies,
and analogies are also effective for requiring students to apply, analyze, synthesize, and evaluate
information.

B. Writing Selected Response Assessment Items


Selected response (objective) assessment items are very efficient: once the items are created, you can
assess and score a great deal of content rather quickly. Note that the term objective refers to the fact that
each question has a right and wrong answer and that they can be impartially scored. In fact, the scoring can
be automated if you have access to an optical scanner for scoring paper tests or a computer for
computerized tests. However, the construction of these "objective" items might well include subjective
input by the teacher/creator.

Before you write the assessment items, you should create a blueprint that outlines the content areas and the
cognitive skills you are targeting. One way to do this is to list your instructional objectives, along with the
corresponding cognitive level. For example, the following table has four different objectives and the
corresponding levels of assessment (relative to Bloom's taxonomy). For each objective, five assessment
items will be written, some at Level I and some at Level II. This approach helps to ensure that all objectives
are covered and that several higher-level thinking skills are included in the assessment.

Objective | Number of Items at Level I (Bloom's Taxonomy) | Number of Items at Level II (Bloom's Taxonomy)

After you have determined how many items you need for each level, you can begin writing the
assessments. There are several forms of selected response assessments, including multiple choice,
matching, and true/false. Regardless of the form you select, be sure the items are clearly worded at the
appropriate reading level and do not include unintentional clues. The validity of your test will suffer
tremendously if the students can't comprehend or read the questions! This section includes a few guidelines
for constructing objective assessment items, along with examples and non-examples.
Multiple Choice
Multiple choice questions consist of a stem (question or statement) with several answer choices
(one correct answer plus distractors). Follow these guidelines:

All answer choices should be plausible and homogeneous.

Answer choices should be similar in length and grammatical form.

List answer choices in logical (alphabetical or numerical) order.

Avoid using "All of the Above" options.

Matching
Matching items consist of two lists of words, phrases, or images (often referred to as stems and responses).
Students review the list of stems and match each with a word, phrase, or image from the list of responses.
Follow these guidelines:

Answer choices should be short, homogeneous and arranged in logical order.

Responses should be plausible and similar in length and grammatical form.

Include more response options than stems.

As a general rule, the stems should be longer and the responses should be shorter.

True/False
True/false questions can appear to be easier to write; however, it is difficult to write effective true/false
questions. Also, the reliability of T/F questions is not generally very high because of the high possibility of
guessing. In most cases, T/F questions are not recommended.

Statements should be completely true or completely false.

Use simple, easy-to-follow statements.

Avoid using negatives -- especially double negatives.

Avoid absolutes such as "always; never."

Step 9. Conduct the Item Analysis

Introduction
The item analysis is an important phase in the development of an exam program. In this phase statistical
methods are used to identify any test items that are not working well. If an item is too easy, too difficult,
failing to show a difference between skilled and unskilled examinees, or even scored incorrectly, an item
analysis will reveal it. The two most common statistics reported in an item analysis are the item difficulty,
which is a measure of the proportion of examinees who responded to an item correctly, and the item
discrimination, which is a measure of how well the item discriminates between examinees who are
knowledgeable in the content area and those who are not. An additional analysis that is often reported is
the distractor analysis. The distractor analysis provides a measure of how well each of the incorrect
options contributes to the quality of a multiple choice item. Once the item analysis information is available,
an item review is often conducted.

Item Analysis Statistics


Item Difficulty Index
The item difficulty index is one of the most useful, and most frequently reported, item analysis statistics. It
is a measure of the proportion of examinees who answered the item correctly; for this reason it is
frequently called the p-value. As the proportion of examinees who got the item right, the p-value might
more properly be called the item easiness index, rather than the item difficulty. It can range between 0.0
and 1.0, with a higher value indicating that a greater proportion of examinees responded to the item
correctly, and it was thus an easier item. For criterion-referenced tests (CRTs), with their emphasis on
mastery testing, many items on an exam form will have p-values of .9 or above. Norm-referenced tests
(NRTs), on the other hand, are designed to be harder overall and to spread out the examinees' scores. Thus,
many of the items on an NRT will have difficulty indexes between .4 and .6.

Item Discrimination Index
The item discrimination index is a measure of how well an item is able to distinguish between examinees
who are knowledgeable and those who are not, or between masters and non-masters. There are actually
several ways to compute an item discrimination, but one of the most common is the point-biserial
correlation. This statistic looks at the relationship between an examinee's performance on the given item
(correct or incorrect) and the examinee's score on the overall test. For an item that is highly discriminating,
in general the examinees who responded to the item correctly also did well on the test, while in general the
examinees who responded to the item incorrectly also tended to do poorly on the overall test.
The possible range of the discrimination index is -1.0 to 1.0; however, if an item has a discrimination
below 0.0, it suggests a problem. When an item is discriminating negatively, overall the most
knowledgeable examinees are getting the item wrong and the least knowledgeable examinees are getting
the item right. A negative discrimination index may indicate that the item is measuring something other
than what the rest of the test is measuring. More often, it is a sign that the item has been mis-keyed.
When interpreting the value of a discrimination it is important to be aware that there is a relationship
between an item's difficulty index and its discrimination index. If an item has a very high (or very low)
p-value, the potential value of the discrimination index will be much less than if the item has a mid-range
p-value. In other words, if an item is either very easy or very hard, it is not likely to be very discriminating. A
typical CRT, with many high item p-values, may have most item discriminations in the range of 0.0 to 0.3.
A useful approach when reviewing a set of item discrimination indexes is to also view each item's p-value
at the same time. For example, if a given item has a discrimination index below .1, but the item's p-value is
greater than .9, you may interpret the item as being easy for almost the entire set of examinees, and
probably for that reason not providing much discrimination between high-ability and low-ability examinees.
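
As a rough sketch of the point-biserial correlation mentioned above: it is the ordinary Pearson correlation between a 0/1 item score and the total test score. The function name and the item responses below are hypothetical, chosen only for illustration; operational programs often also remove the item from the total score before correlating, which this sketch does not do.

```python
import math


def point_biserial(item: list[int], totals: list[float]) -> float:
    """Pearson correlation between a dichotomous item (0/1) and total scores."""
    n = len(item)
    if n != len(totals) or n < 2:
        raise ValueError("item and totals must have the same length (>= 2)")
    mean_item = sum(item) / n
    mean_total = sum(totals) / n
    covariance = sum((i - mean_item) * (t - mean_total) for i, t in zip(item, totals))
    ss_item = sum((i - mean_item) ** 2 for i in item)
    ss_total = sum((t - mean_total) ** 2 for t in totals)
    if ss_item == 0 or ss_total == 0:
        raise ValueError("undefined when all item scores or all totals are identical")
    return covariance / math.sqrt(ss_item * ss_total)


# Hypothetical data: ten examinees, total scores in percent.
totals = [90, 90, 80, 80, 70, 60, 60, 50, 50, 40]
good_item = [1, 1, 1, 1, 1, 0, 1, 0, 0, 0]        # high scorers tend to be correct
suspect_item = [1 - score for score in good_item]  # same item scored with a wrong key

print(f"good item:    r = {point_biserial(good_item, totals):+.2f}")
print(f"suspect item: r = {point_biserial(suspect_item, totals):+.2f}")
```

Reversing the scoring flips the sign of the correlation, which is why a clearly negative value is a common symptom of a mis-keyed item.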
Distractor Analysis
One important element in the quality of a multiple choice item is the quality of the item's distractors.
However, neither the item difficulty nor the item discrimination index considers the performance of the
incorrect response options, or distractors. A distractor analysis addresses the performance of these incorrect
response options.
Just as the key, or correct response option, must be definitively correct, the distractors must be clearly
incorrect (or clearly not the "best" option). In addition to being clearly incorrect, the distractors must also
be plausible. That is, the distractors should seem likely or reasonable to an examinee who is not sufficiently
knowledgeable in the content area. If a distractor appears so unlikely that almost no examinee will select it,
it is not contributing to the performance of the item. In fact, the presence of one or more implausible
distractors in a multiple choice item can make the item artificially far easier than it ought to be.
In a simple approach to distractor analysis, the proportion of examinees who selected each of the response
options is examined. For the key, this proportion is equivalent to the item p-value, or difficulty. If the
proportions are summed across all of an item's response options they will add up to 1.0, or 100% of the
examinees' selections.
The proportion of examinees who select each of the distractors can be very informative. For example, it can
reveal an item mis-key. Whenever the proportion of examinees who selected a distractor is greater than the
proportion of examinees who selected the key, the item should be examined to determine if it has been
mis-keyed or double-keyed. A distractor analysis can also reveal an implausible distractor. In CRTs, where the
item p-values are typically high, the proportions of examinees selecting all the distractors are, as a result,
low. Nevertheless, if examinees consistently fail to select a given distractor, this may be evidence that the
distractor is implausible or simply too easy.
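
The simple approach described above might be sketched as follows. The function name and the returned dictionary layout are assumptions for illustration, and the response counts are hypothetical (30 examinees, with "A" as the key).

```python
def distractor_analysis(counts: dict[str, int], key: str) -> dict:
    """Summarize how often each response option was selected.

    Flags distractors chosen more often than the key (possible mis-key or
    double-key) and distractors that nobody selected (possibly implausible).
    """
    total = sum(counts.values())
    if total == 0 or key not in counts:
        raise ValueError("need at least one response and a key present in counts")
    proportions = {option: n / total for option, n in counts.items()}
    p_value = proportions[key]
    return {
        "proportions": proportions,
        "p_value": p_value,
        "possible_miskey": [o for o, p in proportions.items() if o != key and p > p_value],
        "unselected": [o for o, n in counts.items() if o != key and n == 0],
    }


# Hypothetical counts for one item answered by 30 examinees.
result = distractor_analysis({"A": 12, "B": 13, "C": 3, "D": 2}, key="A")
for option, proportion in result["proportions"].items():
    print(f"{option}: {proportion:.2f}")
print("check key against:", result["possible_miskey"])
print("never selected:", result["unselected"])
```

Here option B draws more examinees than the key, so the item would be flagged for a closer look at both the key and the wording of B.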
Item Review
Once the item analysis data are available, it is useful to hold a meeting of test developers,
psychometricians, and subject matter experts. During this meeting the items can be reviewed using the
information provided by the item analysis statistics. Decisions can then be made about item changes that
are needed or even items that ought to be dropped from the exam. Any item that has been substantially
changed should be returned to the bank for pretesting before it is again used operationally. Once these
decisions have been made, the exams should be rescored, leaving out any items that were dropped and
using the correct key for any items that were found to have been mis-keyed. This corrected scoring will be
used for the examinees' score reports.
Summary
In the item analysis phase of test development, statistical methods are used to identify potential item
problems. The statistical results should be used along with substantive attention to the item content to
determine if a problem exists and what should be done to correct it. Items that are functioning very poorly
should usually be removed from consideration and the exams rescored before the test results are released.
In other cases, items may still be usable, after modest changes are made to improve their performance on
future exams.

In statistics, a bimodal distribution is a continuous probability
distribution with two different modes. These appear as distinct peaks
(local maxima) in the probability density function, as shown in Figure 1.
How to Compute Mean, Median, Mode, Range, and Standard Deviation
In statistics and data analysis, the mean, median, mode, range, and standard deviation tell researchers how
the data is distributed. Each of the five measures can be calculated with simple arithmetic. The mean and
median indicate the "center" of the data points. The mode is the value or values that occur most frequently.
Range is the span between the smallest value and largest value. Standard deviation measures how far the
data "deviates" from the center, on average. Knowing how to calculate these statistical measures will help
you analyze data from surveys and experiments.

Mean
The arithmetic mean or average of a set of numbers is the expected
value. The mean is calculated by adding up all the values, and then
dividing that sum by the number of values.
For example, suppose a teacher has seven students and records the
following seven test scores for her class: 98, 96, 96, 84, 81, 81, and 73.
The average test score is
(98+96+96+84+81+81+73)/7 = 609/7 = 87.
If one more student entered her class and took the test, the expected
score would be an 87.
Median
The median is the middle value in a set of values. To find the median,
order the numbers from largest to smallest, and then choose the value
in the middle. For example, consider the following set of nine numbers:
10, 13, 4, 25, 8, 12, 9, 19, 18
If we arrange them in descending order, we get
25, 19, 18, 13, 12, 10, 9, 8, 4
The middle value is 12, so the median = 12. What if we have a set with
an even number of values? For example, consider the set
1, 2, 3, 4, 5, 6.

Both 3 and 4 are in the middle. In this case, we must take the average
of the two middle numbers. Since (3+4)/2 = 3.5, the median = 3.5.
Mode
The mode of a set is the value or values that occur most frequently.
There can be more than one mode in a set. If there is more than one
mode, you simply list all of the modes; you do not have to average
them. For example, consider the set
10, 10, 4, 8, 10, 8, 3, 9, 14
The number 10 occurs three times, and no other numbers occur as
frequently. Therefore, the mode = 10.
Now consider this set
10, 10, 4, 8, 10, 8, 3, 8, 14
Both 10 and 8 occur three times each, and no other numbers occur as
often. Therefore, the modes are 8 and 10.
Range
The range of a set of numbers is the maximum distance between any
two values. In other words, it's the difference between the largest and
smallest values. Knowing the range gives you an idea of how close
together the data points are. For example, consider the set of test
scores
78, 88, 67, 90, 92, 83, 97
The highest test score is 97 and the lowest is 67, therefore the range is
97-67 = 30.
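
The four measures above can be checked with Python's standard `statistics` module. This is only a sketch: the variable names are arbitrary, and the test scores for the mean are the ones used in the worked sum (98+96+96+84+81+81+73).

```python
import statistics

# Mean: add the values and divide by how many there are.
class_scores = [98, 96, 96, 84, 81, 81, 73]
mean_score = statistics.mean(class_scores)

# Median: middle value of the sorted data (average of the two middle values
# when the count is even).
median_odd = statistics.median([10, 13, 4, 25, 8, 12, 9, 19, 18])
median_even = statistics.median([1, 2, 3, 4, 5, 6])

# Mode: multimode() returns every value tied for the highest frequency.
one_mode = statistics.multimode([10, 10, 4, 8, 10, 8, 3, 9, 14])
two_modes = sorted(statistics.multimode([10, 10, 4, 8, 10, 8, 3, 8, 14]))

# Range: largest value minus smallest value.
test_scores = [78, 88, 67, 90, 92, 83, 97]
score_range = max(test_scores) - min(test_scores)

print("mean:", mean_score)
print("medians:", median_odd, median_even)
print("modes:", one_mode, two_modes)
print("range:", score_range)
```

`statistics.multimode` requires Python 3.8 or later; on older versions a `collections.Counter` gives the same information.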
Standard Deviation
The standard deviation is another way to measure how close together
the elements are in a set of data. The s.d. is, roughly, the typical distance
between each data point and the mean (more precisely, the square root of
the average squared distance). Knowing the standard
deviation gives a more complete picture of the distribution of elements
in a data set. Suppose you have N data points and you label them $X_1, X_2,
X_3, \ldots, X_N$, and you call the mean $\mu$. There are two formulas for
standard deviation depending on whether your data is a complete set,
or a sample taken from a larger set.
For example, suppose your data is all of the ACT scores of the students
in a small class. Then the standard deviation formula is

$$\text{s.d.} = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N}}$$

Suppose the scores are 15, 21, 21, 21, 25, 30, and 35. The mean of
this set is 24. The s.d. is
sqrt[((15-24)^2+(21-24)^2+(21-24)^2+(21-24)^2+(25-24)^2+(30-24)^2+(35-24)^2)/7]
= sqrt[266/7]
= sqrt[38]
= 6.16
If you take a random sample of ACT scores from a large school, the
standard deviation formula is

$$\text{s.d.} = \sqrt{\frac{\sum_{i=1}^{N} (X_i - \mu)^2}{N - 1}}$$

For example, suppose you select ten students at random from a high
school, and their ACT scores are 17, 20, 24, 25, 26, 26, 29, 29, 30 and
32. The average of this set is 25.8. The standard deviation is
sqrt[((17-25.8)^2+(20-25.8)^2+(24-25.8)^2+(25-25.8)^2+(26-25.8)^2
+(26-25.8)^2+(29-25.8)^2+(29-25.8)^2+(30-25.8)^2+(32-25.8)^2)/(10-1)]
= sqrt[191.6/(10-1)]
= sqrt[191.6/9]
= sqrt[21.2889]
= 4.61
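
The difference between the two cases (dividing by N for a complete set, by N-1 for a sample) can be sketched in Python as below. The function names are made up for illustration; the standard library's `statistics.pstdev` and `statistics.stdev` compute the same two quantities and are used here only as a cross-check.

```python
import math
import statistics


def population_sd(values: list[float]) -> float:
    """Standard deviation of a complete data set: divide by N."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))


def sample_sd(values: list[float]) -> float:
    """Standard deviation of a sample from a larger set: divide by N - 1."""
    if len(values) < 2:
        raise ValueError("a sample needs at least two values")
    mean = sum(values) / len(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (len(values) - 1))


small_class = [15, 21, 21, 21, 25, 30, 35]                 # every student in the class
random_sample = [17, 20, 24, 25, 26, 26, 29, 29, 30, 32]   # ten students from a school

print(f"population s.d. = {population_sd(small_class):.2f}")
print(f"sample s.d.     = {sample_sd(random_sample):.2f}")

# Cross-check against the standard library.
assert math.isclose(population_sd(small_class), statistics.pstdev(small_class))
assert math.isclose(sample_sd(random_sample), statistics.stdev(random_sample))
```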

Mean, Mode, Median, and Standard Deviation

The Mean and Mode
The sample mean is the average and is computed as the sum of all the observed outcomes from the sample
divided by the total number of events. We use $\bar{x}$ as the symbol for the sample mean. In math terms,

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

where n is the sample size and the $x_i$ correspond to the observed values.

Example
Suppose you randomly sampled six acres in the Desolation Wilderness for a non-indigenous weed and
came up with the following counts of this weed in this region:

34, 43, 81, 106, 106 and 115
We compute the sample mean by adding and dividing by the number of samples, 6.

$$\frac{34 + 43 + 81 + 106 + 106 + 115}{6} = 80.83$$

We can say that the sample mean of non-indigenous weed is 80.83.
The mode of a set of data is the number with the highest frequency. In the above example 106 is the mode,
since it occurs twice and the rest of the outcomes occur only once.
The population mean is the average of the entire population and is usually impossible to compute. We use
the Greek letter $\mu$ for the population mean.

Median and Trimmed Mean
One problem with using the mean is that it often does not depict the typical outcome. If there is one
outcome that is very far from the rest of the data, then the mean will be strongly affected by this outcome.
Such an outcome is called an outlier. An alternative measure is the median. The median is the middle
score. If we have an even number of events we take the average of the two middles. The median is better
for describing the typical value. It is often used for income and home prices.
Example
Suppose you randomly selected 10 house prices in the South Lake Tahoe area. You are interested in the
typical house price. In $100,000 the prices were

2.7, 2.9, 3.1, 3.4, 3.7, 4.1, 4.3, 4.7, 4.7, 40.8

If we computed the mean, we would say that the average house price is $744,000. Although this number is
true, it does not reflect the price for available housing in South Lake Tahoe. A closer look at the data
shows that the house valued at 40.8 x $100,000 = $4.08 million skews the data. Instead, we use the
median. Since there is an even number of outcomes, we take the average of the middle two

$$\frac{3.7 + 4.1}{2} = 3.9$$

The median house price is $390,000. This better reflects what house shoppers should expect to spend.

There is an alternative value that is also resistant to outliers. This is called the trimmed mean, which is the
mean after getting rid of the outliers, or 5% on the top and 5% on the bottom. We can also use the trimmed
mean if we are concerned with outliers skewing the data; however, the median is used more often since
more people understand it.
Example:
At a ski rental shop data was collected on the number of rentals on each of ten consecutive Saturdays:

44, 50, 38, 96, 42, 47, 40, 39, 46, 50.

To find the sample mean, add them and divide by 10:

$$\frac{44 + 50 + 38 + 96 + 42 + 47 + 40 + 39 + 46 + 50}{10} = 49.2$$

Notice that the mean value is not a value of the sample.
To find the median, first sort the data:

38, 39, 40, 42, 44, 46, 47, 50, 50, 96
Notice that there are two middle numbers, 44 and 46. To find the median we take the average of the two.

$$\text{Median} = \frac{44 + 46}{2} = 45$$

Notice also that the mean is larger than all but three of the data points. The mean is influenced by outliers
while the median is robust.
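
To see the trimmed mean next to the mean and median, here is a minimal Python sketch on the ski rental data. The `trimmed_mean` helper is hypothetical, and trimming 10% from each end (one value per side for ten data points) is an assumption made so that something is actually removed; with only ten values, a 5% trim would drop nothing.

```python
import statistics


def trimmed_mean(data: list[float], proportion: float) -> float:
    """Mean after dropping `proportion` of the values from each end."""
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(data)
    k = int(len(ordered) * proportion)  # number of values cut from each end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


rentals = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]

print("mean:        ", statistics.mean(rentals))
print("median:      ", statistics.median(rentals))
print("trimmed mean:", trimmed_mean(rentals, 0.10))
```

Dropping the single extreme Saturday (96 rentals) along with the smallest value pulls the trimmed mean close to the median, which is the behavior the text describes as resistance to outliers.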

Variance, Standard Deviation and Coefficient of Variation

The mean, mode, median, and trimmed mean do a nice job in telling where the center of the data set is, but
often we are interested in more. For example, a pharmaceutical engineer develops a new drug that
regulates sugar in the blood. Suppose she finds out that the average sugar content after taking the
medication is the optimal level. This does not mean that the drug is effective. There is a possibility that
half of the patients have dangerously low sugar content while the other half have dangerously high content.
Instead of the drug being an effective regulator, it is a deadly poison. What the engineer needs is a
measure of how far the data is spread apart. This is what the variance and standard deviation do. First we
show the formulas for these measurements. Then we will go through the steps on how to use the formulas.
We define the variance to be

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

and the standard deviation to be

$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
Variance and Standard Deviation: Step by Step
1. Calculate the mean, $\bar{x}$.
2. Write a table that subtracts the mean from each observed value.
3. Square each of the differences.
4. Add this column.
5. Divide by n - 1, where n is the number of items in the sample. This is the variance.
6. To get the standard deviation we take the square root of the variance.

Example
The owner of the Ches Tahoe restaurant is interested in how much people spend at the restaurant. He
examines 10 randomly selected receipts for parties of four and writes down the following data.

44, 50, 38, 96, 42, 47, 40, 39, 46, 50
He calculated the mean by adding and dividing by 10 to get

$\bar{x}$ = 49.2
Below is the table for getting the standard deviation:

x       x - 49.2    (x - 49.2)^2
44      -5.2        27.04
50       0.8         0.64
38      -11.2       125.44
96       46.8       2190.24
42      -7.2        51.84
47      -2.2         4.84
40      -9.2        84.64
39      -10.2       104.04
46      -3.2        10.24
50       0.8         0.64
Total               2599.6

Now

2599.6 / (10 - 1) = 288.8

Hence the variance is approximately 289 and the standard deviation is the square root of 289 = 17.
Since the standard deviation can be thought of as measuring how far the data values lie from the mean, we
take the mean and move one standard deviation in either direction. The mean for this example was about
49.2 and the standard deviation was 17. We have:

49.2 - 17 = 32.2

and

49.2 + 17 = 66.2

What this means is that most of the patrons probably spend between $32.20 and $66.20.
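
The step-by-step procedure can be followed literally in Python. This is a sketch on the restaurant receipts, with arbitrary variable names; each line is labeled with the step it carries out.

```python
import math

receipts = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]

n = len(receipts)
mean = sum(receipts) / n                      # step 1: the mean
differences = [x - mean for x in receipts]    # step 2: subtract the mean
squares = [d ** 2 for d in differences]       # step 3: square each difference
total = sum(squares)                          # step 4: add the column
variance = total / (n - 1)                    # step 5: divide by n - 1
std_dev = math.sqrt(variance)                 # step 6: square root

print("  x   x-mean   (x-mean)^2")
for x, d, sq in zip(receipts, differences, squares):
    print(f"{x:>3} {d:>8.1f} {sq:>12.2f}")
print(f"total = {total:.1f}, variance = {variance:.1f}, s = {std_dev:.1f}")
print(f"one standard deviation around the mean: "
      f"{mean - std_dev:.1f} to {mean + std_dev:.1f}")
```

Keeping the intermediate columns as separate lists mirrors the hand-written table, which makes it easy to compare each row against a manual calculation.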


The sample standard deviation will be denoted by s and the population standard deviation will be denoted
by the Greek letter $\sigma$.
The sample variance will be denoted by $s^2$ and the population variance will be denoted by $\sigma^2$.
The variance and standard deviation describe how spread out the data is. If the data all lies close to the
mean, then the standard deviation will be small, while if the data is spread out over a large range of values,
s will be large. Having outliers will increase the standard deviation.
One of the flaws involved with the standard deviation is that it depends on the units that are used. One
way of handling this difficulty is the coefficient of variation, which is the standard deviation divided
by the mean, times 100%:

$$CV = \frac{\sigma}{\mu} \cdot 100\%$$

In the above example, using the sample values s = 17 and $\bar{x}$ = 49.2, it is

$$\frac{17}{49.2} \cdot 100\% = 34.6\%$$

This tells us that the standard deviation of the restaurant bills is 34.6% of the mean.
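
As a small illustration of why the coefficient of variation is unit-free, the sketch below computes it for the restaurant example and again after converting the same figures from dollars to cents. The function name is an assumption for illustration.

```python
def coefficient_of_variation(std_dev: float, mean: float) -> float:
    """Standard deviation expressed as a percentage of the mean."""
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return std_dev / mean * 100


cv_dollars = coefficient_of_variation(17, 49.2)
cv_cents = coefficient_of_variation(1700, 4920)  # same data expressed in cents

print(f"CV in dollars: {cv_dollars:.1f}%")
print(f"CV in cents:   {cv_cents:.1f}%")
```

The standard deviation itself changes from 17 to 1700 when the unit changes, but the ratio to the mean stays the same.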

Chebyshev's Theorem
A mathematician named Chebyshev came up with bounds on how much of the data must lie close to the
mean. In particular, for any positive k (the bound is informative only when k > 1), the proportion of the
data that lies within k standard deviations of the mean is at least

$$1 - \frac{1}{k^2}$$

For example, if k = 2 this number is

$$1 - \frac{1}{2^2} = .75$$

This tells us that at least 75% of the data lies within 2 standard deviations of the mean. In the above
example, we can say that at least 75% of the diners spent between

49.2 - 2(17) = 15.2
and

49.2 + 2(17) = 83.2
dollars.
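
Chebyshev's bound can be compared with what actually happens in the restaurant data. The helper names below are hypothetical, and the mean and standard deviation are the rounded values from the example (49.2 and 17).

```python
def chebyshev_bound(k: float) -> float:
    """Minimum proportion of data within k standard deviations of the mean."""
    if k <= 0:
        raise ValueError("k must be positive")
    return max(0.0, 1 - 1 / k ** 2)


def fraction_within(data: list[float], mean: float, std_dev: float, k: float) -> float:
    """Observed proportion of data within k standard deviations of the mean."""
    low, high = mean - k * std_dev, mean + k * std_dev
    return sum(low <= x <= high for x in data) / len(data)


receipts = [44, 50, 38, 96, 42, 47, 40, 39, 46, 50]
guaranteed = chebyshev_bound(2)
observed = fraction_within(receipts, mean=49.2, std_dev=17, k=2)

print(f"Chebyshev guarantees at least {guaranteed:.0%}")
print(f"Observed in the sample:       {observed:.0%}")
```

The theorem gives only a lower bound that holds for any distribution, so the observed proportion is typically higher than the guaranteed one.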

Skewed Data
Datacanbe"skewed",meaningittendstohavealongtailononesideortheother:

Negative Skew

No Skew

Positive Skew

Negative Skew?
Whyisitcallednegativeskew?Becausethe
long"tail"isonthenegativesideofthepeak.
Peoplesometimessayitis"skewedtothe
left"(thelongtailisonthelefthandside)
Themeanisalsoontheleftofthepeak.

The Normal
Distribution has No
Skew
ANormalDistributionisnot
skewed.
Itisperfectlysymmetrical.
AndtheMeanisexactlyatthe
peak.

Positive Skew
Andpositiveskewiswhenthelongtail
isonthepositivesideofthepeak,and
somepeoplesayitis"skewedtothe
right".
Themeanisontherightofthepeak
value.

Example:
Income
Distribution
HereissomedataI
extractedfroma
recentCensus.
Asyoucanseeitis
positivelyskewed...
infactthetail
continueswaypast
$100,000

Calculating Skewness
"Skewness" (the amount of skew) can be calculated; for example, you could use the SKEW() function in
Excel or OpenOffice Calc.
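The text does not give a formula for skewness. The sketch below assumes the adjusted Fisher-Pearson sample skewness, which is the formula Excel documents for SKEW(); the function name and the sample data are hypothetical and only meant to show the sign of the result.

```python
import statistics


def sample_skewness(data):
    """Adjusted Fisher-Pearson sample skewness (assumed to match Excel's SKEW)."""
    n = len(data)
    if n < 3:
        raise ValueError("at least three data points are required")
    mean = statistics.fmean(data)
    s = statistics.stdev(data)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return n / ((n - 1) * (n - 2)) * sum(((x - mean) / s) ** 3 for x in data)


# Hypothetical data sets, chosen only to illustrate the three cases.
symmetric = [1, 2, 3, 4, 5]
long_right_tail = [1, 1, 2, 2, 3, 10]
long_left_tail = [-x for x in long_right_tail]

print(sample_skewness(symmetric))        # approximately 0: no skew
print(sample_skewness(long_right_tail))  # positive: skewed to the right
print(sample_skewness(long_left_tail))   # negative: skewed to the left
```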

Normal Distribution
The normal distribution is a probability distribution that associates the normal random variable X
with a cumulative probability. The normal distribution is defined by the following equation:

Normal equation. The value of the random variable Y is:

$$Y = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-(x - \mu)^2 / (2\sigma^2)}$$

where X is a normal random variable, $\mu$ is the mean, $\sigma$ is the standard deviation, $\pi$ is
approximately 3.14159, and $e$ is approximately 2.71828.
The graph of the normal distribution depends on two factors: the mean and the standard deviation. The
mean of the distribution determines the location of the center of the graph, and the standard
deviation determines the height and width of the graph. When the standard deviation is large, the
curve is short and wide; when the standard deviation is small, the curve is tall and narrow. All
normal distributions look like a symmetric, bell-shaped curve, as shown below.

The curve on the left is shorter and wider than the curve on the right, because the curve on the left
has a bigger standard deviation.
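To make the effect of the standard deviation concrete, here is a minimal Python sketch of the normal equation. The function name `normal_pdf` and the chosen means and standard deviations are assumptions for illustration, not values from the text.

```python
import math


def normal_pdf(x, mu, sigma):
    """Height Y of the normal curve at x, for mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coefficient = 1 / (sigma * math.sqrt(2 * math.pi))
    exponent = -((x - mu) ** 2) / (2 * sigma**2)
    return coefficient * math.exp(exponent)


# Two hypothetical curves with the same mean but different standard deviations.
peak_narrow = normal_pdf(0, mu=0, sigma=1)  # about 0.3989
peak_wide = normal_pdf(0, mu=0, sigma=2)    # half the height of the sigma = 1 curve

print(f"peak with sigma = 1: {peak_narrow:.4f}")
print(f"peak with sigma = 2: {peak_wide:.4f}")
```

Doubling the standard deviation halves the peak height, which matches the description of a shorter, wider curve; the curve is also symmetric, so `normal_pdf(1, 0, 1)` equals `normal_pdf(-1, 0, 1)`.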
The Rorschach test (also known as the Rorschach inkblot test, the Rorschach technique, or simply the
inkblot test) is a psychological test in which subjects' perceptions of inkblots are recorded and then
analyzed using psychological interpretation, complex algorithms, or both. Some psychologists use this
test to examine a person's personality characteristics and emotional functioning. It has been employed
to detect underlying thought disorder, especially in cases where patients are reluctant to describe
their thinking processes openly.[4] The test is named after its creator, Swiss psychologist Hermann
Rorschach.
In the 1960s, the Rorschach was the most widely used projective test.[5] In a national survey in the
U.S., the Rorschach was ranked eighth among psychological tests used in outpatient mental health
facilities.[6] It is the second most widely used test by members of the Society for Personality
Assessment, and it is requested by psychiatrists in 25% of forensic assessment cases,[6] usually in a
battery of tests that often include the MMPI-2 and the MCMI-III.[7] In surveys, the use of Rorschach
ranges from a low of 20% by correctional psychologists[8] to a high of 80% by clinical psychologists
engaged in assessment services, and 80% of psychology graduate programs surveyed teach it.[9]
Although the Exner Scoring System (developed since the 1960s) claims to have addressed and often
refuted many criticisms of the original testing system with an extensive body of research,[10] some
researchers continue to raise questions. The areas of dispute include the objectivity of testers,
inter-rater reliability, the verifiability and general validity of the test, bias of the test's
pathology scales towards greater numbers of responses, the limited number of psychological conditions
which it accurately diagnoses, the inability to replicate the test's norms, its use in court-ordered
evaluations, and the proliferation of the ten inkblot images, potentially invalidating the test for
those who have been exposed to them.[11]

Existence of God
There are several main positions with regard to the existence of God that one might take:

1. Theism - the belief in the existence of one or more divinities or deities.
   1. Pantheism - the belief that God exists as all things of the cosmos, that God is one and
      all is God; God is immanent.
   2. Panentheism - the belief that God encompasses all things of the cosmos but that God is
      greater than the cosmos; God is both immanent and transcendent.
   3. Deism - the belief that God does exist but does not interfere with human life and the
      laws of the universe; God is transcendent.
   4. Monotheism - the belief that a single deity exists which rules the universe as a separate
      and individual entity.
   5. Polytheism - the belief that multiple deities exist which rule the universe as separate
      and individual entities.
   6. Henotheism - the belief that multiple deities may or may not exist, though there is a
      single supreme deity.
   7. Henology - the belief that multiple avatars of a deity exist, which represent unique
      aspects of the ultimate deity.
2. Agnosticism - the belief that the existence or non-existence of deities or God is currently
   unknown or unknowable and cannot be proven. A weaker form of this might be defined as simply
   a lack of certainty about gods' existence or nonexistence.
3. Atheism - the rejection of belief in the existence of deities.[12][13]
   1. Strong atheism is specifically the position that there are no deities.[14][15]
   2. Weak atheism is simply the absence of belief that any deities exist.[15][16][17]
4. Apatheism - a lack of concern about whether or not any supreme being exists.
5. Possibilianism
These are not mutually exclusive positions. For example, agnostic theists choose to believe God exists
while asserting that knowledge of God's existence is inherently unknowable. Similarly, agnostic
atheists reject belief in the existence of all deities, while asserting that whether any such entities
exist or not is inherently unknowable.

Hinduism, Buddhism, Confucianism, and Taoism


The four major religions of the Far East are Hinduism, Buddhism,
Confucianism, and Taoism.

Hinduism
Hinduism, a polytheistic religion and perhaps the oldest of the great
world religions, dates back about 6,000 years. Hinduism comprises so
many different beliefs and rituals that some sociologists have
suggested thinking of it as a grouping of interrelated religions.
Hinduism teaches the concept of reincarnation: the belief that all living organisms continue eternally
in cycles of birth, death, and rebirth. Similarly, Hinduism teaches the caste system, in which a
person's previous incarnations determine that person's hierarchical position in this life. Each caste
comes with its own set of responsibilities and duties, and how well a person executes these tasks in
the current life determines that person's position in the next incarnation.

Hindus acknowledge the existence of both male and female gods, but they believe that the ultimate
divine energy exists beyond these descriptions and categories. The divine soul is present and active in
all living things.
More than 600 million Hindus practice the religion worldwide, though most reside in India. Unlike
Muslims and Christians, Hindus do not usually proselytize (attempt to convert others to their religion).

Buddhism, Confucianism, and Taoism


Three other religions of the Far East are Buddhism, Confucianism,
and Taoism. These ethical religions have no gods like Yahweh or
Allah, but espouse ethical and moral principles designed to improve
the believer's relationship with the universe.
Buddhism originates in the teachings of the Buddha, or the "Enlightened One" (Siddhartha Gautama), a
6th-century B.C. Hindu prince of southern Nepal. Humans, according to the Buddha, can escape the cycles
of reincarnation by renouncing their earthly desires and seeking a life of meditation and
self-discipline. The ultimate objective of Buddhism is to attain Nirvana, which is a state of total
spiritual satisfaction. Like Hinduism, Buddhism allows religious divergence. Unlike it, though,
Buddhism rejects ritual and the caste system. While a global religion, Buddhism today is most commonly
found in such areas of the Far East as China, Japan, Korea, Sri Lanka, Thailand, and Burma. A
recognized denomination of Buddhism is Zen Buddhism, which attempts to transmit the ideas of Buddhism
without requiring acceptance of all of the teachings of Buddha.
Confucius, or K'ung Fu-tzu, lived at the same time as the Buddha. Confucius's followers, like those of
Lao-tzu, the founder of Taoism, saw him as a moral teacher and wise man, not a religious god, prophet,
or leader. Confucianism's main goal is the attainment of inner harmony with nature. This includes the
veneration of ancestors. Early on, the ruling classes of China widely embraced Confucianism. Taoism
shares similar principles with Confucianism. The teachings of Lao-tzu stress the importance of
meditation and nonviolence as means of reaching higher levels of existence. While some Chinese still
practice Confucianism and Taoism, these religions have lost much of their impetus due to resistance
from today's Communist government. However, some concepts of Taoism, like reincarnation, have found an
expression in modern "New Age" religions.