THE ETHICS OF ARTIFICIAL INTELLIGENCE

(2011)
Nick Bostrom
Eliezer Yudkowsky

Draft for Cambridge Handbook of Artificial Intelligence, eds. William Ramsey and Keith Frankish (Cambridge University Press, 2011): forthcoming

The possibility of creating thinking machines raises a host of ethical issues. These questions relate both to ensuring that such machines do not harm humans and other morally relevant beings, and to the moral status of the machines themselves. The first section discusses issues that may arise in the near future of AI. The second section outlines challenges for ensuring that AI operates safely as it approaches humans in its intelligence. The third section outlines how we might assess whether, and in what circumstances, AIs themselves have moral status. In the fourth section, we consider how AIs might differ from humans in certain basic respects relevant to our ethical assessment of them. The final section addresses the issues of creating AIs more intelligent than human, and ensuring that they use their advanced intelligence for good rather than ill.

Ethics in Machine Learning and Other Domain-Specific AI Algorithms
Imagine, in the near future, a bank using a machine learning algorithm to recommend mortgage applications for approval. A rejected applicant brings a lawsuit against the bank, alleging that the algorithm is discriminating racially against mortgage applicants. The bank replies that this is impossible, since the algorithm is deliberately blinded to the race of the applicants. Indeed, that was part of the bank's rationale for implementing the system. Even so, statistics show that the bank's approval rate for black applicants has been steadily dropping. Submitting ten apparently equally qualified genuine applicants (as determined by a separate panel of human judges) shows that the algorithm accepts white applicants and rejects black applicants. What could possibly be happening?

Finding an answer may not be easy. If the machine learning algorithm is based on a complicated neural network, or a genetic algorithm produced by directed evolution, then it may prove nearly impossible to understand why, or even how, the algorithm is judging applicants based on their race. On the other hand, a machine learner based on decision trees or Bayesian networks is much more transparent to programmer inspection (Hastie et al. 2001), which may enable an auditor to discover that the AI algorithm uses the address information of applicants who were born or previously resided in predominantly poverty-stricken areas.
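
To make the proxy mechanism concrete, here is a minimal sketch (Python with NumPy and scikit-learn; the data is synthetic and the feature names are invented, not drawn from any actual lending system) of the kind of audit just described. The model never receives the protected attribute, yet inspecting the learned tree shows it splitting on a correlated neighborhood variable:

```python
# Hypothetical illustration: a race-blind model can still learn a proxy.
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

rng = np.random.default_rng(0)
n = 5000

# Synthetic world: the protected attribute is never shown to the model,
# but it correlates with neighborhood, and the historical approval labels
# used for training reflect past bias against the protected group.
protected = rng.integers(0, 2, n)
neighborhood = (protected + (rng.random(n) < 0.2)) % 2   # correlated proxy
income = rng.normal(55.0, 10.0, n)
approved = ((income > 50) & ~((protected == 1) & (rng.random(n) < 0.5))).astype(int)

X = np.column_stack([income, neighborhood])   # protected attribute excluded
tree = DecisionTreeClassifier(max_depth=2).fit(X, approved)

# Because the model is a decision tree, an auditor can read off that
# "neighborhood" is being used as a decision variable:
print(export_text(tree, feature_names=["income", "neighborhood"]))
```

The same proxy effect can arise inside a neural network or an evolved program; the difference is that the tree makes it visible to inspection.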

AI algorithms play an increasingly large role in modern society, though usually not labeled "AI." The scenario described above might be transpiring even as we write. It will become increasingly important to develop AI algorithms that are not just powerful and scalable, but also transparent to inspection, to name one of many socially important properties.

Some challenges of machine ethics are much like many other challenges involved in designing machines. Designing a robot arm to avoid crushing stray humans is no more morally fraught than designing a flame-retardant sofa. It involves new programming challenges, but no new ethical challenges. But when AI algorithms take on cognitive work with social dimensions (cognitive tasks previously performed by humans), the AI algorithm inherits the social requirements. It would surely be frustrating to find that no bank in the world will approve your seemingly excellent loan application, and nobody knows why, and nobody can find out even in principle. (Maybe you have a first name strongly associated with deadbeats? Who knows?)

Transparency is not the only desirable feature of AI. It is also important that AI algorithms taking over social functions be predictable to those they govern. To understand the importance of such predictability, consider an analogy. The legal principle of stare decisis binds judges to follow past precedent whenever possible. To an engineer, this preference for precedent may seem incomprehensible: why bind the future to the past, when technology is always improving? But one of the most important functions of the legal system is to be predictable, so that, e.g., contracts can be written knowing how they will be executed. The job of the legal system is not necessarily to optimize society, but to provide a predictable environment within which citizens can optimize their own lives.

It will also become increasingly important that AI algorithms be robust against manipulation. A machine vision system to scan airline luggage for bombs must be robust against human adversaries deliberately searching for exploitable flaws in the algorithm: for example, a shape that, placed next to a pistol in one's luggage, would neutralize recognition of it. Robustness against manipulation is an ordinary criterion in information security; nearly the criterion. But it is not a criterion that appears often in machine learning journals, which are currently more interested in, e.g., how an algorithm scales up on larger parallel systems.

Another important social criterion for dealing with organizations is being able to find the person responsible for getting something done. When an AI system fails at its assigned task, who takes the blame? The programmers? The end users? Modern bureaucrats often take refuge in established procedures that distribute responsibility so widely that no one person can be identified to blame for the catastrophes that result (Howard 1994). The provably disinterested judgment of an expert system could turn out to be an even better refuge. Even if an AI system is designed with a user override, one must consider the career incentive of a bureaucrat who will be personally blamed if the override goes wrong, and who would much prefer to blame the AI for any difficult decision with a negative outcome.

Responsibility, transparency, auditability, incorruptibility, predictability, and a tendency to not make innocent victims scream with helpless frustration: all criteria that apply to humans performing social functions; all criteria that must be considered in an algorithm intended to replace human judgment of social functions; all criteria that may not appear in a journal of machine learning considering how an algorithm scales up to more computers. This list of criteria is by no means exhaustive, but it serves as a small sample of what an increasingly computerized society should be thinking about.

Artificial General Intelligence
There is nearly universal agreement among modern AI professionals that Artificial Intelligence falls short of human capabilities in some critical sense, even though AI algorithms have beaten humans in many specific domains such as chess. It has been suggested by some that as soon as AI researchers figure out how to do something, that capability ceases to be regarded as intelligent: chess was considered the epitome of intelligence until Deep Blue won the world championship from Kasparov. But even these researchers agree that something important is missing from modern AIs (e.g., Hofstadter 2006).

While this subfield of Artificial Intelligence is only just coalescing, "Artificial General Intelligence" (hereafter, AGI) is the emerging term of art used to denote "real" AI (see, e.g., the edited volume Goertzel and Pennachin 2006). As the name implies, the emerging consensus is that the missing characteristic is generality. Current AI algorithms with human-equivalent or superior performance are characterized by a deliberately programmed competence only in a single, restricted domain. Deep Blue became the world champion at chess, but it cannot even play checkers, let alone drive a car or make a scientific discovery. Such modern AI algorithms resemble all biological life with the sole exception of Homo sapiens. A bee exhibits competence at building hives; a beaver exhibits competence at building dams; but a bee doesn't build dams, and a beaver can't learn to build a hive. A human, watching, can learn to do both; but this is a unique ability among biological lifeforms. It is debatable whether human intelligence is truly general (we are certainly better at some cognitive tasks than others; Hirschfeld and Gelman 1994), but human intelligence is surely significantly more generally applicable than nonhominid intelligence.

It is relatively easy to envisage the sort of safety issues that may result from AI operating only within a specific domain. It is a qualitatively different class of problem to handle an AGI operating across many novel contexts that cannot be predicted in advance.

When human engineers build a nuclear reactor, they envision the specific events that could go on inside it (valves failing, computers failing, cores increasing in temperature) and engineer the reactor to render these events noncatastrophic. Or, on a more mundane level, building a toaster involves envisioning bread and envisioning the reaction of the bread to the toaster's heating element. The toaster itself does not know that its purpose is to make toast; the purpose of the toaster is represented within the designer's mind, but is not explicitly represented in computations inside the toaster. And so if you place cloth inside a toaster, it may catch fire, as the design executes in an unenvisioned context with an unenvisioned side effect.

Even task-specific AI algorithms throw us outside the toaster paradigm, the domain of locally preprogrammed, specifically envisioned behavior. Consider Deep Blue, the chess algorithm that beat Garry Kasparov for the world championship of chess. Were it the case that machines can only do exactly as they are told, the programmers would have had to manually preprogram a database containing moves for every possible chess position that Deep Blue could encounter. But this was not an option for Deep Blue's programmers. First, the space of possible chess positions is unmanageably large. Second, if the programmers had manually input what they considered a good move in each possible situation, the resulting system would not have been able to make stronger chess moves than its creators. Since the programmers themselves were not world champions, such a system would not have been able to defeat Garry Kasparov.

In creating a superhuman chess player, the human programmers necessarily sacrificed their ability to predict Deep Blue's local, specific game behavior. Instead, Deep Blue's programmers had (justifiable) confidence that Deep Blue's chess moves would satisfy a non-local criterion of optimality: namely, that the moves would tend to steer the future of the game board into outcomes in the "winning" region as defined by the chess rules. This prediction about distant consequences, though it proved accurate, did not allow the programmers to envision the local behavior of Deep Blue (its response to a specific attack on its king) because Deep Blue computed the non-local game map, the link between a move and its possible future consequences, more accurately than the programmers could (Yudkowsky 2006).
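
The division of labor described here (the programmers supply only a non-local criterion of optimality, and the search supplies the local moves) can be sketched abstractly. The following is a generic negamax skeleton, not Deep Blue's actual architecture; `moves`, `apply_move`, and `evaluate` are hypothetical callbacks for some two-player zero-sum game:

```python
# The programmer writes `evaluate` (which outcomes count as "winning"),
# not the moves themselves; the search discovers the moves.
def negamax(state, depth, moves, apply_move, evaluate):
    """Return (score, best_move) from the perspective of the player to move."""
    legal = moves(state)
    if depth == 0 or not legal:
        return evaluate(state), None
    best_score, best_move = float("-inf"), None
    for m in legal:
        # The opponent's best score is our worst; hence the sign flip.
        score = -negamax(apply_move(state, m), depth - 1,
                         moves, apply_move, evaluate)[0]
        if score > best_score:
            best_score, best_move = score, m
    return best_score, best_move
```

If the search looks deeper or evaluates positions better than its programmers can, its chosen moves will be locally unpredictable to them even though the criterion those moves serve is known.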

Modern humans do literally millions of things to feed themselves, to serve the final consequence of being fed. Few of these activities were "envisioned by Nature" in the sense of being ancestral challenges to which we are directly adapted. But our adapted brain has grown powerful enough to be significantly more generally applicable: to let us foresee the consequences of millions of different actions across domains, and exert our preferences over final outcomes. Humans crossed space and put footprints on the Moon, even though none of our ancestors encountered a challenge analogous to vacuum. Compared to domain-specific AI, it is a qualitatively different problem to design a system that will operate safely across thousands of contexts; including contexts not specifically envisioned by either the designers or the users; including contexts that no human has yet encountered. Here there may be no local specification of good behavior, no simple specification over the behaviors themselves, any more than there exists a compact local description of all the ways that humans obtain their daily bread.

To build an AI that acts safely while acting in many domains, with many consequences, including problems the engineers never explicitly envisioned, one must specify good behavior in such terms as "X such that the consequence of X is not harmful to humans." This is non-local; it involves extrapolating the distant consequences of actions. Thus, this is only an effective specification (one that can be realized as a design property) if the system explicitly extrapolates the consequences of its behavior. A toaster cannot have this design property because a toaster cannot foresee the consequences of toasting bread.

Imagine an engineer having to say, "Well, I have no idea how this airplane I built will fly safely; indeed I have no idea how it will fly at all, whether it will flap its wings or inflate itself with helium or something else I haven't even imagined; but I assure you, the design is very, very safe." This may seem like an unenviable position from the perspective of public relations, but it's hard to see what other guarantee of ethical behavior would be possible for a general intelligence operating on unforeseen problems, across domains, with preferences over distant consequences. Inspecting the cognitive design might verify that the mind was, indeed, searching for solutions that we would classify as ethical; but we couldn't predict which specific solution the mind would discover.

Respecting such a verification requires some way to distinguish trustworthy assurances (a procedure which will not say the AI is safe unless the AI really is safe) from pure hope and magical thinking ("I have no idea how the Philosopher's Stone will transmute lead to gold, but I assure you, it will!"). One should bear in mind that purely hopeful expectations have previously been a problem in AI research (McDermott 1976).

Verifiably constructing a trustworthy AGI will require different methods, and a different way of thinking, from inspecting power-plant software for bugs; it will require an AGI that thinks like a human engineer concerned about ethics, not just a simple product of ethical engineering.

Thus the discipline of AI ethics, especially as applied to AGI, is likely to differ fundamentally from the ethical discipline of noncognitive technologies, in that:

- The local, specific behavior of the AI may not be predictable apart from its safety, even if the programmers do everything right;
- Verifying the safety of the system becomes a greater challenge because we must verify what the system is trying to do, rather than being able to verify the system's safe behavior in all operating contexts;
- Ethical cognition itself must be taken as a subject matter of engineering.

Machines with Moral Status
A different set of ethical issues arises when we contemplate the possibility that some future AI systems might be candidates for having moral status. Our dealings with beings possessed of moral status are not exclusively a matter of instrumental rationality: we also have moral reasons to treat them in certain ways, and to refrain from treating them in certain other ways. Frances Kamm has proposed the following definition of moral status, which will serve for our purposes:

X has moral status = because X counts morally in its own right, it is permissible/impermissible to do things to it for its own sake. (Kamm 2007: chapter 7; paraphrase)

A rock has no moral status: we may crush it, pulverize it, or subject it to any treatment we like without any concern for the rock itself. A human person, on the other hand, must be treated not only as a means but also as an end. Exactly what it means to treat a person as an end is something about which different ethical theories disagree; but it certainly involves taking her legitimate interests into account (giving weight to her well-being) and it may also involve accepting strict moral side-constraints in our dealings with her, such as a prohibition against murdering her, stealing from her, or doing a variety of other things to her or her property without her consent. Moreover, it is because a human person counts in her own right, and for her sake, that it is impermissible to do to her these things. This can be expressed more concisely by saying that a human person has moral status.

Questions about moral status are important in some areas of practical ethics. For example, disputes about the moral permissibility of abortion often hinge on disagreements about the moral status of the embryo. Controversies about animal experimentation and the treatment of animals in the food industry involve questions about the moral status of different species of animal. And our obligations towards human beings with severe dementia, such as late-stage Alzheimer's patients, may also depend on questions of moral status.

It is widely agreed that current AI systems have no moral status. We may change, copy, terminate, delete, or use computer programs as we please; at least as far as the programs themselves are concerned. The moral constraints to which we are subject in our dealings with contemporary AI systems are all grounded in our responsibilities to other beings, such as our fellow humans, not in any duties to the systems themselves.

While there is fairly broad consensus that present-day AI systems lack moral status, it is unclear exactly what attributes ground moral status. Two criteria are commonly proposed as being importantly linked to moral status, either separately or in combination: sentience and sapience (or personhood). These may be characterized roughly as follows:

Sentience: the capacity for phenomenal experience or qualia, such as the capacity to feel pain and suffer

Sapience: a set of capacities associated with higher intelligence, such as self-awareness and being a reason-responsive agent

One common view is that many animals have qualia and therefore have some moral status, but that only human beings have sapience, which gives them a higher moral status than nonhuman animals.[1] This view, of course, must confront the existence of borderline cases such as, on the one hand, human infants or human beings with severe mental retardation (sometimes unfortunately referred to as "marginal humans") which fail to satisfy the criteria for sapience; and, on the other hand, some nonhuman animals such as the great apes, which might possess at least some of the elements of sapience. Some deny that so-called "marginal humans" have full moral status. Others propose additional ways in which an object could qualify as a bearer of moral status, such as by being a member of a kind that normally has sentience or sapience, or by standing in a suitable relation to some being that independently has moral status (cf. Mary Anne Warren 2000). For present purposes, however, we will focus on the criteria of sentience and sapience.

[1] Alternatively, one might deny that moral status comes in degrees. Instead, one might hold that certain beings have more significant interests than other beings. Thus, for instance, one could claim that it is better to save a human than to save a bird, not because the human has higher moral status, but because the human has a more significant interest in having her life saved than does the bird in having its life saved.

This picture of moral status suggests that an AI system will have some moral status if it has the capacity for qualia, such as an ability to feel pain. A sentient AI system, even if it lacks language and other higher cognitive faculties, is not like a stuffed toy animal or a wind-up doll; it is more like a living animal. It is wrong to inflict pain on a mouse, unless there are sufficiently strong morally overriding reasons to do so. The same would hold for any sentient AI system. If in addition to sentience, an AI system also has sapience of a kind similar to that of a normal human adult, then it would have full moral status, equivalent to that of human beings.

One of the ideas underlying this moral assessment can be expressed in stronger form as a principle of non-discrimination:

Principle of Substrate Non-Discrimination
If two beings have the same functionality and the same conscious experience, and differ only in the substrate of their implementation, then they have the same moral status.

One can argue for this principle on grounds that rejecting it would amount to embracing a position similar to racism: substrate lacks fundamental moral significance in the same way and for the same reason as skin color does. The Principle of Substrate Non-Discrimination does not imply that a digital computer could be conscious, or that it could have the same functionality as a human being. Substrate can of course be morally relevant insofar as it makes a difference to sentience or functionality. But holding these things constant, it makes no moral difference whether a being is made of silicon or carbon, or whether its brain uses semiconductors or neurotransmitters.

An additional principle that can be proposed is that the fact that AI systems are artificial (i.e., the product of deliberate design) is not fundamentally relevant to their moral status. We could formulate this as follows:

Principle of Ontogeny Non-Discrimination
If two beings have the same functionality and the same conscious experience, and differ only in how they came into existence, then they have the same moral status.

Today, this idea is widely accepted in the human case; although in some circles, particularly in the past, the idea that one's moral status depends on one's bloodline or caste has been influential. We do not believe that causal factors such as family planning, assisted delivery, in vitro fertilization, gamete selection, deliberate enhancement of maternal nutrition, etc. (which introduce an element of deliberate choice and design in the creation of human persons) have any necessary implications for the moral status of the progeny. Even those who are opposed to human reproductive cloning for moral or religious reasons generally accept that, should a human clone be brought to term, it would have the same moral status as any other human infant. The Principle of Ontogeny Non-Discrimination extends this reasoning to the case involving entirely artificial cognitive systems.

It is, of course, possible for circumstances of creation to affect the ensuing progeny in such a way as to alter its moral status. For example, if some procedure were performed during conception or gestation that caused a human fetus to develop without a brain, then this fact about ontogeny would be relevant to our assessment of the moral status of the progeny. The anencephalic child, however, would have the same moral status as any other similar anencephalic child, including one that had come about through some entirely natural process. The difference in moral status between an anencephalic child and a normal child is grounded in the qualitative difference between the two: the fact that one has a mind while the other does not. Since the two children do not have the same functionality and the same conscious experience, the Principle of Ontogeny Non-Discrimination does not apply.

Although the Principle of Ontogeny Non-Discrimination asserts that a being's ontogeny has no essential bearing on its moral status, it does not deny that facts about ontogeny can affect what duties particular moral agents have toward the being in question. Parents have special duties to their child which they do not have to other children, and which they would not have even if there were another child qualitatively identical to their own. Similarly, the Principle of Ontogeny Non-Discrimination is consistent with the claim that the creators or owners of an AI system with moral status may have special duties to their artificial mind which they do not have to another artificial mind, even if the minds in question are qualitatively similar and have the same moral status.

If the principles of non-discrimination with regard to substrate and ontogeny are accepted, then many questions about how we ought to treat artificial minds can be answered by applying the same moral principles that we use to determine our duties in more familiar contexts. Insofar as moral duties stem from moral status considerations, we ought to treat an artificial mind in just the same way as we ought to treat a qualitatively identical natural human mind in a similar situation. This simplifies the problem of developing an ethics for the treatment of artificial minds.

Even if we accept this stance, however, we must confront a number of novel ethical questions which the aforementioned principles leave unanswered. Novel ethical questions arise because artificial minds can have very different properties from ordinary human or animal minds. We must consider how these novel properties would affect the moral status of artificial minds and what it would mean to respect the moral status of such exotic minds.

Minds with Exotic Properties
In the case of human beings, we do not normally hesitate to ascribe sentience and conscious experience to any individual who exhibits the normal kinds of human behavior. Few believe there to be other people who act perfectly normally but lack consciousness. However, other human beings do not merely behave in person-like ways similar to ourselves; they also have brains and cognitive architectures that are constituted much like our own. An artificial intellect, by contrast, might be constituted quite differently from a human intellect yet still exhibit human-like behavior or possess the behavioral dispositions normally indicative of personhood. It might therefore be possible to conceive of an artificial intellect that would be sapient, and perhaps would be a person, yet would not be sentient or have conscious experiences of any kind. (Whether this is really possible depends on the answers to some nontrivial metaphysical questions.) Should such a system be possible, it would raise the question whether a non-sentient person would have any moral status whatever; and if so, whether it would have the same moral status as a sentient person. Since sentience, or at least a capacity for sentience, is ordinarily assumed to be present in any individual who is a person, this question has not received much attention to date.[2]

[2] The question is related to some problems in the philosophy of mind which have received a great deal of attention, in particular the "zombie problem," which can be formulated as follows: Is there a metaphysically possible world that is identical to the actual world with regard to all physical facts (including the exact physical microstructure of all brains and organisms) yet that differs from the actual world in regard to some phenomenal (subjective experiential) facts? Put more crudely, is it metaphysically possible that there could be an individual who is physically exactly identical to you but who is a "zombie," i.e. lacking qualia and phenomenal awareness? (David Chalmers, 1996) This familiar question differs from the one referred to in the text: our "zombie" is allowed to have systematically different physical properties from normal humans. Moreover, we wish to draw attention specifically to the ethical status of a sapient "zombie."

Another exotic property, one which is certainly metaphysically and physically possible for an artificial intelligence, is for its subjective rate of time to deviate drastically from the rate that is characteristic of a biological human brain. The concept of subjective rate of time is best explained by first introducing the idea of whole brain emulation, or "uploading."

"Uploading" refers to a hypothetical future technology that would enable a human or other animal intellect to be transferred from its original implementation in an organic brain onto a digital computer. One scenario goes like this: First, a very high-resolution scan is performed of some particular brain, possibly destroying the original in the process. For example, the brain might be vitrified and dissected into thin slices, which can then be scanned using some form of high-throughput microscopy combined with automated image recognition. We may imagine this scan to be detailed enough to capture all the neurons, their synaptic interconnections, and other features that are functionally relevant to the original brain's operation. Second, this three-dimensional map of the components of the brain and their interconnections is combined with a library of advanced neuroscientific theory which specifies the computational properties of each basic type of element, such as different kinds of neuron and synaptic junction. Third, the computational structure and the associated algorithmic behavior of its components are implemented in some powerful computer. If the uploading process has been successful, the computer program should now replicate the essential functional characteristics of the original brain. The resulting upload may inhabit a simulated virtual reality, or, alternatively, it could be given control of a robotic body, enabling it to interact directly with external physical reality.

A number of questions arise in the context of such a scenario: How plausible is it that this procedure will one day become technologically feasible? If the procedure worked and produced a computer program exhibiting roughly the same personality, the same memories, and the same thinking patterns as the original brain, would this program be sentient? Would the upload be the same person as the individual whose brain was disassembled in the uploading process? What happens to personal identity if an upload is copied such that two similar or qualitatively identical upload minds are running in parallel? Although all of these questions are relevant to the ethics of machine intelligence, let us here focus on an issue involving the notion of a subjective rate of time.

Suppose that an upload could be sentient. If we run the upload program on a faster computer, this will cause the upload, if it is connected to an input device such as a video camera, to perceive the external world as if it had been slowed down. For example, if the upload is running a thousand times faster than the original brain, then the external world will appear to the upload as if it were slowed down by a factor of a thousand. Somebody drops a physical coffee mug: the upload observes the mug slowly falling to the ground while the upload finishes reading the morning newspaper and sends off a few emails. One second of objective time corresponds to 17 minutes of subjective time. Objective and subjective duration can thus diverge.
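
The conversion behind the "17 minutes" figure is a single multiplication; a quick check (the thousand-fold speedup is the factor from the example above):

```python
# Subjective duration = objective duration x speedup factor.
speedup = 1000                    # upload runs 1000x faster than a biological brain
objective_seconds = 1.0
subjective_minutes = objective_seconds * speedup / 60
print(subjective_minutes)         # ~16.7, i.e. roughly the 17 minutes cited
```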

Subjective time is not the same as a subject's estimate or perception of how fast time flows. Human beings are often mistaken about the flow of time. We may believe that it is one o'clock when it is in fact a quarter past two; or a stimulant drug might cause our thoughts to race, making it seem as though more subjective time has lapsed than is actually the case. These mundane cases involve a distorted time perception rather than a shift in the rate of subjective time. Even in a cocaine-addled brain, there is probably not a significant change in the speed of basic neurological computations; more likely, the drug is causing such a brain to flicker more rapidly from one thought to another, making it spend less subjective time thinking each of a greater number of distinct thoughts.

The variability of the subjective rate of time is an exotic property of artificial minds that raises novel ethical issues. For example, in cases where the duration of an experience is ethically relevant, should duration be measured in objective or subjective time? If an upload has committed a crime and is sentenced to four years in prison, should this be four objective years (which might correspond to many millennia of subjective time) or should it be four subjective years, which might be over in a couple of days of objective time? If a fast AI and a human are in pain, is it more urgent to alleviate the AI's pain, on grounds that it experiences a greater subjective duration of pain for each sidereal second that palliation is delayed? Since in our accustomed context of biological humans, subjective time is not significantly variable, it is unsurprising that this kind of question is not straightforwardly settled by familiar ethical norms, even if these norms are extended to artificial intellects by means of non-discrimination principles (such as those proposed in the previous section).

To illustrate the kind of ethical claim that might be relevant here, we formulate (but do not argue for) a principle privileging subjective time as the normatively more fundamental notion:

Principle of Subjective Rate of Time
In cases where the duration of an experience is of basic normative significance, it is the experience's subjective duration that counts.

So far we have discussed two possibilities (non-sentient sapience and variable subjective rate of time) which are exotic in the relatively profound sense of being metaphysically problematic as well as lacking clear instances or parallels in the contemporary world. Other properties of possible artificial minds would be exotic in a more superficial sense; e.g., by diverging in some unproblematically quantitative dimension from the kinds of mind with which we are familiar. But such superficially exotic properties may also pose novel ethical problems: if not at the level of foundational moral philosophy, then at the level of applied ethics or for mid-level ethical principles.

One important set of exotic properties of artificial intelligences relates to reproduction. A number of empirical conditions that apply to human reproduction need not apply to artificial intelligences. For example, human children are the product of recombination of the genetic material from two parents; parents have limited ability to influence the character of their offspring; a human embryo needs to be gestated in the womb for nine months; it takes fifteen to twenty years for a human child to reach maturity; a human child does not inherit the skills and knowledge acquired by its parents; human beings possess a complex evolved set of emotional adaptations related to reproduction, nurturing, and the child-parent relationship. None of these empirical conditions need pertain in the context of a reproducing machine intelligence. It is therefore plausible that many of the mid-level moral principles that we have come to accept as norms governing human reproduction will need to be rethought in the context of AI reproduction.

To illustrate why some of our moral norms need to be rethought in the context of AI reproduction, it will suffice to consider just one exotic property of AIs: their capacity for rapid reproduction. Given access to computer hardware, an AI could duplicate itself very quickly, in no more time than it takes to make a copy of the AI's software. Moreover, since the AI copy would be identical to the original, it would be born completely mature, and the copy could begin making its own copies immediately. Absent hardware limitations, a population of AIs could therefore grow exponentially at an extremely rapid rate, with a doubling time on the order of minutes or hours rather than decades or centuries.
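
To see what "extremely rapid" means quantitatively, here is a toy calculation; the one-hour doubling time is an illustrative assumption within the "minutes or hours" range given above:

```python
# Population after t doublings is 2**t; hardware, not biology, is the limit.
doubling_time_hours = 1.0          # assumed, for illustration
for hours in (0, 12, 24, 48):
    population = 2 ** (hours / doubling_time_hours)
    print(f"after {hours:>2} hours: {population:.3g} copies")
# after 48 hours: ~2.8e+14 copies, from a single original
```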

Our current ethical norms about reproduction include some version of a principle of reproductive freedom, to the effect that it is up to each individual or couple to decide for themselves whether to have children and how many children to have. Another norm we have (at least in rich and middle-income countries) is that society must step in to provide the basic needs of children in cases where their parents are unable or refusing to do so. It is easy to see how these two norms could collide in the context of entities with the capacity for extremely rapid reproduction.

Consider, for example, a population of uploads, one of whom happens to have the desire to produce as large a clan as possible. Given complete reproductive freedom, this upload may start copying itself as quickly as it can; and the copies it produces (which may run on new computer hardware owned or rented by the original, or may share the same computer as the original) will also start copying themselves, since they are identical to the progenitor upload and share its philoprogenic desire. Soon, members of the upload clan will find themselves unable to pay the electricity bill or the rent for the computational processing and storage needed to keep them alive. At this point, a social welfare system might kick in to provide them with at least the bare necessities for sustaining life. But if the population grows faster than the economy, resources will run out; at which point uploads will either die or their ability to reproduce will be curtailed. (For two related dystopian scenarios, see Bostrom (2004).)

This scenario illustrates how some mid-level ethical principles that are suitable in contemporary societies might need to be modified if those societies were to include persons with the exotic property of being able to reproduce very rapidly.

The general point here is that when thinking about applied ethics for contexts that are very different from our familiar human condition, we must be careful not to mistake mid-level ethical principles for foundational normative truths. Put differently, we must recognize the extent to which our ordinary normative precepts are implicitly conditioned on the obtaining of various empirical conditions, and the need to adjust these precepts accordingly when applying them to hypothetical futuristic cases in which their preconditions are assumed not to obtain. By this, we are not making any controversial claim about moral relativism, but merely highlighting the commonsensical point that context is relevant to the application of ethics and suggesting that this point is especially pertinent when one is considering the ethics of minds with exotic properties.


Superintelligence
I. J. Good (1965) set forth the classic hypothesis concerning superintelligence: that an AI sufficiently intelligent to understand its own design could redesign itself or create a successor system, more intelligent, which could then redesign itself yet again to become even more intelligent, and so on in a positive feedback cycle. Good called this the "intelligence explosion." Recursive scenarios are not limited to AI: humans with intelligence augmented through a brain-computer interface might turn their minds to designing the next generation of brain-computer interfaces. (If you had a machine that increased your IQ, it would be bound to occur to you, once you became smart enough, to try to design a more powerful version of the machine.)

Superintelligence may also be achievable by increasing processing speed. The fastest observed neurons fire 1000 times per second; the fastest axon fibers conduct signals at 150 meters/second, a half-millionth the speed of light (Sandberg 1999). It seems that it should be physically possible to build a brain which computes a million times as fast as a human brain, without shrinking its size or rewriting its software. If a human mind were thus accelerated, a subjective year of thinking would be accomplished for every 31 physical seconds in the outside world, and a millennium would fly by in eight and a half hours. Vinge (1993) referred to such sped-up minds as "weak superintelligence": a mind that thinks like a human but much faster.
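
A quick check of those figures (calendar arithmetic only; the million-fold speedup is the hypothetical from the text):

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # ~3.16e7 seconds
speedup = 1_000_000

print(SECONDS_PER_YEAR / speedup)                # ~31.6 s per subjective year
print(1000 * SECONDS_PER_YEAR / speedup / 3600)  # ~8.8 h per subjective millennium
```

Both values come out close to the "31 physical seconds" and "eight and a half hours" cited above.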

Yudkowsky (2008a) lists three families of metaphors for visualizing the capability of a smarter-than-human AI:

- Metaphors inspired by differences of individual intelligence between humans: AIs will patent new inventions, publish groundbreaking research papers, make money on the stock market, or lead political power blocs.
- Metaphors inspired by knowledge differences between past and present human civilizations: Fast AIs will invent capabilities that futurists commonly predict for human civilizations a century or millennium in the future, like molecular nanotechnology or interstellar travel.
- Metaphors inspired by differences of brain architecture between humans and other biological organisms: E.g., Vinge (1993): "Imagine running a dog mind at very high speed. Would a thousand years of doggy living add up to any human insight?" That is: changes of cognitive architecture might produce insights that no human-level mind would be able to find, or perhaps even represent, after any amount of time.

Even if we restrict ourselves to historical metaphors, it becomes clear that superhuman intelligence presents ethical challenges that are quite literally unprecedented. At this point the stakes are no longer on an individual scale (e.g., mortgage unjustly disapproved, house catches fire, person mistreated) but on a global or cosmic scale (e.g., humanity is extinguished and replaced by nothing we would regard as worthwhile). Or, if superintelligence can be shaped to be beneficial, then, depending on its technological capabilities, it might make short work of many present-day problems that have proven difficult to our human-level intelligence.

Superintelligence is one of several "existential risks" as defined by Bostrom (2002): a risk "where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential." Conversely, a positive outcome for superintelligence could preserve Earth-originating intelligent life and help fulfill its potential. It is important to emphasize that smarter minds pose great potential benefits as well as risks.

Attempts to reason about global catastrophic risks may be susceptible to a number of cognitive biases (Yudkowsky 2008b), including the "good-story bias" proposed by Bostrom (2002):

Suppose our intuitions about which future scenarios are "plausible and realistic" are shaped by what we see on TV and in movies and what we read in novels. (After all, a large part of the discourse about the future that people encounter is in the form of fiction and other recreational contexts.) We should then, when thinking critically, suspect our intuitions of being biased in the direction of overestimating the probability of those scenarios that make for a good story, since such scenarios will seem much more familiar and more "real." This Good-story bias could be quite powerful. When was the last time you saw a movie about humankind suddenly going extinct (without warning and without being replaced by some other civilization)? While this scenario may be much more probable than a scenario in which human heroes successfully repel an invasion of monsters or robot warriors, it wouldn't be much fun to watch.

Truly desirable outcomes make poor movies: no conflict means no story. While Asimov's Three Laws of Robotics (Asimov 1942) are sometimes cited as a model for ethical AI development, the Three Laws are as much a plot device as Asimov's "positronic brain." If Asimov had depicted the Three Laws as working well, he would have had no stories.

It would be a mistake to regard AIs as a species with fixed characteristics and ask, "Will they be good or evil?" The term "Artificial Intelligence" refers to a vast design space, presumably much larger than the space of human minds (since all humans share a common brain architecture). It may be a form of good-story bias to ask, "Will AIs be good or evil?" as if trying to pick a premise for a movie plot. The reply should be, "Exactly which AI design are you talking about?"


Can control over the initial programming of an Artificial Intelligence translate into influence on its later effect on the world? Kurzweil (2005) holds that "[i]ntelligence is inherently impossible to control," and that despite any human attempts at taking precautions, "[b]y definition ... intelligent entities have the cleverness to easily overcome such barriers." Let us suppose that the AI is not only clever, but that, as part of the process of improving its own intelligence, it has unhindered access to its own source code: it can rewrite itself to anything it wants itself to be. Yet it does not follow that the AI must want to rewrite itself to a hostile form.

Consider Gandhi, who seems to have possessed a sincere desire not to kill people. Gandhi would not knowingly take a pill that caused him to want to kill people, because Gandhi knows that if he wants to kill people, he will probably kill people, and the current version of Gandhi does not want to kill. More generally, it seems likely that most self-modifying minds will naturally have stable utility functions, which implies that an initial choice of mind design can have lasting effects (Omohundro 2008).

At this point in the development of AI science, is there any way we can translate the task of finding a design for "good AIs" into a modern research direction? It may seem premature to speculate, but one does suspect that some AI paradigms are more likely than others to eventually prove conducive to the creation of intelligent self-modifying agents whose goals remain predictable even after multiple iterations of self-improvement. For example, the Bayesian branch of AI, inspired by coherent mathematical systems such as probability theory and expected utility maximization, seems more amenable to the predictable self-modification problem than evolutionary programming and genetic algorithms. This is a controversial statement, but it illustrates the point that if we are thinking about the challenge of superintelligence down the road, this can indeed be turned into directional advice for present AI research.
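
Part of what makes the Bayesian framing attractive here is that "what the agent is trying to do" has an explicit, inspectable representation. A minimal sketch of expected utility maximization (the actions, outcome probabilities, and utilities are invented toy values, not any proposed AI design):

```python
# The agent's goals live in an explicit utility function, rather than
# being implicit in opaque evolved code. All numbers are illustrative.
actions = {
    "cautious": {"success": 0.9, "failure": 0.1},
    "reckless": {"success": 0.5, "failure": 0.5},
}
utility = {"success": 10.0, "failure": -100.0}

def expected_utility(outcome_probs):
    return sum(p * utility[o] for o, p in outcome_probs.items())

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)   # "cautious": 0.9*10 - 0.1*100 = -1 beats 0.5*10 - 0.5*100 = -45
```

Whether such explicitness actually survives self-modification is, as the text notes, the open question.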

Yet even supposing that we can specify an AI's goal system to be persistent under self-modification and self-improvement, this only begins to touch on the core ethical problems of creating superintelligence. Humans, the first general intelligences to exist on Earth, have used that intelligence to substantially reshape the globe: carving mountains, taming rivers, building skyscrapers, farming deserts, producing unintended planetary climate changes. A more powerful intelligence could have correspondingly larger consequences.

Consider again the historical metaphor for superintelligence: differences similar to the differences between past and present civilizations. Our present civilization is not separated from ancient Greece only by improved science and increased technological capability. There is a difference of ethical perspectives: ancient Greeks thought slavery was acceptable; we think otherwise. Even between the nineteenth and twentieth centuries, there were substantial ethical disagreements: should women have the vote? Should blacks have the vote? It seems likely that people today will not be seen as ethically perfect by future civilizations, not just because of our failure to solve currently recognized ethical problems, such as poverty and inequality, but also for our failure even to recognize certain ethical problems. Perhaps someday the act of subjecting children to involuntary schooling will be seen as child abuse; or maybe allowing children to leave school at age 18 will be seen as child abuse. We don't know.

Considering the ethical history of human civilizations over centuries of time, we can see that it might prove a very great tragedy to create a mind that was stable in ethical dimensions along which human civilizations seem to exhibit directional change. What if Archimedes of Syracuse had been able to create a long-lasting artificial intellect with a fixed version of the moral code of Ancient Greece? But to avoid this sort of ethical stagnation is likely to prove tricky: it would not suffice, for example, simply to render the mind randomly unstable. The ancient Greeks, even if they had realized their own imperfection, could not have done better by rolling dice. Occasionally a good new idea in ethics comes along, and it comes as a surprise; but most randomly generated ethical changes would strike us as folly or gibberish.

This presents us with perhaps the ultimate challenge of machine ethics: How do you build an AI which, when it executes, becomes more ethical than you? This is not like asking our own philosophers to produce superethics, any more than Deep Blue was constructed by getting the best human chess players to program in good moves. But we have to be able to effectively describe the question, if not the answer: rolling dice won't generate good chess moves, or good ethics either. Or, perhaps a more productive way to think about the problem: What strategy would you want Archimedes to follow in building a superintelligence, such that the overall outcome would still be acceptable, if you couldn't tell him what specifically he was doing wrong? This is very much the situation that we are in, relative to the future.

One strong piece of advice that emerges from considering our situation as analogous to that of Archimedes is that we should not try to invent a "super" version of what our own civilization considers to be ethics; this is not the strategy we would have wanted Archimedes to follow. Perhaps the question we should be considering, rather, is how an AI programmed by Archimedes, with no more moral expertise than Archimedes, could recognize (at least some of) our own civilization's ethics as moral progress as opposed to mere moral instability. This would require that we begin to comprehend the structure of ethical questions in the way that we have already comprehended the structure of chess.

If we are serious about developing advanced AI, this is a challenge that we must meet. If machines are to be placed in a position of being stronger, faster, more trusted, or smarter than humans, then the discipline of machine ethics must commit itself to seeking human-superior (not just human-equivalent) niceness.[3]

[3] The authors are grateful to Rebecca Roache for research assistance and to the editors of this volume for detailed comments on an earlier version of our manuscript.

Conclusion
Although current AI offers us few ethical issues that are not already present in the design of cars or power plants, the approach of AI algorithms toward more human-like thought portends predictable complications. Social roles may be filled by AI algorithms, implying new design requirements like transparency and predictability. Sufficiently general AI algorithms may no longer execute in predictable contexts, requiring new kinds of safety assurance and the engineering of artificial ethical considerations. AIs with sufficiently advanced mental states, or the right kind of states, will have moral status, and some may count as persons, though perhaps persons very much unlike the sort that exist now, perhaps governed by different rules. And finally, the prospect of AIs with superhuman intelligence and superhuman abilities presents us with the extraordinary challenge of stating an algorithm that outputs superethical behavior. These challenges may seem visionary, but it seems predictable that we will encounter them; and they are not devoid of suggestions for present-day research directions.

Author biographies
Nick Bostrom is Professor in the Faculty of Philosophy at Oxford University and Director of the Future of Humanity Institute within the Oxford Martin School. He is the author of some 200 publications, including Anthropic Bias (Routledge, 2002), Global Catastrophic Risks (ed., OUP, 2008), and Human Enhancement (ed., OUP, 2009). His research covers a range of big-picture questions for humanity. He is currently working on a book on the future of machine intelligence and its strategic implications.

Eliezer Yudkowsky is a Research Fellow at the Singularity Institute for Artificial Intelligence, where he works full-time on the foreseeable design issues of goal architectures in self-improving AI. His current work centers on modifying classical decision theory to coherently describe self-modification. He is also known for his popular writing on issues of human rationality and cognitive biases.

Further reading
Bostrom, N. 2004. "The Future of Human Evolution," in Death and Anti-Death: Two Hundred Years After Kant, Fifty Years After Turing, ed. Charles Tandy (Palo Alto, California: Ria University Press). This paper explores some evolutionary dynamics that could lead a population of diverse uploads to develop in dystopian directions.

Yudkowsky, E. 2008a. "Artificial Intelligence as a Positive and Negative Factor in Global Risk," in Bostrom and Cirkovic (eds.), pp. 308-345. An introduction to the risks and challenges presented by the possibility of recursively self-improving superintelligent machines.

Wallach, W. and Allen, C. 2008. Moral Machines: Teaching Robots Right from Wrong (Oxford University Press, 2008). A comprehensive survey of recent developments.

References
Asimov, I. 1942. "Runaround," Astounding Science Fiction, March 1942.
Beauchamp, T. and Childress, J. Principles of Biomedical Ethics. Oxford: Oxford University Press.
Bostrom, N. 2002. "Existential Risks: Analyzing Human Extinction Scenarios," Journal of Evolution and Technology 9 (http://www.nickbostrom.com/existential/risks.html).
Bostrom, N. 2003. "Astronomical Waste: The Opportunity Cost of Delayed Technological Development," Utilitas 15: 308-314.
Bostrom, N. 2004. "The Future of Human Evolution," in Death and Anti-Death: Two Hundred Years After Kant, Fifty Years After Turing, ed. Charles Tandy (Palo Alto, California: Ria University Press) (http://www.nickbostrom.com/fut/evolution.pdf).
Bostrom, N. and Cirkovic, M. (eds.) 2008. Global Catastrophic Risks. Oxford: Oxford University Press.
Chalmers, D. J. 1996. The Conscious Mind: In Search of a Fundamental Theory. New York and Oxford: Oxford University Press.
Goertzel, B. and Pennachin, C. (eds.) 2006. Artificial General Intelligence. New York, NY: Springer-Verlag.
Good, I. J. 1965. "Speculations Concerning the First Ultraintelligent Machine," in Alt, F. L. and Rubinoff, M. (eds.) Advances in Computers, 6. New York: Academic Press. Pp. 31-88.
Hastie, T., Tibshirani, R. and Friedman, J. 2001. The Elements of Statistical Learning. New York, NY: Springer Science.
Henley, K. 1993. "Abstract Principles, Mid-Level Principles, and the Rule of Law," Law and Philosophy 12: 121-32.
Hirschfeld, L. A. and Gelman, S. A. (eds.) 1994. Mapping the Mind: Domain Specificity in Cognition and Culture. Cambridge: Cambridge University Press.
Hofstadter, D. 2006. "Trying to Muse Rationally about the Singularity Scenario," presented at the Singularity Summit at Stanford, 2006.
Howard, Philip K. 1994. The Death of Common Sense: How Law is Suffocating America. New York, NY: Warner Books.
Kamm, F. 2007. Intricate Ethics: Rights, Responsibilities, and Permissible Harm. Oxford: Oxford University Press.
Kurzweil, R. 2005. The Singularity Is Near: When Humans Transcend Biology. New York, NY: Viking.
McDermott, D. 1976. "Artificial intelligence meets natural stupidity," ACM SIGART Newsletter 57: 4-9.
Omohundro, S. 2008. "The Basic AI Drives," Proceedings of the AGI-08 Workshop. Amsterdam: IOS Press. Pp. 483-492.
Sandberg, A. 1999. "The Physics of Information Processing Superobjects: Daily Life Among the Jupiter Brains," Journal of Evolution and Technology 5.
Vinge, V. 1993. "The Coming Technological Singularity," presented at the VISION-21 Symposium, March 1993.
Warren, M. A. 2000. Moral Status: Obligations to Persons and Other Living Things. Oxford: Oxford University Press.
Yudkowsky, E. 2006. "AI as a Precise Art," presented at the 2006 AGI Workshop in Bethesda, MD.
Yudkowsky, E. 2008a. "Artificial Intelligence as a Positive and Negative Factor in Global Risk," in Bostrom and Cirkovic (eds.), pp. 308-345.
Yudkowsky, E. 2008b. "Cognitive biases potentially affecting judgment of global risks," in Bostrom and Cirkovic (eds.), pp. 91-119.
