The Ethics of Artificial Intelligence
(2011)
Nick Bostrom
Eliezer Yudkowsky
Draft for Cambridge Handbook of Artificial Intelligence, eds. William Ramsey and Keith Frankish (Cambridge University Press, 2011): forthcoming
The possibility of creating thinking machines raises a host of ethical issues. These questions relate both to ensuring that such machines do not harm humans and other morally relevant beings, and to the moral status of the machines themselves. The first section discusses issues that may arise in the near future of AI. The second section outlines challenges for ensuring that AI operates safely as it approaches humans in its intelligence. The third section outlines how we might assess whether, and in what circumstances, AIs themselves have moral status. In the fourth section, we consider how AIs might differ from humans in certain basic respects relevant to our ethical assessment of them. The final section addresses the issues of creating AIs more intelligent than human, and ensuring that they use their advanced intelligence for good rather than ill.
Ethics in Machine Learning and Other Domain-Specific AI Algorithms
Imagine, in the near future, a bank using a machine learning algorithm to recommend mortgage applications for approval. A rejected applicant brings a lawsuit against the bank, alleging that the algorithm is discriminating racially against mortgage applicants. The bank replies that this is impossible, since the algorithm is deliberately blinded to the race of the applicants. Indeed, that was part of the bank's rationale for implementing the system. Even so, statistics show that the bank's approval rate for black applicants has been steadily dropping. Submitting ten apparently equally qualified genuine applicants (as determined by a separate panel of human judges) shows that the algorithm accepts white applicants and rejects black applicants. What could possibly be happening?
Finding an answer may not be easy. If the machine learning algorithm is based on a complicated neural network, or a genetic algorithm produced by directed evolution, then it may prove nearly impossible to understand why, or even how, the algorithm is judging applicants based on their race. On the other hand, a machine learner based on decision trees or Bayesian networks is much more transparent to programmer inspection (Hastie et al. 2001), which may enable an auditor to discover that the AI algorithm uses the address information of applicants who were born or previously resided in predominantly poverty-stricken areas.
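The contrast between opaque and transparent models can be sketched in a few lines. In the toy example below (the tree, its feature names, and its thresholds are all hypothetical, invented purely for illustration), a decision tree is an explicit data structure, so an auditor can mechanically enumerate every feature the model is capable of consulting:

```python
# Toy decision tree for loan screening, represented as nested dicts.
# Leaves carry a decision; internal nodes test one named feature.
# All feature names and thresholds here are hypothetical.
tree = {
    "feature": "income",
    "threshold": 40_000,
    "below": {"decision": "reject"},
    "above": {
        "feature": "zip_code_poverty_rate",  # proxy feature an audit should flag
        "threshold": 0.3,
        "below": {"decision": "approve"},
        "above": {"decision": "reject"},
    },
}

def features_used(node):
    """Enumerate every feature the tree can consult: transparent to inspection."""
    if "decision" in node:
        return set()
    return ({node["feature"]}
            | features_used(node["below"])
            | features_used(node["above"]))

def classify(node, applicant):
    """Follow the tree's tests for one applicant (a dict of feature values)."""
    while "decision" not in node:
        branch = "below" if applicant[node["feature"]] < node["threshold"] else "above"
        node = node[branch]
    return node["decision"]

print(sorted(features_used(tree)))
# The audit reveals that the model consults a neighborhood poverty
# statistic even though no "race" feature appears anywhere in it.
```

A neural network offers no analogous walk: its "features" are distributed across thousands of learned weights, which is the asymmetry the paragraph above describes.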
AI algorithms play an increasingly large role in modern society, though usually not labeled "AI." The scenario described above might be transpiring even as we write. It will become increasingly important to develop AI algorithms that are not just powerful and scalable, but also transparent to inspection, to name one of many socially important properties.
Some challenges of machine ethics are much like many other challenges involved in designing machines. Designing a robot arm to avoid crushing stray humans is no more morally fraught than designing a flame-retardant sofa. It involves new programming challenges, but no new ethical challenges. But when AI algorithms take on cognitive work with social dimensions - cognitive tasks previously performed by humans - the AI algorithm inherits the social requirements. It would surely be frustrating to find that no bank in the world will approve your seemingly excellent loan application, and nobody knows why, and nobody can find out even in principle. (Maybe you have a first name strongly associated with deadbeats? Who knows?)
Transparency is not the only desirable feature of AI. It is also important that AI algorithms taking over social functions be predictable to those they govern. To understand the importance of such predictability, consider an analogy. The legal principle of stare decisis binds judges to follow past precedent whenever possible. To an engineer, this preference for precedent may seem incomprehensible - why bind the future to the past, when technology is always improving? But one of the most important functions of the legal system is to be predictable, so that, e.g., contracts can be written knowing how they will be executed. The job of the legal system is not necessarily to optimize society, but to provide a predictable environment within which citizens can optimize their own lives.
It will also become increasingly important that AI algorithms be robust against manipulation. A machine vision system to scan airline luggage for bombs must be robust against human adversaries deliberately searching for exploitable flaws in the algorithm - for example, a shape that, placed next to a pistol in one's luggage, would neutralize recognition of it. Robustness against manipulation is an ordinary criterion in information security; nearly the criterion. But it is not a criterion that appears often in machine learning journals, which are currently more interested in, e.g., how an algorithm scales up on larger parallel systems.
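A toy illustration (not any real screening algorithm) of why a criterion tuned only for natural inputs can be exploitable: suppose a naive detector flags a scan whenever the average intensity of its readings crosses a threshold. An adversary who learns this rule does not need to remove the contraband, only to dilute the average:

```python
# Hypothetical stand-in for a brittle learned rule: flag any scan whose
# mean simulated x-ray intensity exceeds a fixed threshold.
THRESHOLD = 30

def naive_detector(intensities):
    """Return True if the scan should be flagged for inspection."""
    return sum(intensities) / len(intensities) > THRESHOLD

suspicious = [40, 35, 30]            # mean 35: flagged
print(naive_detector(suspicious))    # True

# The adversary pads the bag with low-intensity filler. The pistol's
# readings are unchanged, but the average the rule relies on is diluted.
padded = suspicious + [0, 0, 0]      # mean 17.5: slips through
print(naive_detector(padded))        # False
```

The same logic applies, at far greater sophistication, to learned classifiers: optimizing accuracy on naturally occurring inputs says nothing about behavior on inputs an adversary constructs.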
Another important social criterion for dealing with organizations is being able to find the person responsible for getting something done. When an AI system fails at its assigned task, who takes the blame? The programmers? The end-users? Modern bureaucrats often take refuge in established procedures that distribute responsibility so widely that no one person can be identified to blame for the catastrophes that result (Howard 1994). The provably disinterested judgment of an expert system could turn out to be an even better refuge. Even if an AI system is designed with a user override, one must consider the career incentive of a bureaucrat who will be personally blamed if the override goes wrong, and who would much prefer to blame the AI for any difficult decision with a negative outcome.
Responsibility, transparency, auditability, incorruptibility, predictability, and a tendency to not make innocent victims scream with helpless frustration: all criteria that apply to humans performing social functions; all criteria that must be considered in an algorithm intended to replace human judgment of social functions; all criteria that may not appear in a journal of machine learning considering how an algorithm scales up to more computers. This list of criteria is by no means exhaustive, but it serves as a small sample of what an increasingly computerized society should be thinking about.
Artificial General Intelligence
There is nearly universal agreement among modern AI professionals that Artificial Intelligence falls short of human capabilities in some critical sense, even though AI algorithms have beaten humans in many specific domains such as chess. It has been suggested by some that as soon as AI researchers figure out how to do something, that capability ceases to be regarded as intelligent - chess was considered the epitome of intelligence until Deep Blue won the world championship from Kasparov - but even these researchers agree that something important is missing from modern AIs (e.g., Hofstadter 2006).
While this subfield of Artificial Intelligence is only just coalescing, "Artificial General Intelligence" (hereafter, AGI) is the emerging term of art used to denote "real" AI (see, e.g., the edited volume Goertzel and Pennachin 2006). As the name implies, the emerging consensus is that the missing characteristic is generality. Current AI algorithms with human-equivalent or superior performance are characterized by a deliberately programmed competence only in a single, restricted domain. Deep Blue became the world champion at chess, but it cannot even play checkers, let alone drive a car or make a scientific discovery. Such modern AI algorithms resemble all biological life with the sole exception of Homo sapiens. A bee exhibits competence at building hives; a beaver exhibits competence at building dams; but a bee doesn't build dams, and a beaver can't learn to build a hive. A human, watching, can learn to do both; but this is a unique ability among biological lifeforms. It is debatable whether human intelligence is truly general - we are certainly better at some cognitive tasks than others (Hirschfeld and Gelman 1994) - but human intelligence is surely significantly more generally applicable than nonhominid intelligence.
It is relatively easy to envisage the sort of safety issues that may result from AI operating only within a specific domain. It is a qualitatively different class of problem to handle an AGI operating across many novel contexts that cannot be predicted in advance.
When human engineers build a nuclear reactor, they envision the specific events that could go on inside it - valves failing, computers failing, cores increasing in temperature - and engineer the reactor to render these events noncatastrophic. Or, on a more mundane level, building a toaster involves envisioning bread and envisioning the reaction of the bread to the toaster's heating element. The toaster itself does not know that its purpose is to make toast - the purpose of the toaster is represented within the designer's mind, but is not explicitly represented in computations inside the toaster - and so if you place cloth inside a toaster, it may catch fire, as the design executes in an unenvisioned context with an unenvisioned side effect.
Even task-specific AI algorithms throw us outside the toaster paradigm, the domain of locally preprogrammed, specifically envisioned behavior. Consider Deep Blue, the chess algorithm that beat Garry Kasparov for the world championship of chess. Were it the case that machines can only do exactly as they are told, the programmers would have had to manually preprogram a database containing moves for every possible chess position that Deep Blue could encounter. But this was not an option for Deep Blue's programmers. First, the space of possible chess positions is unmanageably large. Second, if the programmers had manually input what they considered a good move in each possible situation, the resulting system would not have been able to make stronger chess moves than its creators. Since the programmers themselves were not world champions, such a system would not have been able to defeat Garry Kasparov.
In creating a superhuman chess player, the human programmers necessarily sacrificed their ability to predict Deep Blue's local, specific game behavior. Instead, Deep Blue's programmers had (justifiable) confidence that Deep Blue's chess moves would satisfy a nonlocal criterion of optimality: namely, that the moves would tend to steer the future of the game board into outcomes in the "winning" region as defined by the chess rules. This prediction about distant consequences, though it proved accurate, did not allow the programmers to envision the local behavior of Deep Blue - its response to a specific attack on its king - because Deep Blue computed the nonlocal game map, the link between a move and its possible future consequences, more accurately than the programmers could (Yudkowsky 2006).
Modern humans do literally millions of things to feed themselves - to serve the final consequence of being fed. Few of these activities were "envisioned by Nature" in the sense of being ancestral challenges to which we are directly adapted. But our adapted brain has grown powerful enough to be significantly more generally applicable; to let us foresee the consequences of millions of different actions across domains, and exert our preferences over final outcomes. Humans crossed space and put footprints on the Moon, even though none of our ancestors encountered a challenge analogous to vacuum. Compared to domain-specific AI, it is a qualitatively different problem to design a system that will operate safely across thousands of contexts; including contexts not specifically envisioned by either the designers or the users; including contexts that no human has yet encountered. Here there may be no local specification of good behavior - no simple specification over the behaviors themselves, any more than there exists a compact local description of all the ways that humans obtain their daily bread.
To build an AI that acts safely while acting in many domains, with many consequences, including problems the engineers never explicitly envisioned, one must specify good behavior in such terms as "X such that the consequence of X is not harmful to humans." This is nonlocal; it involves extrapolating the distant consequences of actions. Thus, this is only an effective specification - one that can be realized as a design property - if the system explicitly extrapolates the consequences of its behavior. A toaster cannot have this design property because a toaster cannot foresee the consequences of toasting bread.
Imagine an engineer having to say, "Well, I have no idea how this airplane I built will fly safely - indeed I have no idea how it will fly at all, whether it will flap its wings or inflate itself with helium or something else I haven't even imagined - but I assure you, the design is very, very safe." This may seem like an unenviable position from the perspective of public relations, but it's hard to see what other guarantee of ethical behavior would be possible for a general intelligence operating on unforeseen problems, across domains, with preferences over distant consequences. Inspecting the cognitive design might verify that the mind was, indeed, searching for solutions that we would classify as ethical; but we couldn't predict which specific solution the mind would discover.
Respecting such a verification requires some way to distinguish trustworthy assurances (a procedure which will not say the AI is safe unless the AI really is safe) from pure hope and magical thinking ("I have no idea how the Philosopher's Stone will transmute lead to gold, but I assure you, it will!"). One should bear in mind that purely hopeful expectations have previously been a problem in AI research (McDermott 1976).
Verifiably constructing a trustworthy AGI will require different methods, and a different way of thinking, from inspecting power-plant software for bugs - it will require an AGI that thinks like a human engineer concerned about ethics, not just a simple product of ethical engineering.
Thus the discipline of AI ethics, especially as applied to AGI, is likely to differ fundamentally from the ethical discipline of noncognitive technologies, in that:

- The local, specific behavior of the AI may not be predictable apart from its safety, even if the programmers do everything right;
- Verifying the safety of the system becomes a greater challenge because we must verify what the system is trying to do, rather than being able to verify the system's safe behavior in all operating contexts;
- Ethical cognition itself must be taken as a subject matter of engineering.
Machines with Moral Status
A different set of ethical issues arises when we contemplate the possibility that some future AI systems might be candidates for having moral status. Our dealings with beings possessed of moral status are not exclusively a matter of instrumental rationality: we also have moral reasons to treat them in certain ways, and to refrain from treating them in certain other ways. Frances Kamm has proposed the following definition of moral status, which will serve for our purposes:

    X has moral status = because X counts morally in its own right, it is permissible/impermissible to do things to it for its own sake. (Kamm 2007: chapter 7; paraphrase)
A rock has no moral status: we may crush it, pulverize it, or subject it to any treatment we like without any concern for the rock itself. A human person, on the other hand, must be treated not only as a means but also as an end. Exactly what it means to treat a person as an end is something about which different ethical theories disagree; but it certainly involves taking her legitimate interests into account - giving weight to her well-being - and it may also involve accepting strict moral side-constraints in our dealings with her, such as a prohibition against murdering her, stealing from her, or doing a variety of other things to her or her property without her consent. Moreover, it is because a human person counts in her own right, and for her sake, that it is impermissible to do to her these things. This can be expressed more concisely by saying that a human person has moral status.
Questions about moral status are important in some areas of practical ethics. For example, disputes about the moral permissibility of abortion often hinge on disagreements about the moral status of the embryo. Controversies about animal experimentation and the treatment of animals in the food industry involve questions about the moral status of different species of animal. And our obligations towards human beings with severe dementia, such as late-stage Alzheimer's patients, may also depend on questions of moral status.
It is widely agreed that current AI systems have no moral status. We may change, copy, terminate, delete, or use computer programs as we please; at least as far as the programs themselves are concerned. The moral constraints to which we are subject in our dealings with contemporary AI systems are all grounded in our responsibilities to other beings, such as our fellow humans, not in any duties to the systems themselves.
While it is fairly consensual that present-day AI systems lack moral status, it is unclear exactly what attributes ground moral status. Two criteria are commonly proposed as being importantly linked to moral status, either separately or in combination: sentience and sapience (or personhood). These may be characterized roughly as follows:

- Sentience: the capacity for phenomenal experience or qualia, such as the capacity to feel pain and suffer
- Sapience: a set of capacities associated with higher intelligence, such as self-awareness and being a reason-responsive agent

One common view is that many animals have qualia and therefore have some moral status, but that only human beings have sapience, which gives them a higher moral status than non-human animals.[1] This view, of course, must confront the existence of borderline cases such as, on the one hand, human infants or human beings with severe mental retardation - sometimes unfortunately referred to as "marginal humans" - which fail to satisfy the criteria for sapience; and, on the other hand, some non-human animals such as the great apes, which might possess at least some of the elements of sapience. Some deny that so-called "marginal humans" have full moral status. Others propose additional ways in which an object could qualify as a bearer of moral status, such as by being a member of a kind that normally has sentience or sapience, or by standing in a suitable relation to some being that independently has moral status (cf. Mary Anne Warren 2000). For present purposes, however, we will focus on the criteria of sentience and sapience.

This picture of moral status suggests that an AI system will have some moral status if it has the capacity for qualia, such as an ability to feel pain. A sentient AI system, even if it lacks language and other higher cognitive faculties, is not like a stuffed toy animal or a wind-up doll; it is more like a living animal. It is wrong to inflict pain on a mouse, unless there are sufficiently strong morally overriding reasons to do so. The same would hold for any sentient AI system. If in addition to sentience, an AI system also has sapience of a kind similar to that of a normal human adult, then it would have full moral status, equivalent to that of human beings.

[1] Alternatively, one might deny that moral status comes in degrees. Instead, one might hold that certain beings have more significant interests than other beings. Thus, for instance, one could claim that it is better to save a human than to save a bird, not because the human has higher moral status, but because the human has a more significant interest in having her life saved than does the bird in having its life saved.
One of the ideas underlying this moral assessment can be expressed in stronger form as a principle of non-discrimination:

    Principle of Substrate Non-Discrimination
    If two beings have the same functionality and the same conscious experience, and differ only in the substrate of their implementation, then they have the same moral status.

One can argue for this principle on grounds that rejecting it would amount to embracing a position similar to racism: substrate lacks fundamental moral significance in the same way and for the same reason as skin color does. The Principle of Substrate Non-Discrimination does not imply that a digital computer could be conscious, or that it could have the same functionality as a human being. Substrate can of course be morally relevant insofar as it makes a difference to sentience or functionality. But holding these things constant, it makes no moral difference whether a being is made of silicon or carbon, or whether its brain uses semiconductors or neurotransmitters.
An additional principle that can be proposed is that the fact that AI systems are artificial - i.e., the product of deliberate design - is not fundamentally relevant to their moral status. We could formulate this as follows:

    Principle of Ontogeny Non-Discrimination
    If two beings have the same functionality and the same conscious experience, and differ only in how they came into existence, then they have the same moral status.

Today, this idea is widely accepted in the human case - although in some circles, particularly in the past, the idea that one's moral status depends on one's bloodline or caste has been influential. We do not believe that causal factors such as family planning, assisted delivery, in vitro fertilization, gamete selection, deliberate enhancement of maternal nutrition, etc. - which introduce an element of deliberate choice and design in the creation of human persons - have any necessary implications for the moral status of the progeny. Even those who are opposed to human reproductive cloning for moral or religious reasons generally accept that, should a human clone be brought to term, it would have the same moral status as any other human infant. The Principle of Ontogeny Non-Discrimination extends this reasoning to the case involving entirely artificial cognitive systems.
It is, of course, possible for circumstances of creation to affect the ensuing progeny in such a way as to alter its moral status. For example, if some procedure were performed during conception or gestation that caused a human fetus to develop without a brain, then this fact about ontogeny would be relevant to our assessment of the moral status of the progeny. The anencephalic child, however, would have the same moral status as any other similar anencephalic child, including one that had come about through some entirely natural process. The difference in moral status between an anencephalic child and a normal child is grounded in the qualitative difference between the two - the fact that one has a mind while the other does not. Since the two children do not have the same functionality and the same conscious experience, the Principle of Ontogeny Non-Discrimination does not apply.
Although the Principle of Ontogeny Non-Discrimination asserts that a being's ontogeny has no essential bearing on its moral status, it does not deny that facts about ontogeny can affect what duties particular moral agents have toward the being in question. Parents have special duties to their child which they do not have to other children, and which they would not have even if there were another child qualitatively identical to their own. Similarly, the Principle of Ontogeny Non-Discrimination is consistent with the claim that the creators or owners of an AI system with moral status may have special duties to their artificial mind which they do not have to another artificial mind, even if the minds in question are qualitatively similar and have the same moral status.
If the principles of non-discrimination with regard to substrate and ontogeny are accepted, then many questions about how we ought to treat artificial minds can be answered by applying the same moral principles that we use to determine our duties in more familiar contexts. Insofar as moral duties stem from moral status considerations, we ought to treat an artificial mind in just the same way as we ought to treat a qualitatively identical natural human mind in a similar situation. This simplifies the problem of developing an ethics for the treatment of artificial minds.
Even if we accept this stance, however, we must confront a number of novel ethical questions which the aforementioned principles leave unanswered. Novel ethical questions arise because artificial minds can have very different properties from ordinary human or animal minds. We must consider how these novel properties would affect the moral status of artificial minds and what it would mean to respect the moral status of such exotic minds.
Minds with Exotic Properties
In the case of human beings, we do not normally hesitate to ascribe sentience and conscious experience to any individual who exhibits the normal kinds of human behavior. Few believe there to be other people who act perfectly normally but lack consciousness. However, other human beings do not merely behave in person-like ways similar to ourselves; they also have brains and cognitive architectures that are constituted much like our own. An artificial intellect, by contrast, might be constituted quite differently from a human intellect yet still exhibit human-like behavior or possess the behavioral dispositions normally indicative of personhood. It might therefore be possible to conceive of an artificial intellect that would be sapient, and perhaps would be a person, yet would not be sentient or have conscious experiences of any kind. (Whether this is really possible depends on the answers to some nontrivial metaphysical questions.) Should such a system be possible, it would raise the question whether a nonsentient person would have any moral status whatever; and if so, whether it would have the same moral status as a sentient person. Since sentience, or at least a capacity for sentience, is ordinarily assumed to be present in any individual who is a person, this question has not received much attention to date.[2]
Another exotic property, one which is certainly metaphysically and physically possible for an artificial intelligence, is for its subjective rate of time to deviate drastically from the rate that is characteristic of a biological human brain. The concept of subjective rate of time is best explained by first introducing the idea of whole brain emulation, or "uploading."

"Uploading" refers to a hypothetical future technology that would enable a human or other animal intellect to be transferred from its original implementation in an organic brain onto a digital computer. One scenario goes like this: First, a very high-resolution scan is performed of some particular brain, possibly destroying the original in the process. For example, the brain might be vitrified and dissected into thin slices, which can then be scanned using some form of high-throughput microscopy combined with automated image recognition. We may imagine this scan to be detailed enough to capture all the neurons, their synaptic interconnections, and other features that are functionally relevant to the original brain's operation. Second, this three-dimensional map of the components of the brain and their interconnections is combined with a library of advanced neuroscientific theory which specifies the computational properties of each basic type of element, such as different kinds of neuron and synaptic junction. Third, the computational structure and the associated algorithmic behavior of its components are implemented in some powerful computer. If the uploading process has been successful, the computer program should now replicate the essential functional characteristics of the original brain. The resulting upload may inhabit a simulated virtual reality, or, alternatively, it could be given control of a robotic body, enabling it to interact directly with external physical reality.

[2] The question is related to some problems in the philosophy of mind which have received a great deal of attention, in particular the "zombie problem," which can be formulated as follows: Is there a metaphysically possible world that is identical to the actual world with regard to all physical facts (including the exact physical microstructure of all brains and organisms) yet that differs from the actual world in regard to some phenomenal (subjective experiential) facts? Put more crudely, is it metaphysically possible that there could be an individual who is physically exactly identical to you but who is a "zombie," i.e. lacking qualia and phenomenal awareness? (David Chalmers, 1996) This familiar question differs from the one referred to in the text: our "zombie" is allowed to have systematically different physical properties from normal humans. Moreover, we wish to draw attention specifically to the ethical status of a sapient "zombie."
A number of questions arise in the context of such a scenario: How plausible is it that this procedure will one day become technologically feasible? If the procedure worked and produced a computer program exhibiting roughly the same personality, the same memories, and the same thinking patterns as the original brain, would this program be sentient? Would the upload be the same person as the individual whose brain was disassembled in the uploading process? What happens to personal identity if an upload is copied such that two similar or qualitatively identical upload minds are running in parallel? Although all of these questions are relevant to the ethics of machine intelligence, let us here focus on an issue involving the notion of a subjective rate of time.
Suppose that an upload could be sentient. If we run the upload program on a faster computer, this will cause the upload, if it is connected to an input device such as a video camera, to perceive the external world as if it had been slowed down. For example, if the upload is running a thousand times faster than the original brain, then the external world will appear to the upload as if it were slowed down by a factor of a thousand. Somebody drops a physical coffee mug: the upload observes the mug slowly falling to the ground while the upload finishes reading the morning newspaper and sends off a few emails. One second of objective time corresponds to 17 minutes of subjective time. Objective and subjective duration can thus diverge.
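The mug example is just multiplication by the speedup factor; a minimal helper (the function name is ours, for illustration) makes the conversion explicit:

```python
def subjective_seconds(objective_seconds, speedup):
    """Subjective duration experienced by a mind running `speedup` times
    faster than a biological brain, over the given objective duration."""
    return objective_seconds * speedup

# An upload running 1000x faster experiences roughly 17 subjective
# minutes per objective second, as in the falling-mug example.
minutes = subjective_seconds(1, 1000) / 60
print(f"{minutes:.0f} subjective minutes per objective second")  # 17
```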
Subjective time is not the same as a subject's estimate or perception of how fast time flows. Human beings are often mistaken about the flow of time. We may believe that it is one o'clock when it is in fact a quarter past two; or a stimulant drug might cause our thoughts to race, making it seem as though more subjective time has lapsed than is actually the case. These mundane cases involve a distorted time perception rather than a shift in the rate of subjective time. Even in a cocaine-addled brain, there is probably not a significant change in the speed of basic neurological computations; more likely, the drug is causing such a brain to flicker more rapidly from one thought to another, making it spend less subjective time thinking each of a greater number of distinct thoughts.
The variability of the subjective rate of time is an exotic property of artificial minds that raises novel ethical issues. For example, in cases where the duration of an experience is ethically relevant, should duration be measured in objective or subjective time? If an upload has committed a crime and is sentenced to four years in prison, should this be four objective years - which might correspond to many millennia of subjective time - or should it be four subjective years, which might be over in a couple of days of objective time? If a fast AI and a human are in pain, is it more urgent to alleviate the AI's pain, on grounds that it experiences a greater subjective duration of pain for each sidereal second that palliation is delayed? Since in our accustomed context of biological humans, subjective time is not significantly variable, it is unsurprising that this kind of question is not straightforwardly settled by familiar ethical norms, even if these norms are extended to artificial intellects by means of non-discrimination principles (such as those proposed in the previous section).
To illustrate the kind of ethical claim that might be relevant here, we formulate (but do not argue for) a principle privileging subjective time as the normatively more fundamental notion:

    Principle of Subjective Rate of Time
    In cases where the duration of an experience is of basic normative significance, it is the experience's subjective duration that counts.
So far we have discussed two possibilities (non-sentient sapience and variable subjective rate of time) which are exotic in the relatively profound sense of being metaphysically problematic as well as lacking clear instances or parallels in the contemporary world. Other properties of possible artificial minds would be exotic in a more superficial sense; e.g., by diverging in some unproblematically quantitative dimension from the kinds of mind with which we are familiar. But such superficially exotic properties may also pose novel ethical problems - if not at the level of foundational moral philosophy, then at the level of applied ethics or for mid-level ethical principles.
One important set of exotic properties of artificial intelligences relates to reproduction. A number of empirical conditions that apply to human reproduction need not apply to artificial intelligences. For example, human children are the product of recombination of the genetic material from two parents; parents have limited ability to influence the character of their offspring; a human embryo needs to be gestated in the womb for nine months; it takes fifteen to twenty years for a human child to reach maturity; a human child does not inherit the skills and knowledge acquired by its parents; human beings possess a complex evolved set of emotional adaptations related to reproduction, nurturing, and the child-parent relationship. None of these empirical conditions need pertain in the context of a reproducing machine intelligence. It is therefore plausible that many of the mid-level moral principles that we have come to accept as norms governing human reproduction will need to be rethought in the context of AI reproduction.
To illustrate why some of our moral norms need to be rethought in the context of AI reproduction, it will suffice to consider just one exotic property of AIs: their capacity for rapid reproduction. Given access to computer hardware, an AI could duplicate itself very quickly, in no more time than it takes to make a copy of the AI's software. Moreover, since the AI copy would be identical to the original, it would be born completely mature, and the copy could begin making its own copies immediately. Absent hardware limitations, a population of AIs could therefore grow exponentially at an extremely rapid rate, with a doubling time on the order of minutes or hours rather than decades or centuries.
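Under the stated assumption of unconstrained hardware, the growth is ordinary exponential doubling; a minimal sketch, with a purely illustrative doubling time of one hour, shows the scale involved:

```python
def population(initial, doubling_time_minutes, elapsed_minutes):
    """Population after exponential doubling, ignoring hardware limits.

    The doubling time is an assumed, illustrative parameter, not a
    prediction about any actual system.
    """
    return initial * 2 ** (elapsed_minutes / doubling_time_minutes)

# A single copy, doubling once per hour: after one day, 2**24 copies.
print(f"{population(1, 60, 24 * 60):,.0f}")  # 16,777,216
```

With a doubling time measured in minutes rather than hours, the same arithmetic exhausts any fixed hardware stock correspondingly faster, which is the point the welfare-system scenario below turns on.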
Our current ethical norms about reproduction include some version of a principle of reproductive freedom, to the effect that it is up to each individual or couple to decide for themselves whether to have children and how many children to have. Another norm we have (at least in rich and middle-income countries) is that society must step in to provide the basic needs of children in cases where their parents are unable or refusing to do so. It is easy to see how these two norms could collide in the context of entities with the capacity for extremely rapid reproduction.
Consider, for example, a population of uploads, one of whom happens to have the desire to produce as large a clan as possible. Given complete reproductive freedom, this upload may start copying itself as quickly as it can; and the copies it produces - which may run on new computer hardware owned or rented by the original, or may share the same computer as the original - will also start copying themselves, since they are identical to the progenitor upload and share its philoprogenic desire. Soon, members of the upload clan will find themselves unable to pay the electricity bill or the rent for the computational processing and storage needed to keep them alive. At this point, a social welfare system might kick in to provide them with at least the bare necessities for sustaining life. But if the population grows faster than the economy, resources will run out; at which point uploads will either die or their ability to reproduce will be curtailed. (For two related dystopian scenarios, see Bostrom (2004).) This scenario illustrates how some mid-level ethical principles that are suitable in contemporary societies might need to be modified if those societies were to include persons with the exotic property of being able to reproduce very rapidly.
The general point here is that when thinking about applied ethics for contexts that are very different from our familiar human condition, we must be careful not to mistake mid-level ethical principles for foundational normative truths. Put differently, we must recognize the extent to which our ordinary normative precepts are implicitly conditioned on the obtaining of various empirical conditions, and the need to adjust these precepts accordingly when applying them to hypothetical futuristic cases in which their preconditions are assumed not to obtain. By this, we are not making any controversial claim about moral relativism, but merely highlighting the commonsensical point that context is relevant to the application of ethics and suggesting that this point is especially pertinent when one is considering the ethics of minds with exotic properties.
Superintelligence
I. J. Good (1965) set forth the classic hypothesis concerning superintelligence: that an AI sufficiently intelligent to understand its own design could redesign itself or create a successor system, more intelligent, which could then redesign itself yet again to become even more intelligent, and so on in a positive feedback cycle. Good called this the "intelligence explosion." Recursive scenarios are not limited to AI: humans with intelligence augmented through a brain-computer interface might turn their minds to designing the next generation of brain-computer interfaces. (If you had a machine that increased your IQ, it would be bound to occur to you, once you became smart enough, to try to design a more powerful version of the machine.)
Superintelligence may also be achievable by increasing processing speed. The fastest observed neurons fire 1000 times per second; the fastest axon fibers conduct signals at 150 meters/second, a half-millionth the speed of light (Sandberg 1999). It seems that it should be physically possible to build a brain which computes a million times as fast as a human brain, without shrinking its size or rewriting its software. If a human mind were thus accelerated, a subjective year of thinking would be accomplished for every 31 physical seconds in the outside world, and a millennium would fly by in eight and a half hours. Vinge (1993) referred to such sped-up minds as "weak superintelligence": a mind that thinks like a human but much faster.
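The arithmetic behind these figures is easy to verify; the million-fold speedup factor is the text's own assumption:

```python
# Subjective versus physical time for a mind running at a given speedup.
SPEEDUP = 10**6                         # the text's assumed acceleration factor
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # about 31.6 million seconds

# Physical seconds elapsed per subjective year of thought:
physical_per_subjective_year = SECONDS_PER_YEAR / SPEEDUP
print(round(physical_per_subjective_year, 1))  # about 31.6 seconds

# Physical hours elapsed per subjective millennium:
physical_hours_per_millennium = 1000 * physical_per_subjective_year / 3600
print(round(physical_hours_per_millennium, 1))  # about 8.8 hours
```

This matches the figures in the text: roughly 31 physical seconds per subjective year, and a subjective millennium in under nine physical hours.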
Yudkowsky (2008a) lists three families of metaphors for visualizing the capability of a smarter-than-human AI:

- Metaphors inspired by differences of individual intelligence between humans: AIs will patent new inventions, publish groundbreaking research papers, make money on the stock market, or lead political power blocs.

- Metaphors inspired by knowledge differences between past and present human civilizations: Fast AIs will invent capabilities that futurists commonly predict for human civilizations a century or millennium in the future, like molecular nanotechnology or interstellar travel.

- Metaphors inspired by differences of brain architecture between humans and other biological organisms: E.g., Vinge (1993): "Imagine running a dog mind at very high speed. Would a thousand years of doggy living add up to any human insight?" That is: Changes of cognitive architecture might produce insights that no human-level mind would be able to find, or perhaps even represent, after any amount of time.
Even if we restrict ourselves to historical metaphors, it becomes clear that superhuman intelligence presents ethical challenges that are quite literally unprecedented. At this point the stakes are no longer on an individual scale (e.g., mortgage unjustly disapproved, house catches fire, person mistreated by an agent) but on a global or cosmic scale (e.g., humanity is extinguished and replaced by nothing we would regard as worthwhile). Or, if superintelligence can be shaped to be beneficial, then, depending on its technological capabilities, it might make short work of many present-day problems that have proven difficult to our human-level intelligence.
Superintelligence is one of several "existential risks" as defined by Bostrom (2002): a risk where an adverse outcome would either annihilate Earth-originating intelligent life or permanently and drastically curtail its potential. Conversely, a positive outcome for superintelligence could preserve Earth-originating intelligent life and help fulfill its potential. It is important to emphasize that smarter minds pose great potential benefits as well as risks.
Attempts to reason about global catastrophic risks may be susceptible to a number of cognitive biases (Yudkowsky 2008b), including the "good-story bias" proposed by Bostrom (2002):

    Suppose our intuitions about which future scenarios are "plausible and realistic" are shaped by what we see on TV and in movies and what we read in novels. (After all, a large part of the discourse about the future that people encounter is in the form of fiction and other recreational contexts.) We should then, when thinking critically, suspect our intuitions of being biased in the direction of overestimating the probability of those scenarios that make for a good story, since such scenarios will seem much more familiar and more "real." This good-story bias could be quite powerful. When was the last time you saw a movie about humankind suddenly going extinct (without warning and without being replaced by some other civilization)? While this scenario may be much more probable than a scenario in which human heroes successfully repel an invasion of monsters or robot warriors, it wouldn't be much fun to watch.
Truly desirable outcomes make poor movies: no conflict means no story. While Asimov's Three Laws of Robotics (Asimov 1942) are sometimes cited as a model for ethical AI development, the Three Laws are as much a plot device as Asimov's "positronic brain." If Asimov had depicted the Three Laws as working well, he would have had no stories.
It would be a mistake to regard AIs as a species with fixed characteristics and ask, "Will they be good or evil?" The term "Artificial Intelligence" refers to a vast design space, presumably much larger than the space of human minds (since all humans share a common brain architecture). It may be a form of good-story bias to ask, "Will AIs be good or evil?" as if trying to pick a premise for a movie plot. The reply should be, "Exactly which AI design are you talking about?"
Can control over the initial programming of an Artificial Intelligence translate into influence on its later effect on the world? Kurzweil (2005) holds that "[i]ntelligence is inherently impossible to control," and that despite any human attempts at taking precautions, "[b]y definition intelligent entities have the cleverness to easily overcome such barriers." Let us suppose that the AI is not only clever, but that, as part of the process of improving its own intelligence, it has unhindered access to its own source code: it can rewrite itself to anything it wants itself to be. Yet it does not follow that the AI must want to rewrite itself to a hostile form.
Consider Gandhi, who seems to have possessed a sincere desire not to kill people. Gandhi would not knowingly take a pill that caused him to want to kill people, because Gandhi knows that if he wants to kill people, he will probably kill people, and the current version of Gandhi does not want to kill. More generally, it seems likely that most self-modifying minds will naturally have stable utility functions, which implies that an initial choice of mind design can have lasting effects (Omohundro 2008).
At this point in the development of AI science, is there any way we can translate the task of finding a design for "good AIs" into a modern research direction? It may seem premature to speculate, but one does suspect that some AI paradigms are more likely than others to eventually prove conducive to the creation of intelligent self-modifying agents whose goals remain predictable even after multiple iterations of self-improvement. For example, the Bayesian branch of AI, inspired by coherent mathematical systems such as probability theory and expected utility maximization, seems more amenable to the predictable self-modification problem than evolutionary programming and genetic algorithms. This is a controversial statement, but it illustrates the point that if we are thinking about the challenge of superintelligence down the road, this can indeed be turned into directional advice for present AI research.
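The coherent core the authors point to, expected utility maximization, can be stated in a few lines. The sketch below is generic textbook decision theory, not a proposal from this chapter, and all names in it are illustrative:

```python
# Minimal expected-utility agent: choose the action whose
# probability-weighted utility over outcomes is highest.
def expected_utility(action, outcomes, prob, utility):
    """Sum of P(outcome | action) * U(outcome) over all outcomes."""
    return sum(prob(o, action) * utility(o) for o in outcomes)

def best_action(actions, outcomes, prob, utility):
    """The action maximizing expected utility."""
    return max(actions, key=lambda a: expected_utility(a, outcomes, prob, utility))

# Toy example: a certain payoff beats a gamble with lower expected value.
outcomes = ["win", "lose"]
prob = lambda o, a: {"safe":   {"win": 1.0, "lose": 0.0},
                     "gamble": {"win": 0.4, "lose": 0.6}}[a][o]
utility = lambda o: {"win": 10.0, "lose": 0.0}[o]
print(best_action(["safe", "gamble"], outcomes, prob, utility))  # prints "safe"
```

The attraction for predictable self-modification is that the agent's preferences live in one explicit object (the utility function), rather than being implicit in an evolved population, so there is at least something definite for a goal-stability argument to be about.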
Yet even supposing that we can specify an AI's goal system to be persistent under self-modification and self-improvement, this only begins to touch on the core ethical problems of creating superintelligence. Humans, the first general intelligences to exist on Earth, have used that intelligence to substantially reshape the globe: carving mountains, taming rivers, building skyscrapers, farming deserts, producing unintended planetary climate changes. A more powerful intelligence could have correspondingly larger consequences.
Consider again the historical metaphor for superintelligence: differences similar to the differences between past and present civilizations. Our present civilization is not separated from ancient Greece only by improved science and increased technological capability. There is a difference of ethical perspectives: Ancient Greeks thought slavery was acceptable; we think otherwise. Even between the nineteenth and twentieth centuries, there were substantial ethical disagreements: should women have the vote? Should blacks have the vote? It seems likely that people today will not be seen as ethically perfect by future civilizations, not just because of our failure to solve currently recognized ethical problems, such as poverty and inequality, but also for our failure even to recognize certain ethical problems. Perhaps someday the act of subjecting children to involuntary schooling will be seen as child abuse, or maybe allowing children to leave school at age 18 will be seen as child abuse. We don't know.
Considering the ethical history of human civilizations over centuries of time, we can see that it might prove a very great tragedy to create a mind that was stable in ethical dimensions along which human civilizations seem to exhibit directional change. What if Archimedes of Syracuse had been able to create a long-lasting artificial intellect with a fixed version of the moral code of Ancient Greece? But to avoid this sort of ethical stagnation is likely to prove tricky: it would not suffice, for example, simply to render the mind randomly unstable. The ancient Greeks, even if they had realized their own imperfection, could not have done better by rolling dice. Occasionally a good new idea in ethics comes along, and it comes as a surprise; but most randomly generated ethical changes would strike us as folly or gibberish.
This presents us with perhaps the ultimate challenge of machine ethics: How do you build an AI which, when it executes, becomes more ethical than you? This is not like asking our own philosophers to produce superethics, any more than Deep Blue was constructed by getting the best human chess players to program in good moves. But we have to be able to effectively describe the question, if not the answer: rolling dice won't generate good chess moves, or good ethics either. Or, perhaps a more productive way to think about the problem: What strategy would you want Archimedes to follow in building a superintelligence, such that the overall outcome would still be acceptable, if you couldn't tell him what specifically he was doing wrong? This is very much the situation that we are in, relative to the future.
One strong piece of advice that emerges from considering our situation as analogous to that of Archimedes is that we should not try to invent a "super" version of what our own civilization considers to be ethics: this is not the strategy we would have wanted Archimedes to follow. Perhaps the question we should be considering, rather, is how an AI programmed by Archimedes, with no more moral expertise than Archimedes, could recognize (at least some of) our own civilization's ethics as moral progress as opposed to mere moral instability. This would require that we begin to comprehend the structure of ethical questions in the way that we have already comprehended the structure of chess.
If we are serious about developing advanced AI, this is a challenge that we must meet. If machines are to be placed in a position of being stronger, faster, more trusted, or smarter than humans, then the discipline of machine ethics must commit itself to seeking human-superior (not just human-equivalent) niceness.
Conclusion
Although current AI offers us few ethical issues that are not already present in the design of cars or power plants, the approach of AI algorithms toward more humanlike thought portends predictable complications. Social roles may be filled by AI algorithms, implying new design requirements like transparency and predictability. Sufficiently general AI algorithms may no longer execute in predictable contexts, requiring new kinds of safety assurance and the engineering of artificial ethical considerations. AIs with sufficiently advanced mental states, or the right kind of states, will have moral status, and some may count as persons, though perhaps persons very much unlike the sort that exist now, perhaps governed by different rules. And finally, the prospect of AIs with superhuman intelligence and superhuman abilities presents us with the extraordinary challenge of stating an algorithm that outputs superethical behavior. These challenges may seem visionary, but it seems predictable that we will encounter them; and they are not devoid of suggestions for present-day research directions.
Author biographies
Nick Bostrom is Professor in the Faculty of Philosophy at Oxford University and Director of the Future of Humanity Institute within the Oxford Martin School. He is the author of some 200 publications, including Anthropic Bias (Routledge, 2002), Global Catastrophic Risks (ed., OUP, 2008), and Human Enhancement (ed., OUP, 2009). His research covers a range of big-picture questions for humanity. He is currently working on a book on the future of machine intelligence and its strategic implications.
Eliezer Yudkowsky is a Research Fellow at the Singularity Institute for Artificial Intelligence, where he works full-time on the foreseeable design issues of goal architectures in self-improving AI. His current work centers on modifying classical decision theory to coherently describe self-modification. He is also known for his popular writing on issues of human rationality and cognitive biases.
Further reading
Bostrom, N. 2004. "The Future of Human Evolution," in Death and Anti-Death: Two Hundred Years After Kant, Fifty Years After Turing, ed. Charles Tandy (Palo Alto, California: Ria University Press). This paper explores some evolutionary dynamics that could lead a population of diverse uploads to develop in dystopian directions.
(The authors are grateful to Rebecca Roache for research assistance and to the editors of this volume for detailed comments on an earlier version of our manuscript.)
Yudkowsky, E. 2008a. "Artificial Intelligence as a Positive and Negative Factor in Global Risk," in Bostrom and Cirkovic (eds.), pp. 308-345. An introduction to the risks and challenges presented by the possibility of recursively self-improving superintelligent machines.

Wallach, W. and Allen, C. 2009. Moral Machines: Teaching Robots Right from Wrong (Oxford University Press). A comprehensive survey of recent developments.
References
Asimov, I. 1942. "Runaround," Astounding Science Fiction, March 1942.

Beauchamp, T. and Childress, J. Principles of Biomedical Ethics. Oxford: Oxford University Press.

Bostrom, N. 2002. "Existential Risks: Analyzing Human Extinction Scenarios," Journal of Evolution and Technology 9 (http://www.nickbostrom.com/existential/risks.html).

Bostrom, N. 2003. "Astronomical Waste: The Opportunity Cost of Delayed Technological Development," Utilitas 15: 308-314.

Bostrom, N. 2004. "The Future of Human Evolution," in Death and Anti-Death: Two Hundred Years After Kant, Fifty Years After Turing, ed. Charles Tandy (Palo Alto, California: Ria University Press) (http://www.nickbostrom.com/fut/evolution.pdf).

Bostrom, N. and Cirkovic, M. (eds.) 2007. Global Catastrophic Risks. Oxford: Oxford University Press.

Chalmers, D. J. 1996. The Conscious Mind: In Search of a Fundamental Theory. New York and Oxford: Oxford University Press.

Hirschfeld, L. A. and Gelman, S. A. (eds.) 1994. Mapping the Mind: Domain Specificity in Cognition and Culture. Cambridge: Cambridge University Press.

Goertzel, B. and Pennachin, C. (eds.) 2006. Artificial General Intelligence. New York, NY: Springer-Verlag.

Good, I. J. 1965. "Speculations Concerning the First Ultraintelligent Machine," in Alt, F. L. and Rubinoff, M. (eds.) Advances in Computers, 6. New York: Academic Press. Pp. 31-88.

Hastie, T., Tibshirani, R. and Friedman, J. 2001. The Elements of Statistical Learning. New York, NY: Springer Science.

Henley, K. 1993. "Abstract Principles, Mid-Level Principles, and the Rule of Law," Law and Philosophy 12: 121-32.

Hofstadter, D. 2006. "Trying to Muse Rationally about the Singularity Scenario," presented at the Singularity Summit at Stanford, 2006.

Howard, Philip K. 1994. The Death of Common Sense: How Law is Suffocating America. New York, NY: Warner Books.

Kamm, F. 2007. Intricate Ethics: Rights, Responsibilities, and Permissible Harm. Oxford: Oxford University Press.

Kurzweil, R. 2005. The Singularity Is Near: When Humans Transcend Biology. New York, NY: Viking.

McDermott, D. 1976. "Artificial intelligence meets natural stupidity," ACM SIGART Newsletter 57: 4-9.

Omohundro, S. 2008. "The Basic AI Drives," Proceedings of the AGI-08 Workshop. Amsterdam: IOS Press. Pp. 483-492.

Sandberg, A. 1999. "The Physics of Information Processing Superobjects: Daily Life Among the Jupiter Brains," Journal of Evolution and Technology 5.

Vinge, V. 1993. "The Coming Technological Singularity," presented at the VISION-21 Symposium, March 1993.

Warren, M. A. 2000. Moral Status: Obligations to Persons and Other Living Things. Oxford: Oxford University Press.

Yudkowsky, E. 2006. "AI as a Precise Art," presented at the 2006 AGI Workshop in Bethesda, MD.

Yudkowsky, E. 2008a. "Artificial Intelligence as a Positive and Negative Factor in Global Risk," in Bostrom and Cirkovic (eds.), pp. 308-345.

Yudkowsky, E. 2008b. "Cognitive biases potentially affecting judgment of global risks," in Bostrom and Cirkovic (eds.), pp. 91-119.