Plan 9 from Bell Labs

Rob Pike
Dave Presotto
Sean Dorward
Bob Flandrena
Ken Thompson
Howard Trickey
Phil Winterbottom
Bell Laboratories
Murray Hill, New Jersey 07974
USA
Motivation
By the mid 1980's, the trend in computing was away from large centralized time-shared
computers towards networks of smaller, personal machines, typically UNIX 'workstations'.
People had grown weary of overloaded, bureaucratic timesharing machines and
were eager to move to small, self-maintained systems, even if that meant a net loss in
computing power. As microcomputers became faster, even that loss was recovered, and
this style of computing remains popular today.
In the rush to personal workstations, though, some of their weaknesses were overlooked.
First, the operating system they run, UNIX, is itself an old timesharing system
and has had trouble adapting to ideas born after it. Graphics and networking were
added to UNIX well into its lifetime and remain poorly integrated and difficult to administer.
More important, the early focus on having private machines made it difficult for
networks of machines to serve as seamlessly as the old monolithic timesharing systems.
Timesharing centralized the management and amortization of costs and resources;
personal computing fractured, democratized, and ultimately amplified administrative
problems. The choice of an old timesharing operating system to run those personal
machines made it difficult to bind things together smoothly.
Plan 9 began in the late 1980's as an attempt to have it both ways: to build a system
that was centrally administered and cost-effective using cheap modern microcomputers
as its computing elements. The idea was to build a time-sharing system out of
workstations, but in a novel way. Different computers would handle different tasks:
small, cheap machines in people's offices would serve as terminals providing access to
large, central, shared resources such as computing servers and file servers. For the
central machines, the coming wave of shared-memory multiprocessors seemed obvious
candidates. The philosophy is much like that of the Cambridge Distributed System
[NeHe82]. The early catch phrase was to build a UNIX out of a lot of little systems, not a
system out of a lot of little UNIXes.
The problems with UNIX were too deep to fix, but some of its ideas could be
brought along. The best was its use of the file system to coordinate naming of and
access to resources, even those, such as devices, not traditionally treated as files. For
Plan 9, we adopted this idea by designing a network-level protocol, called 9P, to enable
machines to access files on remote systems. Above this, we built a naming system that
lets people and their computing agents build customized views of the resources in the
network. This is where Plan 9 first began to look different: a Plan 9 user builds a private
computing environment and recreates it wherever desired, rather than doing all computing
on a private machine. It soon became clear that this model was richer than we had
foreseen, and the ideas of per-process name spaces and file-system-like resources
were extended throughout the system: to processes, graphics, even the network itself.
__________________
Appeared in a slightly different form in Computing Systems, Vol 8 #3, Summer 1995, pp. 221-254.
By 1989 the system had become solid enough that some of us began using it as
our exclusive computing environment. This meant bringing along many of the services
and applications we had used on UNIX. We used this opportunity to revisit many issues,
not just kernel-resident ones, that we felt UNIX addressed badly. Plan 9 has new
compilers, languages, libraries, window systems, and many new applications. Many of the
old tools were dropped, while those brought along have been polished or rewritten.
Why be so all-encompassing? The distinction between operating system, library,
and application is important to the operating system researcher but uninteresting to the
user. What matters is clean functionality. By building a complete new system, we were
able to solve problems where we thought they should be solved. For example, there is
no real 'tty driver' in the kernel; that is the job of the window system. In the modern
world, multi-vendor and multi-architecture computing are essential, yet the usual
compilers and tools assume the program is being built to run locally; we needed to rethink
these issues. Most important, though, the test of a system is the computing environment
it provides. Producing a more efficient way to run the old UNIX warhorses is
empty engineering; we were more interested in whether the new ideas suggested by the
architecture of the underlying system encourage a more effective way of working. Thus,
although Plan 9 provides an emulation environment for running POSIX commands, it is a
backwater of the system. The vast majority of system software is developed in the
'native' Plan 9 environment.
There are benefits to having an all-new system. First, our laboratory has a history
of building experimental peripheral boards. To make it easy to write device drivers, we
want a system that is available in source form (no longer guaranteed with UNIX, even in
the laboratory in which it was born). Also, we want to redistribute our work, which
means the software must be locally produced. For example, we could have used some
vendors' C compilers for our system, but even had we overcome the problems with
cross-compilation, we would have difficulty redistributing the result.
This paper serves as an overview of the system. It discusses the architecture from
the lowest building blocks to the computing environment seen by users. It also serves
as an introduction to the rest of the Plan 9 Programmer's Manual, which it accompanies.
More detail about topics in this paper can be found elsewhere in the manual.
Design
The view of the system is built upon three principles. First, resources are named
and accessed like files in a hierarchical file system. Second, there is a standard
protocol, called 9P, for accessing these resources. Third, the disjoint hierarchies provided by
different services are joined together into a single private hierarchical file name space.
The unusual properties of Plan 9 stem from the consistent, aggressive application of
these principles.
A large Plan 9 installation has a number of computers networked together, each
providing a particular class of service. Shared multiprocessor servers provide computing
cycles; other large machines offer file storage. These machines are located in an
air-conditioned machine room and are connected by high-performance networks.
Lower bandwidth networks such as Ethernet or ISDN connect these servers to office-
and home-resident workstations or PCs, called terminals in Plan 9 terminology. Figure
1 shows the arrangement.
[Figure 1 diagram; labels: CPU, File, Internet, Gateway, Term, Ethernet, Fiber Network, Datakit]
Figure 1. Structure of a large Plan 9 installation. CPU servers and file servers share fast local-area
networks, while terminals use slower wider-area networks such as Ethernet, Datakit, or telephone
lines to connect to them. Gateway machines, which are just CPU servers connected to multiple
networks, allow machines on one network to see another.
The modern style of computing offers each user a dedicated workstation or PC.
Plan 9's approach is different. The various machines with screens, keyboards, and mice
all provide access to the resources of the network, so they are functionally equivalent, in
the manner of the terminals attached to old timesharing systems. When someone uses
the system, though, the terminal is temporarily personalized by that user. Instead of
customizing the hardware, Plan 9 offers the ability to customize one's view of the system
provided by the software. That customization is accomplished by giving local,
personal names for the publicly visible resources in the network. Plan 9 provides the
mechanism to assemble a personal view of the public space with local names for globally
accessible resources. Since the most important resources of the network are files, the
model of that view is file-oriented.
The client's local name space provides a way to customize the user's view of the
network. The services available in the network all export file hierarchies. Those
important to the user are gathered together into a custom name space; those of no immediate
interest are ignored. This is a different style of use from the idea of a 'uniform global
name space'. In Plan 9, there are known names for services and uniform names for files
exported by those services, but the view is entirely local. As an analogy, consider the
difference between the phrase 'my house' and the precise address of the speaker's
home. The latter may be used by anyone but the former is easier to say and makes
sense when spoken. It also changes meaning depending on who says it, yet that does
not cause confusion. Similarly, in Plan 9 the name /dev/cons always refers to the
user's terminal and /bin/date the correct version of the date command to run, but
which files those names represent depends on circumstances such as the architecture of
the machine executing date. Plan 9, then, has local name spaces that obey globally
understood conventions; it is the conventions that guarantee sane behavior in the
presence of local names.
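As a rough illustration of local names obeying a shared convention, the following Python sketch models a name space as a private table from conventional names to the resources they stand for. The class NameSpace, its bind and resolve methods, and the target directories /386/bin and /mips/bin are assumptions made for this example; it is a toy model, not a description of how the Plan 9 kernel represents name spaces.

```python
class NameSpace:
    """Toy private name space: a table mapping local names to targets."""

    def __init__(self):
        self._bindings = {}

    def bind(self, target, name):
        """Make `name` in this name space refer to `target`."""
        self._bindings[name] = target

    def resolve(self, path):
        """Rewrite `path` using the longest bound name that prefixes it."""
        best = None
        for name in self._bindings:
            if path == name or path.startswith(name + "/"):
                if best is None or len(name) > len(best):
                    best = name
        if best is None:
            return path  # no local binding: the name stands for itself
        return self._bindings[best] + path[len(best):]


# Two machines of different (hypothetical) architectures use the same name.
pc = NameSpace()
pc.bind("/386/bin", "/bin")
sgi = NameSpace()
sgi.bind("/mips/bin", "/bin")

print(pc.resolve("/bin/date"))   # /386/bin/date
print(sgi.resolve("/bin/date"))  # /mips/bin/date
```

Both users type /bin/date; only the private table decides which file that name denotes, which is the sense in which the convention is global while the view is local.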
The 9P protocol is structured as a set of transactions that send a request from a
client to a (local or remote) server and return the result. 9P controls file systems, not
just files: it includes procedures to resolve file names and traverse the name hierarchy
of the file system provided by the server. On the other hand, the client's name space is
held by the client system alone, not on or with the server, a distinction from systems
such as Sprite [OCDNW88]. Also, file access is at the level of bytes, not blocks, which
distinguishes 9P from protocols like NFS and RFS. A paper by Welch compares Sprite,
NFS, and Plan 9's network file system structures [Welc94].
This approach was designed with traditional files in mind, but can be extended to
many other resources. Plan 9 services that export file hierarchies include I/O devices,
backup services, the window system, network interfaces, and many others. One
example is the process file system, /proc, which provides a clean way to examine and
control running processes. Precursor systems had a similar idea [Kill84], but Plan 9 pushes
the file metaphor much further [PPTTW93]. The file system model is well-understood,
both by system builders and general users, so services that present file-like interfaces
are easy to build, easy to understand, and easy to use. Files come with agreed-upon
rules for protection, naming, and access both local and remote, so services built this
way are ready-made for a distributed system. (This is a distinction from 'object-oriented'
models, where these issues must be faced anew for every class of object.)
Examples in the sections that follow illustrate these ideas in action.
The Command-level View
Plan 9 is meant to be used from a machine with a screen running the window
system. It has no notion of 'teletype' in the UNIX sense. The keyboard handling of the
bare system is rudimentary, but once the window system, 8½ [Pike91], is running, text
can be edited with 'cut and paste' operations from a pop-up menu, copied between
windows, and so on. 8½ permits editing text from the past, not just on the current input
line. The text-editing capabilities of 8½ are strong enough to displace special features
such as history in the shell, paging and scrolling, and mail editors. 8½ windows do not
support cursor addressing and, except for one terminal emulator to simplify connecting
to traditional systems, there is no cursor-addressing software in Plan 9.
Each window is created in a separate name space. Adjustments made to the name
space in a window do not affect other windows or programs, making it safe to
experiment with local modifications to the name space, for example to substitute files from
the dump file system when debugging. Once the debugging is done, the window can be
deleted and all trace of the experimental apparatus is gone. Similar arguments apply to
the private space each window has for environment variables, notes (analogous to UNIX
signals), etc.
Each window is created running an application, such as the shell, with standard
input and output connected to the editable text of the window. Each window also has a
private bitmap and multiplexed access to the keyboard, mouse, and other graphical
resources through files like /dev/mouse, /dev/bitblt, and /dev/cons (analogous
to UNIX's /dev/tty). These files are provided by 8½, which is implemented as a
file server. Unlike X windows, where a new application typically creates a new window to
run in, an 8½ graphics application usually runs in the window where it starts. It is
possible and efficient for an application to create a new window, but that is not the style of
the system. Again contrasting to X, in which a remote application makes a network call
to the X server to start running, a remote 8½ application sees the mouse, bitblt, and
cons files for the window as usual in /dev; it does not know whether the files are
local. It just reads and writes them to control the window; the network connection is
already there and multiplexed.
The intended style of use is to run interactive applications such as the window
system and text editor on the terminal and to run computation- or file-intensive
applications on remote servers. Different windows may be running programs on different
machines over different networks, but by making the name space equivalent in all
windows, this is transparent: the same commands and resources are available, with the
same names, wherever the computation is performed.
The command set of Plan 9 is similar to that of UNIX. The commands fall into
several broad classes. Some are new programs for old jobs: programs like ls, cat, and
who have familiar names and functions but are new, simpler implementations. Who, for
example, is a shell script, while ps is just 95 lines of C code. Some commands are
essentially the same as their UNIX ancestors: awk, troff, and others have been
converted to ANSI C and extended to handle Unicode, but are still the familiar tools. Some
are entirely new programs for old niches: the shell rc, text editor sam, debugger
acid, and others displace the better-known UNIX tools with similar jobs. Finally, about
half the commands are new.
Compatibility was not a requirement for the system. Where the old commands or
notation seemed good enough, we kept them. When they didn't, we replaced them.
The File Server
A central file server stores permanent files and presents them to the network as a
file hierarchy exported using 9P. The server is a stand-alone system, accessible only
over the network, designed to do its one job well. It runs no user processes, only a
fixed set of routines compiled into the boot image. Rather than a set of disks or
separate file systems, the main hierarchy exported by the server is a single tree,
representing files on many disks. That hierarchy is shared by many users over a wide area on a
variety of networks. Other file trees exported by the server include special-purpose
systems such as temporary storage and, as explained below, a backup service.
The file server has three levels of storage. The central server in our installation has
about 100 megabytes of memory buffers, 27 gigabytes of magnetic disks, and 350
gigabytes of bulk storage in a write-once-read-many (WORM) jukebox. The disk is a cache
for the WORM and the memory is a cache for the disk; each is much faster, and sees
about an order of magnitude more traffic, than the level it caches. The addressable data
in the file system can be larger than the size of the magnetic disks, because they are
only a cache; our main file server has about 40 gigabytes of active storage.
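The layering described above, in which each level caches the slower one beneath it, can be sketched as a chain of read-through caches. The Level class and its fields are hypothetical names chosen for this illustration; the sketch ignores writes, eviction, and block sizes, and says nothing about how the real file server is implemented.

```python
class Level:
    """One storage level that caches the slower level beneath it."""

    def __init__(self, name, backing=None):
        self.name = name
        self.backing = backing  # slower level, or None for the bottom level
        self.blocks = {}        # block address -> data held at this level
        self.reads = 0          # number of requests that reached this level

    def read(self, addr):
        self.reads += 1
        if addr not in self.blocks:
            if self.backing is None:
                raise KeyError(addr)
            # Miss: fetch from the slower level and keep a copy here.
            self.blocks[addr] = self.backing.read(addr)
        return self.blocks[addr]


worm = Level("worm")
worm.blocks = {n: b"block %d" % n for n in range(4)}
disk = Level("disk", backing=worm)
memory = Level("memory", backing=disk)

for _ in range(10):
    memory.read(2)

print(memory.reads, disk.reads, worm.reads)  # 10 1 1
```

Repeated reads of the same block are absorbed by the fastest level, which is why each level sees far more traffic than the one it caches.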
The most unusual feature of the file server comes from its use of a WORM device
for stable storage. Every morning at 5 o'clock, a dump of the file system occurs
automatically. The file system is frozen and all blocks modified since the last dump are
queued to be written to the WORM. Once the blocks are queued, service is restored and
the read-only root of the dumped file system appears in a hierarchy of all dumps ever
taken, named by its date. For example, the directory /n/dump/1995/0315 is the
root directory of an image of the file system as it appeared in the early morning of
March 15, 1995. It takes a few minutes to queue the blocks, but the process to copy
blocks to the WORM, which runs in the background, may take hours.
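The date-based naming can be expressed in a few lines. The sketch below is an illustration in Python, assuming one dump per day and the /n/dump/YYYY/MMDD layout shown in the example above; the function names dump_root and in_dump, and the file path used in the demonstration, are hypothetical.

```python
from datetime import date


def dump_root(day, prefix="/n/dump"):
    """Return the name of the dump taken on `day` (one dump per day assumed)."""
    return "%s/%04d/%02d%02d" % (prefix, day.year, day.month, day.day)


def in_dump(day, path, prefix="/n/dump"):
    """Return the name of `path` as it appeared in the dump of `day`."""
    return dump_root(day, prefix) + path


print(dump_root(date(1995, 3, 15)))             # /n/dump/1995/0315
print(in_dump(date(1995, 3, 15), "/lib/words"))  # /n/dump/1995/0315/lib/words
```

Because a dump is an ordinary directory tree, a name built this way can be handed to any standard tool.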
There are two ways the dump file system is used. The first is by the users
themselves, who can browse the dump file system directly or attach pieces of it to their name
space. For example, to track down a bug, it is straightforward to try the compiler from
three months ago or to link a program with yesterday's library. With daily snapshots of
all files, it is easy to find when a particular change was made or what changes were
made on a particular date. People feel free to make large speculative changes to files in
the knowledge that they can be backed out with a single copy command. There is no
backup system as such; instead, because the dump is in the file name space, backup
problems can be solved with standard tools such as cp, ls, grep, and diff.
The other (very rare) use is complete system backup. In the event of disaster, the
active file system can be initialized from any dump by clearing the disk cache and
setting the root of the active file system to be a copy of the dumped root. Although easy to
do, this is not to be taken lightly: besides losing any change made after the date of the
dump, this recovery method results in a very slow system. The cache must be reloaded
from WORM, which is much slower than magnetic disks. The file system takes a few
days to reload the working set and regain its full performance.
Access permissions of files in the dump are the same as they were when the dump
was made. Normal utilities have normal permissions in the dump without any special
arrangement. The dump file system is read-only, though, which means that files in the
dump cannot be written regardless of their permission bits; in fact, since directories are
part of the read-only structure, even the permissions cannot be changed.
Once a file is written to WORM, it cannot be removed, so our users never see
"please clean up your files" messages and there is no df command. We regard the
WORM jukebox as an unlimited resource. The only issue is how long it will take to fill.
Our WORM has served a community of about 50 users for five years and has absorbed
daily dumps, consuming a total of 65% of the storage in the jukebox. In that time, the
manufacturer has improved the technology, doubling the capacity of the individual
disks. If we were to upgrade to the new media, we would have more free space than in
the original empty jukebox. Technology has created storage faster than we can use it.
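The arithmetic behind that claim can be checked directly. The sketch below takes the 65% and the doubling from the text and assumes, for illustration, that the upgraded jukebox holds the same number of disks and that the existing data is carried over unchanged; the variable names are ours.

```python
original_capacity = 100  # the original empty jukebox, in percent units
used = 65                # consumed after five years of daily dumps (from the text)

upgraded_capacity = 2 * original_capacity  # each disk holds twice as much
free_after_upgrade = upgraded_capacity - used

print(free_after_upgrade)                      # 135
print(free_after_upgrade > original_capacity)  # True
```

Under these assumptions the upgraded jukebox would have 135% of the original capacity free, more than the jukebox offered when empty.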
Unusual file servers
Plan 9 is characterized by a variety of servers that offer a file-like interface to
unusual services. Many of these are implemented by user-level processes, although the
distinction is unimportant to their clients; whether a service is provided by the kernel, a
user process, or a remote server is irrelevant to the way it is used. There are dozens of
such servers; in this section we present three representative ones.
Perhaps the most remarkable file server in Plan 9 is 8½, the window system. It is
discussed at length elsewhere [Pike91], but deserves a brief explanation here. 8½
provides two interfaces: to the user seated at the terminal, it offers a traditional style of
interaction with multiple windows, each running an application, all controlled by a
mouse and keyboard. To the client programs, the view is also fairly traditional:
programs running in a window see a set of files in /dev with names like mouse, screen,
and cons. Programs that want to print text to their window write to /dev/cons; to
read the mouse, they read /dev/mouse. In the Plan 9 style, bitmap graphics is
implemented by providing a file /dev/bitblt on which clients write encoded messages to
execute graphical operations such as bitblt (RasterOp). What is unusual is how this
is done: 8½ is a file server, serving the files in /dev to the clients running in each
window. Although every window looks the same to its client, each window has a distinct set
of files in /dev. 8½ multiplexes its clients' access to the resources of the terminal by
serving multiple sets of files. Each client is given a private name space with a different
set of files that behave the same as in all other windows. There are many advantages to
this structure. One is that 8½ serves the same files it needs for its own
implementation (it multiplexes its own interface), so it may be run, recursively, as a
client of itself. Also, consider the implementation of /dev/tty in UNIX, which
requires special code in the kernel to redirect open calls to the appropriate device.
Instead, in 8½ the equivalent service falls out automatically: 8½ serves /dev/cons as
its basic function; there is nothing extra to do. When a program wants to read from the
keyboard, it opens /dev/cons, but it is a private file, not a shared one with special
properties. Again, local name spaces make this possible; conventions about the
consistency of the files within them make it natural.
8½ has a unique feature made possible by its design. Because it is implemented as
a file server, it has the power to postpone answering read requests for a particular
window. This behavior is toggled by a reserved key on the keyboard. Toggling once
suspends client reads from the window; toggling again resumes normal reads, which
absorb whatever text has been prepared, one line at a time. This allows the user to edit
multi-line input text on the screen before the application sees it, obviating the need to
invoke a separate editor to prepare text such as mail messages. A related property is
that reads are answered directly from the data structure defining the text on the display:
text may be edited until its final newline makes the prepared line of text readable by the
client. Even then, until the line is read, the text the client will read can be changed. For
example, after typing
% make
rm *
to the shell, the user can backspace over the final newline at any time until make
finishes, holding off execution of the rm command, or even point with the mouse before
the rm and type another command to be executed first.
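The behavior in this example can be mimicked with a small model in which reads are answered from the unread text itself, one complete line at a time. The Window class and its method names are hypothetical, and where the real system would leave the client's read pending, this sketch simply returns None.

```python
class Window:
    """Toy model of a window whose unread text remains editable."""

    def __init__(self):
        self.pending = ""  # typed text the client has not yet read
        self.hold = False  # True while reads are being postponed

    def type_text(self, text):
        self.pending += text

    def backspace(self):
        # Editing applies to any unread text, even past a newline.
        self.pending = self.pending[:-1]

    def toggle_hold(self):
        self.hold = not self.hold

    def read(self):
        """Return the next complete line, or None if the read must wait."""
        if self.hold or "\n" not in self.pending:
            return None
        line, _, self.pending = self.pending.partition("\n")
        return line + "\n"


w = Window()
w.type_text("make\nrm *\n")
first = w.read()   # the shell reads and runs make
w.backspace()      # user removes the final newline while make runs
second = w.read()  # 'rm *' is no longer a complete line

print(repr(first))   # 'make\n'
print(repr(second))  # None
```

Typing the newline again would make the line readable once more; nothing is committed until the client actually reads it.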
There is no ftp command in Plan 9. Instead, a user-level file server called ftpfs
dials the FTP site, logs in on behalf of the user, and uses the FTP protocol to examine
files in the remote directory. To the local user, it offers a file hierarchy, attached to
/n/ftp in the local name space, mirroring the contents of the FTP site. In other
words, it translates the FTP protocol into 9P to offer Plan 9 access to FTP sites. The
implementation is tricky; ftpfs must do some sophisticated caching for efficiency and
use heuristics to decode remote directory information. But the result is worthwhile: all
the local file management tools such as cp, grep, diff, and of course ls are
available to FTP-served files exactly as if they were local files. Other systems such as Jade
and Prospero have exploited the same opportunity [Rao81, Neu92], but because of local
name spaces and the simplicity of implementing 9P, this approach fits more naturally
into Plan 9 than into other environments.
One server, exportfs, is a user process that takes a portion of its own name
space and makes it available to other processes by translating 9P requests into system
calls to the Plan 9 kernel. The file hierarchy it exports may contain files from multiple
servers. Exportfs is usually run as a remote server started by a local program, either
import or cpu. Import makes a network call to the remote machine, starts
exportfs there, and attaches its 9P connection to the local name space. For example,
import helix /net
makes Helix's network interfaces visible in the local /net directory. Helix is a central
server and has many network interfaces, so this permits a machine with one network to
access any of Helix's networks. After such an import, the local machine may make
calls on any of the networks connected to Helix. Another example is
import helix /proc
which makes Helix's processes visible in the local /proc, permitting local debuggers to
examine remote processes.
The cpu command connects the local terminal to a remote CPU server. It works in
the opposite direction to import: after calling the server, it starts a local exportfs
and mounts it in the name space of a process, typically a newly created shell, on the
server. It then rearranges the name space to make local device files (such as those
served by the terminal's window system) visible in the server's /dev directory. The
effect of running a cpu command is therefore to start a shell on a fast machine, one
more tightly coupled to the file server, with a name space analogous to the local one.
All local device files are visible remotely, so remote applications have full access to local
services such as bitmap graphics, /dev/cons, and so on. This is not the same as
rlogin, which does nothing to reproduce the local name space on the remote system,
nor is it the same as file sharing with, say, NFS, which can achieve some name space
equivalence but not the combination of access to local hardware devices, remote files,
and remote CPU resources. The cpu command is a uniquely transparent mechanism.
For example, it is reasonable to start a window system in a window running a cpu
command; all windows created there automatically start processes on the CPU server.
Configurability and administration
The uniform interconnection of components in Plan 9 makes it possible to
configure a Plan 9 installation many different ways. A single laptop PC can function as a
stand-alone Plan 9 system; at the other extreme, our setup has central multiprocessor
CPU servers and file servers and scores of terminals ranging from small PCs to high-end
graphics workstations. It is such large installations that best represent how Plan 9
operates.
The system software is portable and the same operating system runs on all
hardware. Except for performance, the appearance of the system on, say, an SGI workstation
is the same as on a laptop. Since computing and file services are centralized, and
terminals have no permanent file storage, all terminals are functionally identical. In this way,
Plan 9 has one of the good properties of old timesharing systems, where a user could sit
in front of any machine and see the same system. In the modern workstation
community, machines tend to be owned by people who customize them by storing private
information on local disk. We reject this style of use, although the system itself can be used
this way. In our group, we have a laboratory with many public-access machines (a
terminal room), and a user may sit down at any one of them and work.
Central file servers centralize not just the files, but also their administration and
maintenance. In fact, one server is the main server, holding all system files; other
servers provide extra storage or are available for debugging and other special uses, but
the system software resides on one machine. This means that each program has a
single copy of the binary for each architecture, so it is trivial to install updates and bug
fixes. There is also a single user database; there is no need to synchronize distinct
/etc/passwd files. On the other hand, depending on a single central server does
limit the size of an installation.
Another example of the power of centralized file service is the way Plan 9
administers network information. On the central server there is a directory, /lib/ndb, that
contains all the information necessary to administer the local Ethernet and other
networks. All the machines use the same database to talk to the network; there is no need
to manage a distributed naming system or keep parallel files up to date. To install a
new machine on the local Ethernet, choose a name and IP address and add these to a
single file in /lib/ndb; all the machines in the installation will be able to talk to it
immediately. To start running, plug the machine into the network, turn it on, and use
BOOTP and TFTP to load the kernel. All else is automatic.
Finally, the automated dump file system frees all users from the need to maintain
their systems, while providing easy access to backup files without tapes, special
commands, or the involvement of support staff. It is difficult to overstate the improvement
in lifestyle afforded by this service.
Plan 9 runs on a variety of hardware without constraining how to configure an
installation. In our laboratory, we chose to use central servers because they amortize
costs and administration. A sign that this is a good decision is that our cheap terminals
remain comfortable places to work for about five years, much longer than workstations
that must provide the complete computing environment. We do, however, upgrade the
central machines, so the computation available from even old Plan 9 terminals improves
with time. The money saved by avoiding regular upgrades of terminals is instead spent
on the newest, fastest multiprocessor servers. We estimate this costs about half the
money of networked workstations yet provides general access to more powerful
machines.
C Programming
Plan 9 utilities are written in several languages. Some are scripts for the shell, rc
[Duff90]; a handful are written in a new C-like concurrent language called Alef [Wint95],
described below. The great majority, though, are written in a dialect of ANSI C [ANSIC].
Of these, most are entirely new programs, but some originate in pre-ANSI C code from
our research UNIX system [UNIX85]. These have been updated to ANSI C and reworked
for portability and cleanliness.
The Plan 9 C dialect has some minor extensions, described elsewhere [Pike95], and
a few major restrictions. The most important restriction is that the compiler demands
that all function definitions have ANSI prototypes and all function calls appear in the
scope of a prototyped declaration of the function. As a stylistic rule, the prototyped
declaration is placed in a header file included by all files that call the function. Each
system library has an associated header file, declaring all functions in that library. For
example, the standard Plan 9 library is called libc, so all C source files include
<libc.h>. These rules guarantee that all functions are called with arguments having
the expected types, something that was not true with pre-ANSI C programs.
Another restriction is that the C compilers accept only a subset of the preprocessor
directives required by ANSI. The main omission is #if, since we believe it is never
necessary and often abused. Also, its effect is better achieved by other means. For
instance, an #if used to toggle a feature at compile time can be written as a regular if
statement, relying on compile-time constant folding and dead code elimination to
discard object code.
Conditional compilation, even with #ifdef, is used sparingly in Plan 9. The only
architecture-dependent #ifdefs in the system are in low-level routines in the
graphics library. Instead, we avoid such dependencies or, when necessary, isolate them in
separate source files or libraries. Besides making code hard to read, #ifdefs make it
impossible to know what source is compiled into the binary or whether source protected
by them will compile or work properly. They make it harder to maintain software.
The standard Plan 9 library overlaps much of ANSI C and POSIX [POSIX], but
diverges when appropriate to Plan 9's goals or implementation. When the semantics of
a function change, we also change the name. For instance, instead of UNIX's creat,
Plan 9 has a create function that takes three arguments, the original two plus a third
that, like the second argument of open, defines whether the returned file descriptor is
to be opened for reading, writing, or both. This design was forced by the way 9P
implements creation, but it also simplifies the common use of create to initialize a
temporary file.
Another departure from ANSI C is that Plan 9 uses a 16-bit character set called
Unicode [ISO10646, Unicode]. Although we stopped short of full internationalization, Plan
9 treats the representation of all major languages uniformly throughout all its software.
To simplify the exchange of text between programs, the characters are packed into a
byte stream by an encoding we designed, called UTF-8, which is now becoming
accepted as a standard [FSSUTF]. It has several attractive properties, including
byte-order independence, backwards compatibility with ASCII, and ease of implementation.
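The packing of 16-bit character values into bytes can be sketched in a few lines. This is a minimal illustration, not the Plan 9 library code: the function names are hypothetical, the sketch is in Python rather than C, and the bit layout is the standard published UTF-8 layout for values up to 16 bits, which this paper does not spell out.

```python
def utf8_encode(code_point):
    """Pack one 16-bit character value into one, two, or three bytes."""
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("this sketch handles 16-bit values only")
    if code_point < 0x80:
        # Seven bits or fewer: a single byte, identical to ASCII.
        return bytes([code_point])
    if code_point < 0x800:
        # Up to 11 bits: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    # Up to 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
    return bytes([0xE0 | (code_point >> 12),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def utf8_encode_text(code_points):
    """Encode a sequence of character values as one byte stream."""
    return b"".join(utf8_encode(c) for c in code_points)
```

Because the encoder emits bytes one at a time in a fixed order, the result does not depend on the byte order of the machine, and any value below 0x80 is encoded as the same single byte it has in ASCII.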
There are many problems in adapting existing software to a large character set
with an encoding that represents characters with a variable number of bytes. ANSI C
addresses some of the issues but falls short of solving them all. It does not pick a
character set encoding and does not define all the necessary I/O library routines.
Furthermore, the functions it does define have engineering problems. Since the
standard left too many problems unsolved, we decided to build our own interface. A
separate paper has the details [Pike93].
A small class of Plan 9 programs do not follow the conventions discussed in this
section. These are programs imported from and maintained by the UNIX community;
tex is a representative example. To avoid reconverting such programs every time a
new version is released, we built a porting environment, called the ANSI C/POSIX
Environment, or APE [Tric95]. APE comprises separate include files, libraries, and
commands, conforming as much as possible to the strict ANSI C and base-level POSIX
specifications. To port network-based software such as X Windows, it was necessary to add
some extensions to those specifications, such as the BSD networking functions.
Portability and Compilation
Plan 9 is portable across a variety of processor architectures. Within a single
computing session, it is common to use several architectures: perhaps the window system
running on an Intel processor connected to a MIPS-based CPU server with files resident
on a SPARC system. For this heterogeneity to be transparent, there must be conventions
about data interchange between programs; for software maintenance to be
straightforward, there must be conventions about cross-architecture compilation.
To avoid byte order problems, data is communicated between programs as text
whenever practical. Sometimes, though, the amount of data is high enough that a
binary format is necessary; such data is communicated as a byte stream with a
predefined encoding for multi-byte values. In the rare cases where a format is complex
enough to be defined by a data structure, the structure is never communicated as a unit;
instead, it is decomposed into individual fields, encoded as an ordered byte stream, and
then reassembled by the recipient. These conventions affect data ranging from kernel
or application program state information to object file intermediates generated by the
compiler.
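The field-by-field convention can be illustrated with a small sketch. Everything specific here is an assumption made for the example: the record layout (a process id, a state code, and a name), the helper names, and the choice of low-byte-first order for 32-bit values. The point is only that each field is written byte by byte in a fixed order, so the stream means the same thing on every architecture.

```python
def put_u32(value):
    """Encode a 32-bit unsigned value as four bytes, low byte first."""
    return bytes((value >> shift) & 0xFF for shift in (0, 8, 16, 24))


def get_u32(data, offset):
    """Decode four bytes, low byte first, starting at offset."""
    return sum(data[offset + i] << (8 * i) for i in range(4))


def pack_record(pid, state, name):
    """Decompose a record into fields and emit them in a fixed order."""
    raw = name.encode("utf-8")
    return put_u32(pid) + put_u32(state) + put_u32(len(raw)) + raw


def unpack_record(data):
    """Reassemble the record on the receiving side."""
    pid = get_u32(data, 0)
    state = get_u32(data, 4)
    length = get_u32(data, 8)
    name = data[12:12 + length].decode("utf-8")
    return pid, state, name
```

Neither side ever copies an in-memory structure to the wire, so padding, alignment, and native byte order cannot leak into the format.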
Programs, including the kernel, often present their data through a file system
interface, an access mechanism that is inherently portable. For example, the system clock is
represented by a decimal number in the file /dev/time; the time library function
(there is no time system call) reads the file and converts it to binary. Similarly, instead
of encoding the state of an application process in a series of flags and bits in private
memory, the kernel presents a text string in the file named status in the /proc file
system associated with each process. The Plan 9 ps command is trivial: it prints the
contents of the desired status files after some minor reformatting; moreover, after
import helix /proc
a local ps command reports on the status of Helix's processes.
Each supported architecture has its own compilers and loader. The C and Alef
compilers produce intermediate files that are portably encoded; the contents are unique
to the target architecture but the format of the file is independent of compiling
processor type. When a compiler for a given architecture is compiled on another type of
processor and then used to compile a program there, the intermediate produced on the
new architecture is identical to the intermediate produced on the native processor.
From the compiler's point of view, every compilation is a cross-compilation.
Although each architecture's loader accepts only intermediate files produced by
compilers for that architecture, such files could have been generated by a compiler
executing on any type of processor. For instance, it is possible to run the MIPS compiler on
a 486, then use the MIPS loader on a SPARC to produce a MIPS executable.
Since Plan 9 runs on a variety of architectures, even in a single installation,
distinguishing the compilers and intermediate names simplifies multi-architecture
development from a single source tree. The compilers and the loader for each
architecture are uniquely named; there is no cc command. The names are derived by
concatenating a code letter associated with the target architecture with the name of the
compiler or loader. For example, the letter '8' is the code letter for Intel x86 processors; the
C compiler is named 8c, the Alef compiler 8al, and the loader is called 8l. Similarly,
the compiler intermediate files are suffixed .8, not .o.
The Plan 9 build program mk, a relative of make, reads the names of the current
and target architectures from environment variables called $cputype and $objtype.
By default the current processor is the target, but setting $objtype to the name of
another architecture before invoking mk results in a cross-build:
% objtype=sparc mk
builds a program for the SPARC architecture regardless of the executing machine. The
value of $objtype selects a file of architecture-dependent variable definitions that
configures the build to use the appropriate compilers and loader. Although
simple-minded, this technique works well in practice: all applications in Plan 9 are built from a
single source tree and it is possible to build the various architectures in parallel without
conflict.
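The naming rule and the choice of target can be summarized in a short sketch. It is a model of the convention only, not of mk itself: the function names and dictionary keys are hypothetical, and "mips" is used merely as an example value for $cputype.

```python
def tool_names(code_letter):
    """Derive tool and file names from an architecture's code letter."""
    return {
        "c_compiler": code_letter + "c",
        "alef_compiler": code_letter + "al",
        "loader": code_letter + "l",
        "object_suffix": "." + code_letter,
    }


def target_architecture(environment):
    """Use $objtype when it is set, otherwise build for the current machine."""
    return environment.get("objtype") or environment["cputype"]
```

With the code letter '8', the sketch yields 8c, 8al, 8l, and the suffix .8, matching the example in the text. Since object files for different targets carry different suffixes, builds for several architectures can share one source directory.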
Parallel programming
Plan 9's support for parallel programming has two aspects. First, the kernel
provides a simple process model and a few carefully designed system calls for
synchronization and sharing. Second, a new parallel programming language called Alef supports
concurrent programming. Although it is possible to write parallel programs in C, Alef is
the parallel language of choice.
There is a trend in new operating systems to implement two classes of processes:
normal UNIX-style processes and light-weight kernel threads. Instead, Plan 9 provides
a single class of process but allows fine control of the sharing of a process's resources
such as memory and file descriptors. A single class of process is a feasible approach in
Plan 9 because the kernel has an efficient system call interface and cheap process
creation and scheduling.
Parallel programs have three basic requirements: management of resources shared
between processes, an interface to the scheduler, and fine-grain process
synchronization using spin locks. On Plan 9, new processes are created using the rfork system
call. Rfork takes a single argument, a bit vector that specifies which of the parent
process's resources should be shared, copied, or created anew in the child. The
resources controlled by rfork include the name space, the environment, the file
descriptor table, memory segments, and notes (Plan 9's analog of UNIX signals). One of
the bits controls whether the rfork call will create a new process; if the bit is off, the
resulting modification to the resources occurs in the process making the call. For
example, a process calls rfork(RFNAMEG) to disconnect its name space from its parent's.
Alef uses a fine-grained fork in which all the resources, including memory, are shared
between parent and child, analogous to creating a kernel thread in many systems.
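The bit-vector interface can be modeled in miniature. The sketch below is a toy, not the kernel's rfork: only RFNAMEG is named in the text above, while RFPROC and RFMEM, all numeric values, the Proc class, and the restriction to two resources (name space and memory) are assumptions made for illustration.

```python
RFPROC = 1 << 0    # assumed value: create a new process
RFNAMEG = 1 << 1   # assumed value: give the result its own copy of the name space
RFMEM = 1 << 2     # assumed value: share memory with the parent


class Proc:
    def __init__(self, namespace, memory):
        self.namespace = namespace   # mount point -> what is bound there
        self.memory = memory         # variable name -> value


def rfork(proc, flags):
    """Return the process whose resources were set up according to flags."""
    namespace = dict(proc.namespace) if flags & RFNAMEG else proc.namespace
    if not flags & RFPROC:
        # No new process: the change applies to the caller itself.
        proc.namespace = namespace
        return proc
    memory = proc.memory if flags & RFMEM else dict(proc.memory)
    return Proc(namespace, memory)
```

In this model, rfork(p, RFPROC | RFMEM) yields a child that shares both resources with its parent, much like a thread, while a later rfork(child, RFNAMEG) creates no process and simply gives the caller a private name space.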
An indication that rfork is the right model is the variety of ways it is used. Other
than the canonical use in the library routine fork, it is hard to find two calls to rfork
with the same bits set; programs use it to create many different forms of sharing and
resource allocation. A system with just two types of processes (regular processes and
threads) could not handle this variety.
There are two ways to share memory. First, a flag to rfork causes all the memory
segments of the parent to be shared with the child (except the stack, which is forked
copy-on-write regardless). Alternatively, a new segment of memory may be attached
using the segattach system call; such a segment will always be shared between
parent and child.
The rendezvous system call provides a way for processes to synchronize. Alef
uses it to implement communication channels, queuing locks, multiple reader/writer
locks, and the sleep and wakeup mechanism. Rendezvous takes two arguments, a
tag and a value. When a process calls rendezvous with a tag it sleeps until another
process presents a matching tag. When a pair of tags match, the values are exchanged
between the two processes and both rendezvous calls return. This primitive is
sufficient to implement the full set of synchronization routines.
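The exchange semantics of rendezvous can be imitated at user level. The following sketch uses Python threads and a lock in place of Plan 9 processes and the kernel, so it shows the behavior described above rather than how the system call is implemented; the class name and its internals are hypothetical.

```python
import threading


class Rendezvous:
    def __init__(self):
        self._lock = threading.Lock()
        self._waiting = {}   # tag -> slot left by the first caller to arrive

    def rendezvous(self, tag, value):
        """Block until a partner presents the same tag, then swap values."""
        with self._lock:
            slot = self._waiting.pop(tag, None)
            if slot is None:
                # First to arrive: leave our value and prepare to sleep.
                slot = {"value": value, "ready": threading.Event()}
                self._waiting[tag] = slot
                arrived_first = True
            else:
                # Second to arrive: take their value, leave ours, wake them.
                arrived_first = False
                theirs = slot["value"]
                slot["value"] = value
                slot["ready"].set()
        if arrived_first:
            slot["ready"].wait()
            return slot["value"]
        return theirs
```

Each caller returns the value supplied by its partner, whichever of the two arrives first. A lock, for example, could be built on top by using the lock's address as the tag, though that construction is not shown here.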
Finally, spin locks are provided by an architecture-dependent library at user level.
Most processors provide atomic test and set instructions that can be used to implement
locks. A notable exception is the MIPS R3000, so the SGI Power series multiprocessors
have special lock hardware on the bus. User processes gain access to the lock hardware
by mapping pages of hardware locks into their address space using the segattach
system call.
A Plan 9 process in a system call will block regardless of its 'weight'. This means
that when a program wishes to read from a slow device without blocking the entire
calculation, it must fork a process to do the read for it. The solution is to start a satellite
process that does the I/O and delivers the answer to the main program through shared
memory or perhaps a pipe. This sounds onerous but works easily and efficiently in
practice; in fact, most interactive Plan 9 applications, even relatively ordinary ones
written in C, such as the text editor Sam [Pike87], run as multiprocess programs.
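The satellite pattern looks roughly like this when transcribed to Python. A thread and a queue stand in for the Plan 9 process and the pipe or shared memory, so this is an analogy under those assumptions rather than Plan 9 code; start_satellite and its parameters are hypothetical names.

```python
import queue
import threading


def start_satellite(blocking_read, results):
    """Run blocking_read repeatedly in a helper and queue what it returns."""
    def run():
        while True:
            data = blocking_read()
            results.put(data)
            if not data:        # empty read: end of input, stop the helper
                return
    helper = threading.Thread(target=run, daemon=True)
    helper.start()
    return helper
```

The main program keeps computing and polls or waits on the queue when it is ready for input; only the helper is ever blocked in the slow read.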
The kernel support for parallel programming in Plan 9 is a few hundred lines of
portable code; a handful of simple primitives enable the problems to be handled cleanly
at user level. Although the primitives work fine from C, they are particularly expressive
from within Alef. The creation and management of slave I/O processes can be written in
a few lines of Alef, providing the foundation for a consistent means of multiplexing data
flows between arbitrary processes. Moreover, implementing it in a language rather than
in the kernel ensures consistent semantics between all devices and provides a more
general multiplexing primitive. Compare this to the UNIX select system call: select
applies only to a restricted set of devices, legislates a style of multiprogramming in the
kernel, does not extend across networks, is difficult to implement, and is hard to use.
Another reason parallel programming is important in Plan 9 is that multi-threaded
user-level file servers are the preferred way to implement services. Examples of such
servers include the programming environment Acme [Pike94], the name space exporting
tool exportfs [PPTTW93], the HTTP daemon, and the network name servers cs and
dns [PrWi93]. Complex applications such as Acme prove that careful operating system
support can reduce the difficulty of writing multi-threaded applications without moving
threading and synchronization primitives into the kernel.
Implementation of Name Spaces
User processes construct name spaces using three system calls: mount, bind,
and unmount. The mount system call attaches a tree served by a file server to the
current name space. Before calling mount, the client must (by outside means) acquire
a connection to the server in the form of a file descriptor that may be written and read
to transmit 9P messages. That file descriptor represents a pipe or network connection.
The mount call attaches a new hierarchy to the existing name space. The bind
system call, on the other hand, duplicates some piece of existing name space at another
point in the name space. The unmount system call allows components to be removed.
Using either bind or mount, multiple directories may be stacked at a single point
in the name space. In Plan 9 terminology, this is a union directory and behaves like the
concatenation of the constituent directories. A flag argument to bind and mount
specifies the position of a new directory in the union, permitting new elements to be
added either at the front or rear of the union or to replace it entirely. When a file lookup
is performed in a union directory, each component of the union is searched in turn and
the first match taken; likewise, when a union directory is read, the contents of each of
the component directories are read in turn. Union directories are one of the most widely
used organizational features of the Plan 9 name space. For instance, the directory
/bin is built as a union of /$cputype/bin (program binaries), /rc/bin (shell
scripts), and perhaps more directories provided by the user. This construction makes
the shell $PATH variable unnecessary.
One question raised by union directories is which element of the union receives a
newly created file. After several designs, we decided on the following. By default,
directories in unions do not accept new files, although the create system call applied to an
existing file succeeds normally. When a directory is added to the union, a flag to bind
or mount enables create permission (a property of the name space) in that directory.
When a file is being created with a new name in a union, it is created in the first
directory of the union with create permission; if that creation fails, the entire create fails.
This scheme enables the common use of placing a private directory anywhere in a union
of public ones, while allowing creation only in the private directory.
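The lookup, read, and create rules for union directories fit in a small model. In this sketch each component directory is a plain dictionary from file name to contents; the class, its method names, and the "before", "after", and "replace" position strings are hypothetical stand-ins for the flags to bind and mount.

```python
class UnionDir:
    def __init__(self):
        self.members = []    # ordered list of (directory, allows_create)

    def bind(self, directory, where="after", create=False):
        entry = (directory, create)
        if where == "before":
            self.members.insert(0, entry)
        elif where == "after":
            self.members.append(entry)
        elif where == "replace":
            self.members = [entry]
        else:
            raise ValueError(where)

    def lookup(self, name):
        """Search each component in turn and take the first match."""
        for directory, _ in self.members:
            if name in directory:
                return directory[name]
        raise FileNotFoundError(name)

    def read(self):
        """List the contents of each component in turn."""
        return [name for directory, _ in self.members for name in directory]

    def create(self, name, contents):
        # Creating over an existing file succeeds wherever the file lives.
        for directory, _ in self.members:
            if name in directory:
                directory[name] = contents
                return directory
        # A new name goes to the first component with create permission.
        for directory, allows_create in self.members:
            if allows_create:
                directory[name] = contents
                return directory
        raise PermissionError("no directory in the union accepts new files")
```

Binding a private directory with create permission in front of a public one makes the private versions of files win on lookup and sends every newly named file to the private directory, which is the usage described above.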
By convention, kernel device file systems are bound into the /dev directory, but to
bootstrap the name space building process it is necessary to have a notation that
permits direct access to the devices without an existing name space. The root directory of
the tree served by a device driver can be accessed using the syntax #c, where c is a
unique character (typically a letter) identifying the type of the device. Simple device
drivers serve a single level directory containing a few files. As an example, each serial
port is represented by a data and a control file:
% bind -a '#t' /dev
% cd /dev
% ls -l eia*
--rw-rw-rw- t 0 bootes bootes 0 Feb 24 21:14 eia1
--rw-rw-rw- t 0 bootes bootes 0 Feb 24 21:14 eia1ctl
--rw-rw-rw- t 0 bootes bootes 0 Feb 24 21:14 eia2
--rw-rw-rw- t 0 bootes bootes 0 Feb 24 21:14 eia2ctl
The bind program is an encapsulation of the bind system call; its -a flag positions
the new directory at the end of the union. The data files eia1 and eia2 may be read
and written to communicate over the serial line. Instead of using special operations on
these files to control the devices, commands written to the files eia1ctl and
eia2ctl control the corresponding device; for example, writing the text string b1200
to /dev/eia1ctl sets the speed of that line to 1200 baud. Compare this to the UNIX
ioctl system call: in Plan 9, devices are controlled by textual messages, free of byte
order problems, with clear semantics for reading and writing. It is common to configure
or debug devices using shell scripts.
It is the universal use of the 9P protocol that connects Plan 9's components
together to form a distributed system. Rather than inventing a unique protocol for each
service such as rlogin, FTP, TFTP, and X windows, Plan 9 implements services in
terms of operations on file objects, and then uses a single, well-documented protocol to
exchange information between computers. Unlike NFS, 9P treats files as a sequence of
bytes rather than blocks. Also unlike NFS, 9P is stateful: clients perform remote
procedure calls to establish pointers to objects in the remote file server. These pointers are
called file identifiers or fids. All operations on files supply a fid to identify an object in
the remote file system.
The 9P protocol defines 17 messages, providing means to authenticate users,
navigate fids around a file system hierarchy, copy fids, perform I/O, change file attributes,
and create and delete files. Its complete specification is in Section 5 of the
Programmer's Manual [9man]. Here is the procedure to gain access to the name
hierarchy supplied by a server. A file server connection is established via a pipe or network
connection. An initial session message performs a bilateral authentication between
client and server. An attach message then connects a fid suggested by the client to
the root of the server file tree. The attach message includes the identity of the user
performing the attach; henceforth all fids derived from the root fid will have permissions
associated with that user. Multiple users may share the connection, but each must
perform an attach to establish his or her identity.
The walk message moves a fid through a single level of the file system hierarchy.
The clone message takes an established fid and produces a copy that points to the
same file as the original. Its purpose is to enable walking to a file in a directory without
losing the fid on the directory. The open message locks a fid to a specific file in the
hierarchy, checks access permissions, and prepares the fid for I/O. The read and
write messages allow I/O at arbitrary offsets in the file; the maximum size transferred
is defined by the protocol. The clunk message indicates the client has no further use
for a fid. The remove message behaves like clunk but causes the file associated with
the fid to be removed and any associated resources on the server to be deallocated.
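The way a server tracks fids through attach, clone, walk, open, read, and clunk can be sketched as follows. This is a toy model of the bookkeeping only: there is no message encoding, authentication, or permission checking, the file tree is a nested dictionary, and the class and method signatures are hypothetical rather than taken from the 9P specification.

```python
class FidServer:
    def __init__(self, tree):
        self.tree = tree      # nested dicts are directories, strings are files
        self.fids = {}        # fid -> tuple of path elements from the root
        self.opened = set()

    def _node(self, path):
        node = self.tree
        for name in path:
            node = node[name]
        return node

    def attach(self, fid):
        """Connect a client-chosen fid to the root of the tree."""
        self.fids[fid] = ()

    def clone(self, fid, newfid):
        """Make newfid point to the same file as fid."""
        self.fids[newfid] = self.fids[fid]

    def walk(self, fid, name):
        """Move fid down one level of the hierarchy."""
        directory = self._node(self.fids[fid])
        if not isinstance(directory, dict) or name not in directory:
            raise FileNotFoundError(name)
        self.fids[fid] += (name,)

    def open(self, fid):
        if fid not in self.fids:
            raise KeyError(fid)
        self.opened.add(fid)

    def read(self, fid, offset, count):
        if fid not in self.opened:
            raise PermissionError("fid is not open")
        return self._node(self.fids[fid])[offset:offset + count]

    def clunk(self, fid):
        """Forget a fid the client no longer needs."""
        del self.fids[fid]
        self.opened.discard(fid)
```

A client that wants a file below the root while keeping its root fid clones the root fid, walks the copy one element at a time, opens it, reads, and finally clunks the copy.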
9P has two forms: RPC messages sent on a pipe or network connection and a
procedural interface within the kernel. Since kernel device drivers are directly addressable,
there is no need to pass messages to communicate with them; instead each 9P
transaction is implemented by a direct procedure call. For each fid, the kernel maintains a local
representation in a data structure called a channel, so all operations on files performed
by the kernel involve a channel connected to that fid. The simplest example is a user
process's file descriptors, which are indexes into an array of channels. A table in the
kernel provides a list of entry points corresponding one to one with the 9P messages for
each device. A system call such as read from the user translates into one or more
procedure calls through that table, indexed by the type character stored in the channel:
procread, eiaread, etc. Each call takes at least one channel as an argument. A
special kernel driver, called the mount driver, translates procedure calls to messages,
that is, it converts local procedure calls to remote ones. In effect, this special driver
becomes a local proxy for the files served by a remote file server. The channel pointer
in the local call is translated to the associated fid in the transmitted message.
The mount driver is the sole RPC mechanism employed by the system. The
semantics of the supplied files, rather than the operations performed upon them, create a
particular service such as the cpu command. The mount driver demultiplexes protocol
messages between clients sharing a communication channel with a file server. For each
outgoing RPC message, the mount driver allocates a buffer labeled by a small unique
integer, called a tag. The reply to the RPC is labeled with the same tag, which is used by
the mount driver to match the reply with the request.
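Tag matching amounts to a small table of outstanding requests. The sketch below is a simplified model under stated assumptions: requests and replies are arbitrary Python values rather than 9P messages, tags are allocated as the smallest unused integer (the text says only that they are small and unique), and the TagMux name is hypothetical.

```python
class TagMux:
    def __init__(self):
        self.pending = {}    # tag -> request still awaiting its reply

    def send(self, request):
        """Label an outgoing request with the smallest unused tag."""
        tag = 0
        while tag in self.pending:
            tag += 1
        self.pending[tag] = request
        return tag

    def reply(self, tag, response):
        """Match an incoming reply to its request and free the tag."""
        request = self.pending.pop(tag)
        return request, response
```

Replies may arrive in any order; the tag alone identifies which request, and therefore which client, each reply belongs to.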
The kernel representation of the name space is called the mount table, which
stores a list of bindings between channels. Each entry in the mount table contains a pair
of channels: a from channel and a to channel. Every time a walk succeeds in moving a
channel to a new location in the name space, the mount table is consulted to see if a
'from' channel matches the new name; if so the 'to' channel is cloned and substituted
for the original. Union directories are implemented by converting the 'to' channel into a
list of channels: a successful walk to a union directory returns a 'to' channel that forms
the head of a list of channels, each representing a component directory of the union. If
a walk fails to find a file in the first directory of the union, the list is followed, the next
component cloned, and walk tried on that directory.
Each file in Plan 9 is uniquely identified by a set of integers: the type of the channel
(used as the index of the function call table), the server or device number distinguishing
the server from others of the same type (decided locally by the driver), and a qid formed
from two 32-bit numbers called path and version. The path is a unique file number
assigned by a device driver or file server when a file is created. The version number is
updated whenever the file is modified; as described in the next section, it can be used
to maintain cache coherency between clients and servers.
The type and device number are analogous to UNIX major and minor device
numbers; the qid is analogous to the i-number. The device and type connect the channel to
a device driver and the qid identifies the file within that device. If the file recovered from
a walk has the same type, device, and qid path as an entry in the mount table, they are
the same file and the corresponding substitution from the mount table is made. This is
how the name space is implemented.
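The substitution step can be modeled with a table keyed by file identity. In this sketch a channel is reduced to the triple (type, device, qid path) used for matching; the Chan tuple, the MountTable class, and the example numbers are hypothetical, and cloning of channels is omitted.

```python
from collections import namedtuple

# Identity of a file as seen by the mount table; the qid version is not compared.
Chan = namedtuple("Chan", "type dev path")


class MountTable:
    def __init__(self):
        self.entries = {}    # 'from' channel -> ordered list of 'to' channels

    def mount(self, from_chan, to_chan, after=False):
        targets = self.entries.setdefault(from_chan, [])
        if after:
            targets.append(to_chan)
        else:
            targets.insert(0, to_chan)

    def resolve(self, chan):
        """Return the channels to use after a walk has landed on chan."""
        return list(self.entries.get(chan, [chan]))
```

When resolve returns more than one channel the location is a union directory, and a walk that fails in the first component would be retried on each following one in order.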
File Caching
The 9P protocol has no explicit support for caching files on a client. The large
memory of the central file server acts as a shared cache for all its clients, which reduces
the total amount of memory needed across all machines in the network. Nonetheless,
there are sound reasons to cache files on the client, such as a slow connection to the file
server.
The version field of the qid is changed whenever the file is modified, which makes
it possible to do some weakly coherent forms of caching. The most important is client
caching of text and data segments of executable files. When a process execs a
program, the file is re-opened and the qid's version is compared with that in the cache; if
they match, the local copy is used. The same method can be used to build a local
caching file server. This user-level server interposes on the 9P connection to the remote
server and monitors the traffic, copying data to a local disk. When it sees a read of
known data, it answers directly, while writes are passed on immediately (the cache is
write-through) to keep the central copy up to date. This is transparent to processes on
the terminal and requires no change to 9P; it works well on home machines connected
over serial lines. A similar method can be applied to build a general client cache in
unused local memory, but this has not been done in Plan 9.
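The version check behind this kind of caching is simple to state in code. The sketch assumes the caller has already re-opened the file and learned its current qid path and version; the VersionCache class and the fetch callback are hypothetical names.

```python
class VersionCache:
    def __init__(self):
        self.cache = {}      # qid path -> (qid version, cached data)

    def load(self, qid_path, qid_version, fetch):
        """Use the local copy only if its version matches the server's."""
        hit = self.cache.get(qid_path)
        if hit is not None and hit[0] == qid_version:
            return hit[1]
        data = fetch()       # stale or missing: read from the server again
        self.cache[qid_path] = (qid_version, data)
        return data
```

The cache is only weakly coherent in the sense the text describes: a change on the server is noticed the next time the file is opened and its version compared, not at the moment the change happens.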
Networks and Communication Devices
Network interfaces are kernel-resident file systems, analogous to the EIA device
described earlier. Call setup and shutdown are achieved by writing text strings to the
control file associated with the device; information is sent and received by reading and
writing the data file. The structure and semantics of the devices are common to all
networks so, other than a file name substitution, the same procedure makes a call using
TCP over Ethernet as URP over Datakit [Fra80].
This example illustrates the structure of the TCP device:
% ls -lp /net/tcp
d-r-xr-xr-x I 0 bootes bootes 0 Feb 23 20:20 0
d-r-xr-xr-x I 0 bootes bootes 0 Feb 23 20:20 1
--rw-rw-rw- I 0 bootes bootes 0 Feb 23 20:20 clone
% ls -lp /net/tcp/0
--rw-rw---- I 0 rob bootes 0 Feb 23 20:20 ctl
--rw-rw---- I 0 rob bootes 0 Feb 23 20:20 data
--rw-rw---- I 0 rob bootes 0 Feb 23 20:20 listen
--r--r--r-- I 0 bootes bootes 0 Feb 23 20:20 local
--r--r--r-- I 0 bootes bootes 0 Feb 23 20:20 remote
--r--r--r-- I 0 bootes bootes 0 Feb 23 20:20 status
%
The top directory, /net/tcp, contains a clone file and a directory for each
connection, numbered 0 to n. Each connection directory corresponds to a TCP/IP connection.
Opening clone reserves an unused connection and returns its control file. Reading the
control file returns the textual connection number, so the user process can construct the
full name of the newly allocated connection directory. The local, remote, and
status files are diagnostic; for example, remote contains the address (for TCP, the
IP address and port number) of the remote side.
A call is initiated by writing a connect message with a network-specific address as
its argument; for example, to open a Telnet session (port 23) to a remote machine with
IP address 135.104.9.52, the string is:
connect 135.104.9.52!23
The write to the control file blocks until the connection is established; if the destination
is unreachable, the write returns an error. Once the connection is established, the
telnet application reads and writes the data file to talk to the remote Telnet
daemon. On the other end, the Telnet daemon would start by writing
announce 23
to its control file to indicate its willingness to receive calls to this port. Such a daemon
is called a listener in Plan 9.
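The client side of the procedure above comes down to a few string manipulations. The sketch below computes the names and the control message but performs no I/O, since the files exist only on a Plan 9 system; plan_call and its result keys are hypothetical.

```python
def plan_call(net_dir, connection_number, address):
    """Given the number read back from the control file, name the files to use."""
    conn_dir = net_dir + "/" + connection_number.strip()
    return {
        "ctl": conn_dir + "/ctl",            # write the connect message here
        "data": conn_dir + "/data",          # then read and write this file
        "message": "connect " + address,
    }
```

For the Telnet example, a client that opened /net/tcp/clone and read back the number 2 would write "connect 135.104.9.52!23" to /net/tcp/2/ctl and then converse through /net/tcp/2/data.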
A uniform structure for network devices cannot hide all the details of addressing
and communication for dissimilar networks. For example, Datakit uses textual,
hierarchical addresses unlike IP's 32-bit addresses, so an application given a control file must
still know what network it represents. Rather than make every application know the
addressing of every network, Plan 9 hides these details in a connection server, called
cs. Cs is a file system mounted in a known place. It supplies a single control file that
an application uses to discover how to connect to a host. The application writes the
symbolic address and service name for the connection it wishes to make, and reads
back the name of the clone file to open and the address to present to it. If there are
multiple networks between the machines, cs presents a list of possible networks and
addresses to be tried in sequence; it uses heuristics to decide the order. For instance, it
presents the highest-bandwidth choice first.
A single library function called dial talks to cs to establish the connection. An
application that uses dial needs no changes, not even recompilation, to adapt to new
networks; the interface to cs hides the details.
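The division of labor between cs and dial can be sketched with two functions. Everything concrete here is assumed for the example: the lookup tables, the /net/il path, the "address!port" answer format, and the try_connect callback that stands in for opening the clone file and writing the connect message. The real cs is a file server, not a function.

```python
def cs_translate(host, service, hosts, services):
    """Return (clone file, address) pairs in the order they should be tried."""
    answers = []
    for network, address in hosts[host]:     # assumed to be listed best-first
        port = services[(network, service)]
        answers.append(("/net/" + network + "/clone",
                        address + "!" + str(port)))
    return answers


def dial(host, service, hosts, services, try_connect):
    """Try each answer in turn and return the first that connects."""
    for clone_file, address in cs_translate(host, service, hosts, services):
        if try_connect(clone_file, address):
            return clone_file, address
    raise ConnectionError("no network reached " + host)
```

An application calls only dial with a symbolic host and service name. Adding a network means adding table entries on the cs side, which is why the application needs no change.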
The uniform structure for networks in Plan 9 makes the import command all that
is needed to construct gateways.
Kernel structure for networks
The kernel plumbing used to build Plan 9 communications channels is called
streams [Rit84][Presotto]. A stream is a bidirectional channel connecting a physical or
pseudo-device to a user process. The user process inserts and removes data at one end
of the stream; a kernel process acting on behalf of a device operates at the other end. A
stream comprises a linear list of processing modules. Each module has both an
upstream (toward the process) and downstream (toward the device) put routine. Calling
the put routine of the module on either end of the stream inserts data into the stream.
Each module calls the succeeding one to send data up or down the stream. Like UNIX
streams [Rit84], Plan 9 streams can be dynamically configured.
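A stream's chain of put routines can be modeled with linked objects. This is a structural sketch only: the Module and End classes, the transform callbacks, and the framing example are hypothetical, and nothing here reflects the kernel's actual data structures or its queuing.

```python
class Module:
    def __init__(self, transform_down, transform_up):
        self.transform_down = transform_down
        self.transform_up = transform_up
        self.above = None    # neighbor toward the process
        self.below = None    # neighbor toward the device

    def put_down(self, block):
        """Process a block and pass it to the next module toward the device."""
        self.below.put_down(self.transform_down(block))

    def put_up(self, block):
        """Process a block and pass it to the next module toward the process."""
        self.above.put_up(self.transform_up(block))


class End:
    """Either end of the stream; it just collects what reaches it."""
    def __init__(self):
        self.received = []
        self.above = None
        self.below = None

    def put_down(self, block):
        self.received.append(block)

    put_up = put_down


def build_stream(process_end, modules, device_end):
    """Link the ends and modules into one linear list."""
    chain = [process_end] + list(modules) + [device_end]
    for upper, lower in zip(chain, chain[1:]):
        upper.below = lower
        lower.above = upper
    return chain
```

Reconfiguring the stream dynamically corresponds to relinking the above and below references, for example to push a new module between two existing ones.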
The IL Protocol
The 9P protocol must run above a reliable transport protocol with delimited
messages. 9P has no mechanism to recover from transmission errors and the system
assumes that each read from a communication channel will return a single 9P message;
it does not parse the data stream to discover message boundaries. Pipes and some
network protocols already have these properties but the standard IP protocols do not. TCP
does not delimit messages, while UDP [RFC768] does not provide reliable in-order
delivery.
We designed a new protocol, called IL (Internet Link), to transmit 9P messages over
IP. It is a connection-based protocol that provides reliable transmission of sequenced
messages between machines. Since a process can have only a single outstanding 9P
request, there is no need for flow control in IL. Like TCP, IL has adaptive timeouts: it
scales acknowledge and retransmission times to match the network speed. This allows
the protocol to perform well on both the Internet and on local Ethernets. Also, IL does
no blind retransmission, to avoid adding to the congestion of busy networks. Full
details are in another paper [PrWi95].
In Plan 9, the implementation of IL is smaller and faster than TCP. IL is our main
Internet transport protocol.
Overview of authentication
Authentication establishes the identity of a user accessing a resource. The user
requesting the resource is called the client and the user granting access to the resource
is called the server. This is usually done under the auspices of a 9P attach message. A
user may be a client in one authentication exchange and a server in another. Servers
always act on behalf of some user, either a normal client or some administrative entity,
so authentication is defined to be between users, not machines.
Each Plan 9 user has an associated DES [NBS77] authentication key; the user's
identity is verified by the ability to encrypt and decrypt special messages called challenges.
Since knowledge of a user's key gives access to that user's resources, the Plan 9
authentication protocols never transmit a message containing a cleartext key.
Authentication is bilateral: at the end of the authentication exchange, each side is
convinced of the other's identity. Every machine begins the exchange with a DES key in
memory. In the case of CPU and file servers, the key, user name, and domain name for
the server are read from permanent storage, usually non-volatile RAM. In the case of
terminals, the key is derived from a password typed by the user at boot time. A special
machine, known as the authentication server, maintains a database of keys for all users
in its administrative domain and participates in the authentication protocols.
The authentication protocol is as follows: after exchanging challenges, one party
contacts the authentication server to create permission-granting tickets encrypted with
each party's secret key and containing a new conversation key. Each party decrypts its
own ticket and uses the conversation key to encrypt the other party's challenge.
This structure is somewhat like Kerberos [MBSS87], but avoids its reliance on
synchronized clocks. Also unlike Kerberos, Plan 9 authentication supports a 'speaks for'
relation [LABW91] that enables one user to have the authority of another; this is how a
CPU server runs processes on behalf of its clients.
Plan 9's authentication structure builds secure services rather than depending on
firewalls. Whereas firewalls require special code for every service penetrating the wall,
the Plan 9 approach permits authentication to be done in a single place, 9P, for all
services. For example, the cpu command works securely across the Internet.
Authenticating external connections
The regular Plan 9 authentication protocol is not suitable for text-based services
such as Telnet or FTP. In such cases, Plan 9 users authenticate with hand-held DES
calculators called authenticators. The authenticator holds a key for the user, distinct
from the user's normal authentication key. The user 'logs on' to the authenticator using
a 4-digit PIN. A correct PIN enables the authenticator for a challenge/response exchange
with the server. Since a correct challenge/response exchange is valid only once and
keys are never sent over the network, this procedure is not susceptible to replay attacks,
yet is compatible with protocols like Telnet and FTP.
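A minimal sketch of such a challenge/response exchange follows. It assumes details the
text does not give: HMAC-SHA-256 stands in for the DES computation done by the
calculator, the response is truncated to eight hexadecimal digits so that it could be
typed by hand, and the class and function names (Authenticator, Server, response_for)
are hypothetical.

```python
import hashlib
import hmac
import os


def response_for(key: bytes, challenge: str) -> str:
    """Compute the short response to a challenge (HMAC stands in for DES)."""
    digest = hmac.new(key, challenge.encode(), hashlib.sha256).hexdigest()
    return digest[:8]


class Authenticator:
    """Hypothetical hand-held device: holds a key, unlocked by a PIN."""

    def __init__(self, key: bytes, pin: str):
        self._key = key
        self._pin = pin
        self._unlocked = False

    def log_on(self, pin: str) -> bool:
        """Enable the device only if the PIN matches."""
        self._unlocked = hmac.compare_digest(pin, self._pin)
        return self._unlocked

    def respond(self, challenge: str) -> str:
        """Answer a challenge; refuses until a correct PIN was entered."""
        if not self._unlocked:
            raise PermissionError("authenticator is locked")
        return response_for(self._key, challenge)


class Server:
    """Hypothetical text-based service that knows each user's device key."""

    def __init__(self, keys):
        self._keys = dict(keys)
        self._pending = {}

    def challenge(self, user: str) -> str:
        """Issue a fresh numeric challenge and remember it for this user."""
        chal = str(int.from_bytes(os.urandom(4), "big"))
        self._pending[user] = chal
        return chal

    def verify(self, user: str, response: str) -> bool:
        """Check a response; the challenge is discarded, so it works once."""
        chal = self._pending.pop(user, None)
        if chal is None:
            return False
        expected = response_for(self._keys[user], chal)
        return hmac.compare_digest(response, expected)
```

Because the server forgets a challenge as soon as it has been checked, a response
captured on the wire cannot be replayed, and only the challenge and the response,
never the key, appear in the Telnet or FTP dialogue.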
Special users
Plan 9 has no super-user. Each server is responsible for maintaining its own security,
usually permitting access only from the console, which is protected by a password.
For example, file servers have a unique administrative user called adm, with special
privileges that apply only to commands typed at the server's physical console. These
privileges concern the day-to-day maintenance of the server, such as adding new users and
configuring disks and networks. The privileges do not include the ability to modify,
examine, or change the permissions of any files. If a file is read-protected by a user,
only that user may grant access to others.
CPU servers have an equivalent user name that allows administrative access to
resources on that server such as the control files of user processes. Such permission is
necessary, for example, to kill rogue processes, but does not extend beyond that server.
On the other hand, by means of a key held in protected non-volatile RAM, the identity of
the administrative user is proven to the authentication server. This allows the CPU
server to authenticate remote users, both for access to the server itself and when the
CPU server is acting as a proxy on their behalf.
Finally, a special user called none has no password and is always allowed to connect;
anyone may claim to be none. None has restricted permissions; for example, it
is not allowed to examine dump files and can read only world-readable files.
The idea behind none is analogous to the anonymous user in FTP services. On
Plan 9, guest FTP servers are further confined within a special restricted name space. It
disconnects guest users from system programs, such as the contents of /bin, but
makes it possible to make local files available to guests by binding them explicitly into
the space. A restricted name space is more secure than the usual technique of exporting
an ad hoc directory tree; the result is a kind of cage around untrusted users.
The cpu command and proxied authentication
When a call is made to a CPU server for a user, say Peter, the intent is that Peter
wishes to run processes with his own authority. To implement this property, the CPU
server does the following when the call is received. First, the listener forks off a process
to handle the call. This process changes to the user none to avoid giving away
permissions if it is compromised. It then performs the authentication protocol to verify that
the calling user really is Peter, and to prove to Peter that the machine is itself
trustworthy. Finally, it reattaches to all relevant file servers using the authentication
protocol to identify itself as Peter. In this case, the CPU server is a client of the file
server and performs the client portion of the authentication exchange on behalf of Peter.
The authentication server will give the process tickets to accomplish this only if the
CPU server's administrative user name is allowed to speak for Peter.
The speaks for relation [LABW91] is kept in a table on the authentication server. To
simplify the management of users computing in different authentication domains, it also
contains mappings between user names in different domains, for example saying that
user rtm in one domain is the same person as user rtmorris in another.
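As a rough illustration of what such a table has to answer, the following Python sketch
keeps the two kinds of entry side by side. The storage format of the real table is not
described here, so everything about the representation is an assumption: the class
AuthTable, its method names, the administrative user name bootes, and the domain names
research and campus are all hypothetical.

```python
class AuthTable:
    """Hypothetical in-memory model of the authentication server's table."""

    def __init__(self):
        self._speaks_for = set()  # (speaker, user) pairs
        self._canonical = {}      # (domain, user) -> representative identity

    def allow(self, speaker: str, user: str) -> None:
        """Record that speaker may act with the authority of user."""
        self._speaks_for.add((speaker, user))

    def may_speak_for(self, speaker: str, user: str) -> bool:
        """Assumption: every user may speak for themselves."""
        return speaker == user or (speaker, user) in self._speaks_for

    def map_names(self, a, b) -> None:
        """Record that (domain, user) identities a and b are one person."""
        root = self._find(a)
        self._canonical[self._find(b)] = root

    def _find(self, ident):
        """Follow mappings until reaching the representative identity."""
        while ident in self._canonical and self._canonical[ident] != ident:
            ident = self._canonical[ident]
        return ident

    def same_person(self, a, b) -> bool:
        """True if the two (domain, user) identities have been linked."""
        return self._find(a) == self._find(b)


table = AuthTable()
table.allow("bootes", "peter")
table.map_names(("research", "rtm"), ("campus", "rtmorris"))
```

In this model the authentication server would issue tickets to the CPU server's process
only when may_speak_for("bootes", "peter") holds, which matches the condition stated
above for proxied authentication.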
File Permissions
One of the advantages of constructing services as file systems is that the solutions
to ownership and permission problems fall out naturally. As in UNIX, each file or
directory has separate read, write, and execute/search permissions for the file's owner, the
file's group, and anyone else. The idea of group is unusual: any user name is potentially
a group name. A group is just a user with a list of other users in the group.
Conventions make the distinction: most people have user names without group members, while
groups have long lists of attached names. For example, the sys group traditionally has
all the system programmers, and system files are accessible by group sys. Consider
the following two lines of a user database stored on a server:
pjw:pjw:
sys::pjw,ken,philw,presotto
The first establishes user pjw as a regular user. The second establishes user sys as a
group and lists four users who are members of that group. The empty colon-separated
field is space for a user to be named as the group leader. If a group has a leader, that
user has special permissions for the group, such as freedom to change the group
permissions of files in that group. If no leader is specified, each member of the group is
considered equal, as if each were the leader. In our example, only pjw can add
members to his group, but all of sys's members are equal partners in that group.
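The leader rule can be made concrete with a short Python sketch. It assumes that each
line has exactly the three colon-separated fields shown in the example (name, leader,
comma-separated members); the function names parse_users and is_leader are hypothetical
and any further fields a real user database may carry are ignored.

```python
def parse_users(text: str) -> dict:
    """Parse lines of the form name:leader:member,member,... into a dict."""
    db = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, leader, members = line.split(":")
        db[name] = {
            "leader": leader or None,  # empty field means no leader named
            "members": [m for m in members.split(",") if m],
        }
    return db


def is_leader(db: dict, user: str, group: str) -> bool:
    """Apply the rule from the text: a named leader has the special
    permissions alone; with no leader, every member counts as leader."""
    entry = db[group]
    if entry["leader"] is not None:
        return user == entry["leader"]
    return user in entry["members"]


USERS = """\
pjw:pjw:
sys::pjw,ken,philw,presotto
"""

db = parse_users(USERS)
```

With the two example lines, is_leader(db, "pjw", "pjw") is true and no other user leads
pjw, while each of the four listed members is treated as a leader of sys.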
Regular files are owned by the user that creates them. The group name is inherited
from the directory holding the new file. Device files are treated specially: the kernel
may arrange the ownership and permissions of a file appropriate to the user accessing
the file.
A good example of the generality this offers is process files, which are owned and
read-protected by the owner of the process. If the owner wants to let someone else
access the memory of a process, for example to let the author of a program debug a
broken image, the standard chmod command applied to the process files does the job.
Another unusual application of file permissions is the dump file system, which is
not only served by the same file server as the original data, but represented by the same
user database. Files in the dump are therefore given identical protection as files in the
regular file system; if a file is owned by pjw and read-protected, once it is in the dump
file system it is still owned by pjw and read-protected. Also, since the dump file
system is immutable, the file cannot be changed; it is read-protected forever. Drawbacks
are that if the file is readable but should have been read-protected, it is readable
forever, and that user names are hard to re-use.
Performance
As a simple measure of the performance of the Plan 9 kernel, we compared the
time to do some simple operations on Plan 9 and on SGI's IRIX Release 5.3 running on
an SGI Challenge M with a 100MHz MIPS R4400 and a 1-megabyte secondary cache.
The test program was written in Alef, compiled with the same compiler, and run on
identical hardware, so the only variables are the operating system and libraries.
The program tests the time to do a context switch (rendezvous on Plan 9,
blockproc on IRIX); a trivial system call (rfork(0) and nap(0)); and lightweight
fork (rfork(RFPROC) and sproc(PR_SFDS|PR_SADDR)). It also measures the
time to send a byte on a pipe from one process to another and the throughput on a pipe
between two processes. The results appear in Table 1.
______________________________________________
Test               Plan 9         IRIX
______________________________________________
Context switch        39 µs        150 µs
System call            6 µs         36 µs
Light fork          1300 µs       2200 µs
Pipe latency         110 µs        200 µs
Pipe bandwidth     11678 KB/s    14545 KB/s
______________________________________________
Table 1. Performance comparison.
Although the Plan 9 times are not spectacular, they show that the kernel is competitive
with commercial systems.
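To put the figures in Table 1 side by side, the ratios can be computed directly. The
short Python sketch below only restates the table; the variable and function names are
hypothetical, and the ratios are arithmetic on the published numbers rather than new
measurements.

```python
# Figures transcribed from Table 1, as (Plan 9, IRIX) pairs.
TIMES_US = {
    "Context switch": (39, 150),
    "System call": (6, 36),
    "Light fork": (1300, 2200),
    "Pipe latency": (110, 200),
}
PIPE_BANDWIDTH_KBS = (11678, 14545)


def time_ratios(times: dict) -> dict:
    """Return IRIX time / Plan 9 time per test (above 1: Plan 9 is faster)."""
    return {name: irix / plan9 for name, (plan9, irix) in times.items()}


def bandwidth_ratio(bandwidth: tuple) -> float:
    """Return Plan 9 pipe throughput as a fraction of IRIX throughput."""
    plan9, irix = bandwidth
    return plan9 / irix


if __name__ == "__main__":
    for name, ratio in time_ratios(TIMES_US).items():
        print(f"{name:15s} IRIX/Plan 9 = {ratio:.2f}")
    share = bandwidth_ratio(PIPE_BANDWIDTH_KBS)
    print(f"{'Pipe bandwidth':15s} Plan 9/IRIX = {share:.2f}")
```

On these figures the four timed operations take between roughly 1.7 and 6 times longer
on IRIX than on Plan 9, while Plan 9 reaches about 80% of the IRIX pipe throughput.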
Discussion
Plan 9 has a relatively conventional kernel; the system's novelty lies in the pieces
outside the kernel and the way they interact. When building Plan 9, we considered all
aspects of the system together, solving problems where the solution fit best.
Sometimes the solution spanned many components. An example is the problem of
heterogeneous instruction architectures, which is addressed by the compilers (different code
characters, portable object code), the environment ($cputype and $objtype), the
name space (binding in /bin), and other components. Sometimes many issues could
be solved in a single place. The best example is 9P, which centralizes naming, access,
and authentication. 9P is really the core of the system; it is fair to say that the Plan 9
kernel is primarily a 9P multiplexer.
Plan 9's focus on files and naming is central to its expressiveness. Particularly in
distributed computing, the way things are named has profound influence on the system
[Nee89]. The combination of local name spaces and global conventions to interconnect
networked resources avoids the difficulty of maintaining a global uniform name space,
while naming everything like a file makes the system easy to understand, even for
novices. Consider the dump file system, which is trivial to use for anyone familiar with
hierarchical file systems. At a deeper level, building all the resources above a single
uniform interface makes interoperability easy. Once a resource exports a 9P interface, it
can combine transparently with any other part of the system to build unusual
applications; the details are hidden. This may sound object-oriented, but there are
distinctions. First, 9P defines a fixed set of 'methods'; it is not an extensible protocol. More
important, files are well-defined and well-understood and come prepackaged with
familiar methods of access, protection, naming, and networking. Objects, despite their
generality, do not come with these attributes defined. By reducing 'object' to 'file', Plan
9 gets some technology for free.
Nonetheless, it is possible to push the idea of file-based computing too far.
Converting every resource in the system into a file system is a kind of metaphor, and
metaphors can be abused. A good example of restraint is /proc, which is only a view of a
process, not a representation. To run processes, the usual fork and exec calls are
still necessary, rather than doing something like
cp /bin/date /proc/clone/mem
The problem with such examples is that they require the server to do things not under
its control. The ability to assign meaning to a command like this does not imply the
meaning will fall naturally out of the structure of answering the 9P requests it generates.
As a related example, Plan 9 does not put machines' network names in the file name
space. The network interfaces provide a very different model of naming, because using
open, create, read, and write on such files would not offer a suitable place to
encode all the details of call setup for an arbitrary network. This does not mean that the
network interface cannot be file-like, just that it must have a more tightly defined
structure.
What would we do differently next time? Some elements of the implementation are
unsatisfactory. Using streams to implement network interfaces in the kernel allows
protocols to be connected together dynamically, such as to attach the same TTY driver to
TCP, URP, and IL connections, but Plan 9 makes no use of this configurability. (It was
exploited, however, in the research UNIX system for which streams were invented.)
Replacing streams by static I/O queues would simplify the code and make it faster.
Although the main Plan 9 kernel is portable across many machines, the file server
is implemented separately. This has caused several problems: drivers that must be
written twice, bugs that must be fixed twice, and weaker portability of the file system code.
The solution is easy: the file server kernel should be maintained as a variant of the
regular operating system, with no user processes and special compiled-in kernel processes
to implement file service. Another improvement to the file system would be a change of
internal structure. The WORM jukebox is the least reliable piece of the hardware, but
because it holds the metadata of the file system, it must be present in order to serve
files. The system could be restructured so the WORM is a backup device only, with the
file system proper residing on magnetic disks. This would require no change to the
external interface.
Although Plan 9 has per-process name spaces, it has no mechanism to give the
description of a process's name space to another process except by direct inheritance.
The cpu command, for example, cannot in general reproduce the terminal's name
space; it can only re-interpret the user's login profile and make substitutions for things
like the name of the binary directory to load. This misses any local modifications made
before running cpu. It should instead be possible to capture the terminal's name space
and transmit its description to a remote process.
Despite these problems, Plan 9 works well. It has matured into the system that
supports our research, rather than being the subject of the research itself. Experimental
new work includes developing interfaces to faster networks, file caching in the client
kernel, encapsulating and exporting name spaces, and the ability to re-establish the
client state after a server crash. Attention is now focusing on using the system to build
distributed applications.
One reason for Plan 9's success is that we use it for our daily work, not just as a
research tool. Active use forces us to address shortcomings as they arise and to adapt
the system to solve our problems. Through this process, Plan 9 has become a
comfortable, productive programming environment, as well as a vehicle for further systems
research.
References
[9man] Plan 9 Programmer's Manual, Volume 1, AT&T Bell Laboratories, Murray Hill, NJ, 1995.
[ANSIC] American National Standard for Information Systems - Programming Language C,
American National Standards Institute, Inc., New York, 1990.
[Duff90] Tom Duff, "Rc - A Shell for Plan 9 and UNIX systems", Proc. of the Summer 1990 UKUUG
Conf., London, July, 1990, pp. 21-33, reprinted, in a different form, in this volume.
[Fra80] A.G. Fraser, "Datakit - A Modular Network for Synchronous and Asynchronous Traffic",
Proc. Int. Conf. on Commun., June 1980, Boston, MA.
[FSSUTF] File System Safe UCS Transformation Format (FSS-UTF), X/Open Preliminary
Specification, 1993. ISO designation is ISO/IEC JTC1/SC2/WG2 N 1036, dated 1994-08-01.
[ISO10646] ISO/IEC DIS 10646-1:1993 Information technology - Universal Multiple-Octet Coded
Character Set (UCS) - Part 1: Architecture and Basic Multilingual Plane.
[Kil84] T.J. Killian, "Processes as Files", USENIX Summer 1984 Conf. Proc., June 1984, Salt Lake
City, UT.
[LABW91] Butler Lampson, Martín Abadi, Michael Burrows, and Edward Wobber, "Authentication in
Distributed Systems: Theory and Practice", Proc. 13th ACM Symp. on Op. Sys. Princ.,
Asilomar, 1991, pp. 165-182.
[MBSS87] S. P. Miller, B. C. Neuman, J. I. Schiller, and J. H. Saltzer, "Kerberos Authentication and
Authorization System", Massachusetts Institute of Technology, 1987.
[NBS77] National Bureau of Standards (U.S.), Federal Information Processing Standard 46,
National Technical Information Service, Springfield, VA, 1977.
[Nee89] R. Needham, "Names", in Distributed systems, S. Mullender, ed., Addison Wesley, 1989.
[NeHe82] R.M. Needham and A.J. Herbert, The Cambridge Distributed Computing System,
Addison-Wesley, London, 1982.
[Neu92] B. Clifford Neuman, "The Prospero File System", USENIX File Systems Workshop Proc.,
Ann Arbor, 1992, pp. 13-28.
[OCDNW88] John Ousterhout, Andrew Cherenson, Fred Douglis, Mike Nelson, and Brent Welch,
"The Sprite Network Operating System", IEEE Computer, 21(2), 23-38, Feb. 1988.
[Pike87] Rob Pike, "The Text Editor sam", Software - Practice and Experience, Nov 1987, 17(11),
pp. 813-845; reprinted in this volume.
[Pike91] Rob Pike, "8½, the Plan 9 Window System", USENIX Summer Conf. Proc., Nashville, June,
1991, pp. 257-265, reprinted in this volume.
[Pike93] Rob Pike and Ken Thompson, "Hello World or Καλημέρα κόσμε or こんにちは 世界",
USENIX Winter Conf. Proc., San Diego, 1993, pp. 43-50, reprinted in this volume.
[Pike94] Rob Pike, "Acme: A User Interface for Programmers", USENIX Proc. of the Winter 1994
Conf., San Francisco, CA,
[Pike95] Rob Pike, "How to Use the Plan 9 C Compiler", Plan 9 Programmer's Manual, Volume 2,
AT&T Bell Laboratories, Murray Hill, NJ, 1995.
[POSIX] Information Technology - Portable Operating System Interface (POSIX) Part 1: System
Application Program Interface (API) [C Language], IEEE, New York, 1990.
[PPTTW93] Rob Pike, Dave Presotto, Ken Thompson, Howard Trickey, and Phil Winterbottom, "The
Use of Name Spaces in Plan 9", Op. Sys. Rev., Vol. 27, No. 2, April 1993, pp. 72-76,
reprinted in this volume.
[Presotto] Dave Presotto, "Multiprocessor Streams for Plan 9", UKUUG Summer 1990 Conf. Proc.,
July 1990, pp. 11-19.
[PrWi93] Dave Presotto and Phil Winterbottom, "The Organization of Networks in Plan 9", USENIX
Proc. of the Winter 1993 Conf., San Diego, CA, pp. 43-50, reprinted in this volume.
[PrWi95] Dave Presotto and Phil Winterbottom, "The IL Protocol", Plan 9 Programmer's Manual,
Volume 2, AT&T Bell Laboratories, Murray Hill, NJ, 1995.
[RFC768] J. Postel, RFC768, User Datagram Protocol, DARPA Internet Program Protocol
Specification, August 1980.
[RFC793] RFC793, Transmission Control Protocol, DARPA Internet Program Protocol Specification,
September 1981.
[Rao91] Herman Chung-Hwa Rao, The Jade File System, (Ph. D. Dissertation), Dept. of Comp. Sci,
University of Arizona, TR 91-18.
[Rit84] D.M. Ritchie, "A Stream Input-Output System", AT&T Bell Laboratories Technical
Journal, 63(8), October, 1984.
[Tric95] Howard Trickey, "APE - The ANSI/POSIX Environment", Plan 9 Programmer's Manual,
Volume 2, AT&T Bell Laboratories, Murray Hill, NJ, 1995.
[Unicode] The Unicode Standard, Worldwide Character Encoding, Version 1.0, Volume 1, The
Unicode Consortium, Addison Wesley, New York, 1991.
[UNIX85] UNIX Time-Sharing System Programmer's Manual, Research Version, Eighth Edition,
Volume 1. AT&T Bell Laboratories, Murray Hill, NJ, 1985.
[Welc94] Brent Welch, "A Comparison of Three Distributed File System Architectures: Vnode,
Sprite, and Plan 9", Computing Systems, 7(2), pp. 175-199, Spring, 1994.
[Wint95] Phil Winterbottom, "Alef Language Reference Manual", Plan 9 Programmer's Manual,
Volume 2, AT&T Bell Laboratories, Murray Hill, NJ, 1995.