12.7 Addition of a machine to a network
specified in dotted decimal notation or as a 4-byte hexadecimal number beginning
with 0x. In either case, bits set to 1 are part of the network number, and bits set to 0
are part of the host number.
The broadcast option specifies the IP broadcast address for the interface, expressed
in either hex or dotted quad notation. The correct broadcast address is one in which
the host part is set to all 1s, and most systems default to this value; they use the
netmask and IP address to calculate the broadcast address.
On Linux, you can set the broadcast address to any IP address that's valid for the
network to which the host is attached. Some sites have chosen weird values for the
broadcast address in the hope of avoiding certain types of denial of service attacks
that are based on broadcast pings. We dislike this approach for several reasons.
First, it requires you to reset the broadcast address on every host on the local
network, a chore that can be time-consuming on a large net. Second, it requires you to
be absolutely sure that you reconfigure every host; otherwise, broadcast storms, in
which packets travel from machine to machine until their TTLs expire, can erupt.
Broadcast storms occur because the same link-layer broadcast address must be used
to transport packets no matter what the IP broadcast address has been set to. For
example, suppose that machine X thinks the broadcast address is A1 and machine
Y thinks it is A2. If X sends a packet to address A1, Y will receive the packet (because
the link-layer destination address is the broadcast address), will see that the
packet is not for itself and also not for the broadcast address (because Y thinks the
broadcast address is A2), and will then forward the packet back to the net [18]. If two
machines are in Y's state, the packet circulates until its TTL expires. Broadcast
storms can erode your bandwidth, especially on a large switched net.
A better way to avoid problems with broadcast pings is to prevent your border routers
from forwarding them and to tell individual hosts not to respond to them. See
page 316 for instructions on how to implement these constraints.
In the ifconfig example at the beginning of this section, the broadcast address is
192.168.1.255 because the network is a /24, as specified by the netmask value of
255.255.255.0.
Executing ifconfig shows the following output:
    redhat$ /sbin/ifconfig eth0
    eth0  Link encap:Ethernet  HWaddr 00:02:B3:19:C8:86
          inet addr:192.168.1.13  Bcast:192.168.1.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:206983 errors:0 dropped:0 overruns:0 frame:0
          TX packets:218292 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:7 Base address:0xef00
18. Machine Y must be configured with ip_forwarding turned on for this to happen.
Chapter 12 - TCP/IP Networking
The lack of collisions on the Ethernet interface in this example may indicate a very
lightly loaded network or, more likely, a switched network. On a shared network
(built with hubs instead of switches), you should check this number to ensure that it
is below about 5% of the output packets. Lots of collisions indicate a loaded network
that needs to be watched and possibly split into multiple subnets or migrated to a
switched infrastructure.
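The 5% rule of thumb is easy to check mechanically. The following Python sketch is our illustration, not part of any standard tool: the function name collision_rate and the regular expressions are assumptions that match the "TX packets:" and "collisions:" fields in the ifconfig output format shown above; other ifconfig versions may format these fields differently.

```python
import re


def collision_rate(ifconfig_output: str) -> float:
    """Return collisions as a percentage of transmitted packets.

    Assumes the older net-tools layout shown in the text
    ("TX packets:N ..." and "collisions:N ...").
    """
    tx = re.search(r"TX packets:(\d+)", ifconfig_output)
    coll = re.search(r"collisions:(\d+)", ifconfig_output)
    if tx is None or coll is None:
        raise ValueError("unrecognized ifconfig output")
    tx_packets = int(tx.group(1))
    if tx_packets == 0:
        return 0.0  # nothing sent yet, so no meaningful rate
    return 100.0 * int(coll.group(1)) / tx_packets


# The counters from the eth0 example in the text.
SAMPLE = """\
RX packets:206983 errors:0 dropped:0 overruns:0 frame:0
TX packets:218292 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
"""

if __name__ == "__main__":
    rate = collision_rate(SAMPLE)
    verdict = "ok" if rate < 5.0 else "investigate"
    print(f"collision rate: {rate:.2f}% ({verdict})")
```

On a real system you would feed the function the captured output of /sbin/ifconfig for the interface in question rather than a hardcoded sample.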
Let's look at some complete examples.
    # ifconfig lo 127.0.0.1 up
This command configures the loopback interface, which doesn't usually require any
options to be set. You should never need to change your system's default configuration
for this interface. The implied netmask of 255.0.0.0 is correct and does not need
to be manually overridden.
    # ifconfig eth0 128.138.243.151 netmask 255.255.255.192 \
        broadcast 128.138.243.191 up
This is a typical example for an Ethernet interface. The IP and broadcast addresses
are set to 128.138.243.151 and 128.138.243.191, respectively. The network is class B
(you can tell from the first byte of the address), but it has been subnetted by an
additional 10 bits into a /26 network. The 192 in the netmask is 11000000 in binary and so
adds 2 extra bits to the 24 contained in the three 255 octets. The 191 in the broadcast
address is 10111111 in binary, which sets all 6 host bits to 1s and indicates that this
interface is part of the 3rd network (first two bits 10) in the group of 4 carved out of
the 4th octet. (This is the kind of situation in which an IP calculator comes in handy!)
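As a stand-in for an IP calculator, the arithmetic above can be checked with Python's standard ipaddress module. This is only a sketch for verifying the numbers in this example; the variable names are ours.

```python
import ipaddress

# Address and netmask from the ifconfig example in the text.
iface = ipaddress.ip_interface("128.138.243.151/255.255.255.192")
net = iface.network

# Number of host bits left over after the netmask.
host_bits = 32 - net.prefixlen

# Zero-based position of this subnet among those carved out of the 4th octet.
subnet_index = (int(net.network_address) & 0xFF) >> host_bits

if __name__ == "__main__":
    print("network:   ", net)
    print("broadcast: ", net.broadcast_address)
    print("host bits: ", host_bits)
    print("subnet no.:", subnet_index + 1, "of", 2 ** (8 - host_bits))
```

The module reports a /26 network with broadcast address 128.138.243.191, 6 host bits, and the third of four subnets, which agrees with the hand calculation.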
Now that you know how to configure a network interface by hand, you need to figure
out how the parameters to ifconfig are set when the machine boots, and you need to
make sure that the new values are entered correctly. You normally do this by editing
one or more configuration files; see the vendor-specific sections starting on page 307
for more information.
mii-tool: configure autonegotiation and other media-specific options
Occasionally, network hardware has configurable options that are specific to its media
type. One extremely common example of this is modern-day Ethernet, wherein
an interface card may support 10, 100, or even 1000 Mb/s in both half duplex and
full duplex modes. Most equipment defaults to autonegotiation mode, in which both
the card and its upstream connection (usually a switch port) try to guess what the
other wants to use.
Historically, autonegotiation has worked about as well as a blindfolded cowpoke trying
to rope a calf. More recently, vendor network devices play better together, but
autonegotiation is still a common source of failure. High packet loss rates (especially
for large packets) are a common artifact of failed autonegotiation.
The best way to avoid this pitfall is to lock the interface speed and duplex both on
servers and on the switch ports to which they are connected. Autonegotiation is useful
for ports in public areas where roving laptops may stop for a visit, but it serves no
useful purpose for statically attached hosts. If you're having problems with mysterious
packet loss, turn off autonegotiation everywhere as your first course of action.
Under Linux, the mii-tool command queries and sets media-specific parameters
such as link speed and duplex. You can query the status of an interface with the -v
flag. For example, this eth0 interface has autonegotiation enabled:
    $ mii-tool -v eth0
    eth0: negotiated 100baseTx-FD flow-control, link ok
      product info: vendor 00:10:5a, model 0 rev 0
      basic mode: autonegotiation enabled
      basic status: autonegotiation complete, link ok
      capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
      advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
      link partner: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
To lock this interface to 100 Mb/s full duplex, use the command
    # mii-tool --force=100baseTx-FD eth0
Add this command to a system startup script to make it permanent. Afterward, the
status query returns
    $ mii-tool -v eth0
    eth0: 100 Mbit, full duplex, link ok
      product info: vendor 00:10:5a, model 0 rev 0
      basic mode: 100 Mbit, full duplex
      basic status: link ok
      capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
      advertising: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
route: configure static routes
The route command defines static routes, explicit routing table entries that never
change (you hope), even if you run a routing daemon. When you add a new machine
to a local area network, you usually need to specify only a default route; see the
next section for details.
This book's discussion of routing is split between this section and Chapter 13, Routing.
Although most of the basic information about routing and the route command
is here, you might find it helpful to read the first few sections of Chapter 13 if you
need more information.
Routing is performed at the IP layer. When a packet bound for some other host arrives,
the packet's destination IP address is compared with the routes in the kernel's
routing table. If the address matches or partially matches a route in the table, the
packet is forwarded to the "next-hop gateway" IP address associated with that route.
There are two special cases. First, a packet may be destined for some host on a directly
connected network. In this case, the "next-hop gateway" address in the routing
table is one of the local host's own interfaces, and the packet is sent directly to its
destination. This type of route is added to the routing table for you by the ifconfig
command when you configure an interface.
Second, it may be that no route matches the destination address. In this case, the
default route is invoked if one exists. Otherwise, an ICMP "network unreachable" or
"host unreachable" message is returned to the sender. Many local area networks
have only one way out, and their default route points to it. On the Internet backbone,
the routers do not have default routes; the buck stops there. If they do not have a
routing entry for a destination, that destination cannot be reached.
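The lookup just described can be modeled in a few lines of Python. This is a simplified sketch, not the kernel's implementation: the table layout and the lookup function are our own, and we assume the usual rule that the most specific (longest-prefix) matching route wins, with the default route (0.0.0.0/0) matching everything as a last resort. The sample entries mirror a host with one Ethernet interface and a default gateway of 192.168.1.254.

```python
import ipaddress

# (destination prefix, next-hop gateway or None if directly connected, interface)
ROUTES = [
    ("192.168.1.0/24", None, "eth0"),
    ("127.0.0.0/8", None, "lo"),
    ("0.0.0.0/0", "192.168.1.254", "eth0"),  # default route
]


def lookup(dest, routes=ROUTES):
    """Return (gateway, interface) for dest, or None if no route matches."""
    addr = ipaddress.ip_address(dest)
    best = None
    for prefix, gateway, iface in routes:
        net = ipaddress.ip_network(prefix)
        # Keep the matching route with the longest prefix seen so far.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, gateway, iface)
    if best is None:
        return None  # a real host would return an ICMP "unreachable" error
    return best[1], best[2]


if __name__ == "__main__":
    print(lookup("192.168.1.13"))            # directly connected network
    print(lookup("128.138.243.151"))         # falls through to the default route
    print(lookup("128.138.243.151", ROUTES[:2]))  # no default route: no match
```

A gateway of None stands for the first special case (direct delivery on a connected network); a result of None stands for the second case when no default route exists.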
Each route command adds or removes one route. The format is
    route [op] [type] destination gw gateway [metric] [interface]
The op argument should be add to add a route, del to remove one, and omitted to
display the routing tables. destination can be a host address (type -host), a network
address (type -net), or the keyword default. If destination is a network address, you
should also specify a netmask.
The gateway is the machine to which packets should be forwarded. It must be on a
directly connected network; forwarding can only be performed one hop at a time.
Linux lets you specify an interface instead of (or along with) the gateway. The dev
keyword in the interface specification is optional and can be omitted.
metric is the number of forwardings (the hop count) required to reach the destination.
Linux does not require or use the hop count, but if you set it, Linux keeps the
value in the routing tables so that routing protocols can use it.
The optional type argument supports host routes, which apply to a complete IP address
(a specific host) rather than to a network address. The values -net and -host
are accepted for the type parameter. If a type isn't specified, route checks the host
part of the destination address to see if it's zero. If the host part is 0 or the address is a
network defined in the /etc/networks file, then the route is assumed to be a normal
network route [19].
Since route cannot magically know which network numbers have been subnetted,
you must frequently use the type field to install certain routes. For example, the address
128.138.243.0 refers to a subnetted class B network at our site, but to route it
looks like a class B address of 128.138 with a host part of 243.0; you must specify the
-net option to deconfuse route. In general, it's good hygiene to provide an explicit
type for all routes that involve subnets.
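To see why 128.138.243.0 is mistaken for a host, it helps to model the guess. The Python sketch below is our simplified reconstruction of the heuristic described above, not route's actual code: it derives the classful network size from the first octet and checks whether the remaining host part is zero. It deliberately ignores the /etc/networks lookup, and the function names are hypothetical.

```python
import ipaddress


def classful_prefix(addr: ipaddress.IPv4Address) -> int:
    """Return the network prefix length implied by the address class."""
    first_octet = int(addr) >> 24
    if first_octet < 128:
        return 8    # class A
    if first_octet < 192:
        return 16   # class B
    if first_octet < 224:
        return 24   # class C
    raise ValueError("class D/E addresses have no host part")


def guess_route_type(dest: str) -> str:
    """Guess 'net' or 'host' the way a subnet-unaware tool would."""
    addr = ipaddress.IPv4Address(dest)
    host_bits = 32 - classful_prefix(addr)
    host_part = int(addr) & ((1 << host_bits) - 1)
    return "net" if host_part == 0 else "host"


if __name__ == "__main__":
    for dest in ("128.138.0.0", "128.138.243.0", "192.168.1.0"):
        print(dest, "->", guess_route_type(dest))
```

The subnetted address 128.138.243.0 has a nonzero classful host part (243.0), so the guess comes out as "host"; an explicit -net avoids the problem.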
route del destination removes a specific entry from the routing table. Other UNIX
systems have an option to route, usually -f or -flush, that completely flushes the
routing tables and starts over. Linux does not support this option, so you might be
faced with many route dels to clean out a large routing table. Be sure you are logged
in locally or you may end up half done and disconnected!
19. /etc/networks can map names to network numbers, much like the /etc/hosts file maps hostnames to
complete IP addresses. Many commands that expect a network number can accept a network name if it
is listed in the /etc/networks file (or in DNS).
To inspect existing routes, use the command netstat -nr or netstat -r if you want to
see names instead of numbers. Numbers are often better if you are debugging, since
the name lookup may be the thing that is broken.
    redhat$ netstat -nr
    Kernel IP routing table
    Destination  Gateway        Genmask        Flags  MSS  Window  irtt  Iface
    192.168.1.0  0.0.0.0        255.255.255.0  U      0    0       0     eth0
    127.0.0.0    0.0.0.0        255.0.0.0      U      0    0       0     lo
    0.0.0.0      192.168.1.254  0.0.0.0        UG     0    0       0     eth0
    redhat$ netstat -r
    Kernel IP routing table
    Destination  Gateway        Genmask        Flags  MSS  Window  irtt  Iface
    192.168.1.0  *              255.255.255.0  U      0    0       0     eth0
    127.0.0.0    *              255.0.0.0      U      0    0       0     lo
    default      sprint-gw      0.0.0.0        UG     0    0       0     eth0
The Genmask is the netmask associated with the destination. The Flags specify the
status of the route, how it was learned, and other parameters. Finally, the Iface is the
interface through which packets using that route are sent. These examples are from a
Red Hat system, but SUSE and Debian are identical except that Debian doesn't show
the loopback route by default.
Default routes
A default route causes all packets whose destination network is not found in the kernel's
routing table to be sent to the indicated gateway. To set a default route, simply
add the following line to your startup files:
    route add default gw gateway-IP-address
Rather than hardcoding an explicit IP address into the startup files, most vendors
have their systems get the gateway IP address from a configuration file. The way that
local routing information is integrated into the startup sequence is unfortunately
different for each of our Linux systems (hurry LSB, fix this not-invented-here syndrome!).
Table 12.10 summarizes the necessary incantations.
Table 12.10 How to set the default route
System            File to change            Variable to change
Red Hat, Fedora   /etc/sysconfig/network    GATEWAY
SUSE              /etc/route.conf           add line: default IP-addr mask interface
Debian, Ubuntu    /etc/network/interfaces   gateway
DNS configuration
To configure a machine as a DNS client, you need to edit only one or two files: all
systems require /etc/resolv.conf to be modified, and some require you to modify a
"service switch" file as well.
The /etc/resolv.conf file lists the DNS domains that should be searched to resolve
names that are incomplete (that is, not fully qualified, such as anchor instead of
anchor.cs.colorado.edu) and the IP addresses of the name servers to contact for name
lookups. A sample is shown here; for more details, see page 418.
    search cs.colorado.edu colorado.edu
    nameserver 128.138.242.1
    nameserver 128.138.243.151
    nameserver 192.108.21.1
/etc/resolv.conf should list the "closest" stable name server first because the server
in the first position will be contacted first. You can have up to three nameserver
entries. If possible, you should always have more than one. The timeout period for a
DNS query to a particular name server seems quite long, so if the first name server
does not respond, your users will notice.
You will sometimes see a domain line instead of a search line. Such a line indicates
either an ancient resolv.conf file that has not been updated to use the search directive
or an ancient resolver that doesn't understand search. domain defines only the
current domain, whereas search accepts up to six different domains to query. Thus,
search is preferred.
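The limits mentioned above (three nameserver entries, six search domains) can be made concrete with a small parser. The following Python sketch is illustrative only and is not the resolver library's code: the function name parse_resolv_conf and the constants are our own, and we assume that a later search or domain line overrides an earlier one and that nameserver lines beyond the third are ignored.

```python
MAX_NAMESERVERS = 3  # limit stated in the text
MAX_SEARCH = 6       # limit stated in the text


def parse_resolv_conf(text):
    """Return (search_domains, nameservers) from resolv.conf-style text."""
    search, servers = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # skip blank lines and comments
        keyword, *args = line.split()
        if keyword == "search":
            search = args[:MAX_SEARCH]
        elif keyword == "domain":
            search = args[:1]  # domain names only the current domain
        elif keyword == "nameserver" and args and len(servers) < MAX_NAMESERVERS:
            servers.append(args[0])
    return search, servers


# The sample file from the text.
SAMPLE = """\
search cs.colorado.edu colorado.edu
nameserver 128.138.242.1
nameserver 128.138.243.151
nameserver 192.108.21.1
"""

if __name__ == "__main__":
    domains, servers = parse_resolv_conf(SAMPLE)
    print("search order:", domains)
    print("name servers:", servers)
```

Run against the sample file, the parser yields two search domains and three name servers, in the order in which they will be tried.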
(See Chapter 17 for more information about NIS.)
Also, some ancient systems do not use DNS by default, even if a properly configured
resolv.conf file exists. These systems have a "service switch" file that determines
which mechanisms will be used to resolve hostname-to-IP-address mappings.
Prioritization of information sources is covered in more detail starting on page 515, but
we mention the topic here as well, since it sometimes foils your attempts to configure
a legacy machine.
The service switch file lets you specify the order in which DNS, NIS, and /etc/hosts
should be consulted. In most cases, you can also rule out certain sources of data
entirely. Your choice of order impacts the machine's ability to boot and the way in
which booting interacts with the contents of the /etc/hosts file.
If DNS is chosen as the first data source to consult, you may need to have a name
server on the local network and have its hostname and IP address in the hosts file in
order for everything to work at boot time.
Table 12.11 lists the location of the relevant config files and the default configuration
for host lookups on each of our example systems.
The Linux networking stack
(See page 727 for more information about virtual interfaces.)
The networking stack in Linux kernels 2.2 and above supports virtual network
interfaces and selective acknowledgements (or SACKs, as they are called). Kernels 2.4
and up implement explicit congestion notification (ECN).
ECN marks TCP packets to inform the remote peer of congestion problems instead
of letting dropped packets serve as the only indication that something has gone
wrong. ECN was originally specified in RFC2481 (January 1999) and is now a proposed
standard documented in RFC3168. RFC2884 (July 2000) included an evaluation
of ECN's performance. It found that ECN benefited a variety of network transactions.
Linux is always one of the first networking stacks to include new features. The Linux
folks are sometimes so quick that the rest of the networking infrastructure cannot
interoperate. For example, the Linux ECN feature (which is on by default) collided
with incorrect default settings on an older Cisco firewall product, causing all packets
with the ECN bit set to be dropped. Oops.
Linux developers love to tinker, and they often implement features and algorithms
that aren't yet accepted standards. One example is the Linux 2.6.13 addition of pluggable
congestion control algorithms. The several options include variations for lossy
networks, high-speed WANs with lots of packet loss, satellite links, and more. The
standard TCP "reno" mechanism (slow start, congestion avoidance, fast retransmit,
and fast recovery) is still used by default, but a variant may be more appropriate for
your environment.
12.8 DISTRIBUTION-SPECIFIC NETWORK CONFIGURATION
Chapter 2 describes the details of our example systems' booting procedures. In the
next few sections, we simply summarize the chores that are related to configuring a
network. Our example systems automatically configure the loopback interface; you
should never need to modify that part of the configuration. Beyond that, each system
is a bit different.
Table 12.11 Service switch files by system
System    Switch file           Default for hostname lookups
Ubuntu    /etc/nsswitch.conf    files dns mdns (a)
          /etc/host.conf        hosts, bind
Others    /etc/nsswitch.conf    files dns
          /etc/host.conf        hosts, bind
a. mdns = multicast DNS, a somewhat uncommon protocol that allows DNS-like name
resolution on a small network with no local DNS server.
Four files are common to each of our example systems: /etc/hosts, /etc/resolv.conf,
/etc/nsswitch.conf, and /etc/host.conf. These were covered in the generic network
configuration sections above and, except for resolv.conf and possibly hosts, usually
do not need to be modified when you add a machine to the network.
After any change to a file that controls network configuration at boot time, you may
need to either reboot or bring the network interface down and then up again for your
change to take effect. On all of our example distributions you can use the ifup and
ifdown commands.
Network configuration for Red Hat and Fedora
Table 12.12 shows the Red Hat and Fedora network configuration files.
You set the machine's hostname in /etc/sysconfig/network, which also contains
lines that specify the machine's DNS domain and default gateway. For example, here
is a network file for a host with a single Ethernet interface:
    NETWORKING=yes
    HOSTNAME=redhat.toadranch.com
    DOMAINNAME=toadranch.com        ### optional
    GATEWAY=192.168.1.254
Interface-specific data is stored in /etc/sysconfig/network-scripts/ifcfg-ifname,
where ifname is the name of the network interface. These configuration files let you
set the IP address, netmask, network, and broadcast address for each interface. They
also include a line that specifies whether the interface should be configured "up" at
boot time.
Typically, files for an Ethernet interface (eth0) and for the loopback interface (lo) are
present. For example,
    DEVICE=eth0
    IPADDR=192.168.1.13
    NETMASK=255.255.255.0
    NETWORK=192.168.1.0
    BROADCAST=192.168.1.255
    ONBOOT=yes
Table 12.12 Red Hat and Fedora network configuration files
File in /etc/sysconfig          What's set there
network                         Hostname, default route
static-routes                   Static routes
network-scripts/ifcfg-ifname    Per-interface parameters: IP address, netmask, etc.
and
    DEVICE=lo
    IPADDR=127.0.0.1
    NETMASK=255.0.0.0
    NETWORK=127.0.0.0
    BROADCAST=127.255.255.255
    ONBOOT=yes
    NAME=loopback
are the ifcfg-eth0 and ifcfg-lo files for the machine redhat.toadranch.com described
in the network file earlier in this section.
A couple of handy scripts facilitate interface management. ifup and ifdown accept
the name of a network interface as an argument and bring the specified interface up
or down. After changing network information in any of the /etc/sysconfig directories,
be sure to run ifdown ifname followed by ifup ifname. Better yet, reboot the
system to be sure your changes don't cause some kind of subtle problem. There are
no man pages for ifup and ifdown, but they are shell scripts (kept in /sbin), so you
can take a look and see what they do in detail.
If you need to manage all the interfaces at once, run the /etc/rc.d/init.d/network
script, which accepts the arguments start, stop, restart, and status. This script is
invoked at boot time with the start argument.
The startup scripts can also configure static routes. Any routes added to the file
/etc/sysconfig/static-routes are entered into the routing table at boot time. The entries
specify arguments to route add, although in mixed-up order (the interface is
first instead of last):
    eth0 net 130.225.204.48 netmask 255.255.255.248 gw 130.225.204.49
    eth1 net 192.38.8.0 netmask 255.255.255.224 gw 192.38.8.129
The interface is specified first, followed by arguments to route: the route type (net
or host), the target network, the netmask associated with that network, and finally,
the next-hop gateway. The keyword gw is required. Current Linux kernels do not
use the metric parameter to route but allow it to be entered and maintained in the
routing table for routing daemons to use. The static-routes example above would
produce the following route commands:
route add -net 130.225.204.48 netmask 255.255.255.248 gw 130.225.204.49 eth0
route add -net 192.38.8.0 netmask 255.255.255.224 gw 192.38.8.129 eth1
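The reordering is mechanical, as the following Python sketch shows. It is our illustration of the mapping between a static-routes entry and the equivalent route command, not the code that the Red Hat startup scripts actually run; the function name static_route_to_command is hypothetical.

```python
def static_route_to_command(line: str) -> str:
    """Convert one static-routes entry into the equivalent route command.

    Entry layout (from the text): interface, route type (net or host),
    then the remaining arguments to "route add".
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"malformed static-routes entry: {line!r}")
    iface, route_type, rest = fields[0], fields[1], fields[2:]
    if route_type not in ("net", "host"):
        raise ValueError(f"unknown route type: {route_type!r}")
    # The interface moves from the front of the entry to the end of the command.
    return " ".join(["route", "add", f"-{route_type}", *rest, iface])


ENTRIES = [
    "eth0 net 130.225.204.48 netmask 255.255.255.248 gw 130.225.204.49",
    "eth1 net 192.38.8.0 netmask 255.255.255.224 gw 192.38.8.129",
]

if __name__ == "__main__":
    for entry in ENTRIES:
        print(static_route_to_command(entry))
```

Applied to the two sample entries, the function reproduces the two route commands shown above.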
Network configuration for SUSE
Table 12.13 shows the network configuration files used by SUSE.
SUSE has a unique network configuration scheme. With the exceptions of DNS
parameters and the system hostname, SUSE sets most networking configuration
options in ifcfg-interface files in the /etc/sysconfig/network directory. One file should
be present for each interface on the system.
For a real network interface (that is, not the loopback), the filename has the extended
form ifcfg-interface-id-MAC, where MAC is the hardware address of the network
interface. (ifcfg-eth-id-00:0c:29:d4:ea:26 is an example.)
In addition to specifying the IP address, gateway, and broadcast information for an
interface, the ifcfg-* files can tune many other network dials; the ifcfg.template file
is a well-commented rundown of the possible parameters.
SUSE's YaST tool includes a mother-in-law-ready interface for configuring the network.
It works well, and we recommend it for managing the ifcfg-* files whenever
possible. If you must configure the network manually, here's a simple template with
our comments:
    BOOTPROTO='static'               # Static is implied but it doesn't hurt to be verbose.
    IPADDR='192.168.1.4/24'          # The /24 defines the NETWORK and NETMASK vars
    NAME='AMD PCnet - Fast 79C971'   # Used to start and stop the interface.
    STARTMODE='auto'                 # Start automatically at boot
    USERCONTROL='no'                 # Disable control through kinternet/cinternet GUI
Global static routing information for a SUSE system (including the default route) is
stored in the routes file. Each line in this file is like a route command with the command
name omitted: destination, gateway, netmask, interface, and optional extra parameters
to be stored in the routing table for use by routing daemons. For the host
configured above, which has only a default route, the routes file contains the line
    default 192.168.1.1 - -
Routes unique to specific interfaces are kept in ifroute-interface files, where the
nomenclature of the interface component is the same as for the ifcfg-* files. The contents
have the same format as the routes file.
Network configuration for Debian and Ubuntu
As shown in Table 12.14, Debian and Ubuntu configure the network mostly in
/etc/hostname and /etc/network/interfaces, with a bit of help from the file
/etc/network/options.
The hostname is set in /etc/hostname. The name in this file should be fully qualified;
its value is used in a variety of contexts, some of which require that. However,
the standard Debian installation leaves a short name there.
Table 12.13 SUSE network configuration files in /etc/sysconfig/network
File                 What's set there
ifcfg-interface      Hostname, IP address, netmask, and more
ifroute-interface    Interface-specific route definitions
routes               Default route and static routes for all interfaces
config               Lots of less commonly used network variables
The IP address, netmask, and default gateway are set in /etc/network/interfaces. A
line starting with the iface keyword introduces each interface. The iface line can be
followed by indented lines that specify additional parameters. For example:
    iface lo inet loopback
    iface eth0 inet static
        address 192.168.1.102
        netmask 255.255.255.0
        gateway 192.168.1.254
The ifup and ifdown commands read this file and bring the interfaces up or down
by calling lower-level commands (such as ifconfig) with the appropriate parameters.
The inet keyword in the iface line is the address family; this is always inet. The
keyword static is called a "method" and specifies that the IP address and netmask
for eth0 are directly assigned. The address and netmask lines are required for static
configurations; earlier versions of the Linux kernel also required the network address
to be specified, but now the kernel is smarter and can figure out the network address
from the IP address and netmask. The gateway line specifies the address of the default
network gateway and is used to install a default route.
The options file lets you set networking variables at boot time. By default, Debian
turns IP forwarding off, spoof protection on, and syn cookies off.
12.9 DHCP: THE DYNAMIC HOST CONFIGURATION PROTOCOL
(DHCP is defined in RFCs 2131 and 2132.)
Linux hosts have historically required manual configuration to be added to a network.
When you plug a Mac or PC into a network, it just works. Why can't Linux do
that? The Dynamic Host Configuration Protocol (DHCP) brings this reasonable
expectation several steps closer to reality.
The protocol enables a DHCP client to "lease" a variety of network and administrative
parameters from a central server that is authorized to distribute them. The leasing
paradigm is particularly convenient for PCs that are turned off when not in use and
for ISPs that have intermittent dial-up customers.
Table 12.14 Debian and Ubuntu network configuration files
File                       What's set there
/etc/hostname              Hostname
/etc/network/interfaces    IP address, netmask, default route
/etc/network/options       Low-level network options (IP forwarding, etc.)
Leasable parameters include
· IP addresses and netmasks
· Gateways (default routes)
· DNS name servers
· Syslog hosts
· WINS servers, X font servers, proxy servers, NTP servers
· TFTP servers (for loading a boot image)
and dozens more (see RFC2132). Real-world use of the more exotic parameters is
rare, however. In many cases, a DHCP server supplies only basic networking parameters
such as IP addresses, netmasks, default gateways, and name servers.
Clients must report back to the DHCP server periodically to renew their leases. If a
lease is not renewed, it eventually expires. The DHCP server is then free to assign the
address (or whatever was being leased) to a different client. The lease period is
configurable, but it's usually quite long (hours or days).
DHCP can save a formerly hapless sysadmin a lot of time and suffering. Once the
server is up and running, clients can use it to automatically obtain their network
configuration at boot time. No fuss, no mess.
DHCP software
Linux distributions historically shipped a variety of different DHCP servers and clients.
These days, they have all more or less standardized on the reference implementation
from the Internet Systems Consortium, ISC. The ISC server also speaks the
BOOTP protocol, which is similar in concept to DHCP, but older and less sophisticated.
The DHCP client software is installed by default on all modern distributions,
but you must sometimes install additional packages to get the ISC server and relay
agent up and running.
DHCP clients initiate conversations with the DHCP server by using the generic all-1s
broadcast address; the clients don't yet know their subnet masks and therefore
cannot use the subnet broadcast address.
ISC's DHCP server speaks the DNS dynamic update protocol. Not only does the
server give your host its IP address and other networking parameters but it also updates
the DNS database with the correct hostname-to-IP-address mapping. See
page 448 for more information about dynamic DNS updates.
In the next few sections, we briefly discuss the DHCP protocol, explain how to set up
the ISC server that implements it, and then discuss some client configuration issues.
How DHCP works
DHCP is a backward-compatible extension of BOOTP, a protocol that was originally
devised to enable diskless UNIX workstations to boot. BOOTP supplies clients with
their IP address, netmask, default gateway, and TFTP booting information. DHCP
generalizes the parameters that can be supplied and adds the "lease" concept.
A DHCP client begins its interaction with a DHCP server by broadcasting a "Help!
Who am I?" message. If a DHCP server is present on the local network, it negotiates
with the client to lease it an IP address and provides other networking parameters
(netmask, name server information, and default gateway). If there is no DHCP server
on the local net, servers on different subnets can also receive the initial broadcast
message from a proxy called a "relay agent."
When the client's lease time is half over, it will renew the lease. The server is obliged
to keep track of the addresses it has handed out, and this information must persist
across reboots. Clients are supposed to keep their lease state across reboots too,
although many do not. The goal is to maximize stability in network configuration.
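The renewal rule can be written down as a tiny state function. This Python sketch is only a model of the timing described above (renew at the halfway point, expire at the end); the function name lease_state and the three state labels are ours and do not correspond to any DHCP client's internals.

```python
def lease_state(granted_at: float, lease_seconds: float, now: float) -> str:
    """Classify a lease as 'valid', 'renew', or 'expired' at time now.

    All times are in seconds on the same clock.
    """
    elapsed = now - granted_at
    if elapsed >= lease_seconds:
        return "expired"   # server may hand the address to another client
    if elapsed >= lease_seconds / 2:
        return "renew"     # half the lease is gone: time to renew
    return "valid"


if __name__ == "__main__":
    # A 600-second lease granted at time 0, checked at a few points.
    for t in (100, 300, 599, 600):
        print(t, lease_state(0, 600, t))
```

With a 600-second lease, the client would start trying to renew after 300 seconds and lose the address at 600 seconds if renewal never succeeds.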
Incidentally, DHCP is normally not used to configure dial-up PPP interfaces. PPP's
own PPPCP (PPP Control Protocol) typically fills that role.
ISC's DHCP server
To configure the DHCP server, dhcpd, edit the sample dhcpd.conf file from the
server directory and install it in /etc/dhcpd.conf [20]. You must also create an empty
lease database file called /var/db/dhcp.leases; use the touch command. Make sure
that dhcpd can write to this file. To set up the dhcpd.conf file, you need the following
information:
· The subnets for which dhcpd should manage IP addresses, and the ranges
  of addresses to dole out
· The initial and maximum lease durations, in seconds
· Configurations for BOOTP clients if you have any (they have static IP
  addresses and must have their MAC-level hardware address listed as well)
· Any other options the server should pass to DHCP clients: netmask,
  default route, DNS domain, name servers, etc.
The dhcpd man page reviews the configuration process. The dhcpd.conf man page
covers the exact syntax of the config file. Both are located in the distribution's server
subdirectory. Some distributions include a sample dhcpd.conf file in the /etc
directory; change it to match your local site's network configuration.
dhcpd should be started automatically at boot time. You may find it helpful to make
the startup of the daemon conditional on the existence of /etc/dhcpd.conf.
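The conditional startup suggested above might be sketched as a small shell function. This is only an illustration, not taken from any distribution's boot scripts: the function name dhcpd_start_action is made up, the default paths are the ones used in this section, and the function merely reports what a boot script should do rather than launching the daemon.

```shell
# Hypothetical helper for a boot script: decide whether dhcpd should be started.
# Prints "start" or "skip"; it never launches the daemon itself.
dhcpd_start_action() {
    conf=${1:-/etc/dhcpd.conf}          # config file (path used in this section)
    leases=${2:-/var/db/dhcp.leases}    # lease database (path used in this section)

    if [ ! -f "$conf" ]; then
        echo "skip"                     # no config file: leave dhcpd alone
        return 0
    fi
    # Create an empty lease database only on a first-time setup, that is,
    # when neither the live file nor a backup copy (name ending in ~) exists.
    if [ ! -e "$leases" ] && [ ! -e "${leases}~" ]; then
        touch "$leases"
    fi
    echo "start"
}
```

A boot script could then run something like `[ "$(dhcpd_start_action)" = start ] && dhcpd`, assuming dhcpd is on root's search path.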
Here's a sample dhcpd.conf file from a Linux box with two interfaces: one internal
and one that connects to the Internet. This machine performs NAT for the internal
network and leases out a range of 10 IP addresses on this network as well.
The dhcpd.conf file contains a dummy entry for the external interface (required)
and a host entry for one particular machine that needs a fixed address.
# dhcpd.conf
#
# global options
option domain-name "synack.net";
option domain-name-servers gw.synack.net;
option subnet-mask 255.255.255.0;
default-lease-time 600;
max-lease-time 7200;
subnet 192.168.1.0 netmask 255.255.255.0 {
    range 192.168.1.51 192.168.1.60;
    option broadcast-address 192.168.1.255;
    option routers gw.synack.net;
}
subnet 209.180.251.0 netmask 255.255.255.0 {
}
host gandalf {
    hardware ethernet 08:00:07:12:34:56;
    fixed-address gandalf.synack.net;
}
20. Be careful: the dhcpd.conf file format is a bit fragile. Leave out a semicolon, and you'll receive an
obscure, unhelpful error message.
See Chapter 15 for more information about DNS.
Addresses assigned by DHCP can conflict with the contents of the DNS database.
Sites often assign a generic name to each dynamically leased address
(e.g., dhcp1.synack.net) and allow the names of individual machines to "float"
with their IP addresses. If you are running a recent version of BIND that supports
dynamic updates, you can also configure dhcpd to update the DNS database as it
hands out addresses. The dynamic update solution is more complicated, but it has
the advantage of preserving each machine's hostname.
dhcpd records each lease transaction in the file dhcp.leases. It also periodically
backs up this file by renaming it to dhcp.leases~ and recreating the dhcp.leases
file from its in-memory database. If dhcpd were to crash during this operation, you
might end up with only a dhcp.leases~ file. In that case, dhcpd will refuse to start
and you will have to rename the file before restarting it. Do not just create an empty
dhcp.leases file, or chaos will ensue as clients end up with duplicate addresses.
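The recovery step described above can be sketched as a shell function. Treat it as an illustration only: the function name recover_dhcp_leases is hypothetical, the file names follow the dhcp.leases naming used in this section, and the default directory /var/db is the one mentioned earlier. The function restores the backup only when the live file is missing, and it deliberately never creates an empty lease file.

```shell
# Hypothetical recovery helper: put the backup lease file back in place
# if dhcpd crashed between renaming the old file and writing the new one.
recover_dhcp_leases() {
    dir=${1:-/var/db}                 # directory that holds the lease files
    live="$dir/dhcp.leases"
    backup="$dir/dhcp.leases~"

    if [ -e "$live" ]; then
        echo "ok"                     # nothing to repair
    elif [ -e "$backup" ]; then
        mv "$backup" "$live" && echo "restored"
    else
        echo "missing"                # no lease data at all: investigate by hand
        return 1
    fi
}
```

Run it as root before restarting dhcpd; a result of "missing" means there is no lease history to restore and the situation needs a human decision.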
The DHCP client does not really require configuration. It stores status files for each
connection in the directory /var/lib/dhcp or /var/lib/dhclient. The files are named
after the interfaces they describe. For example, dhclient-eth0.leases would contain
all the networking parameters that dhclient had set for the eth0 interface.
12.10 DYNAMIC RECONFIGURATION AND TUNING
Linux has its own special way of tuning kernel and networking parameters. Instead
of supplying a regular configuration file that is read to determine appropriate
values, Linux puts a representation of each variable that can be tuned into the /proc
virtual filesystem. The networking variables are in /proc/sys/net/ipv4:
$ cd /proc/sys/net/ipv4; ls -F
conf/                              ip_local_port_range     tcp_mem
icmp_echo_ignore_all               ip_nonlocal_bind        tcp_moderate_rcvbuf
icmp_echo_ignore_broadcasts        ip_no_pmtu_disc         tcp_no_metrics_save
icmp_errors_use_inbound_ifaddr     neigh/                  tcp_orphan_retries
icmp_ignore_bogus_error_responses  route/                  tcp_reordering
icmp_ratelimit                     tcp_abc                 tcp_retrans_collapse
icmp_ratemask                      tcp_abort_on_overflow   tcp_retries1
igmp_max_memberships               tcp_adv_win_scale       tcp_retries2
igmp_max_msf                       tcp_app_win             tcp_rfc1337
inet_peer_gc_maxtime               tcp_congestion_control  tcp_rmem
inet_peer_gc_mintime               tcp_dma_copybreak       tcp_sack
inet_peer_maxttl                   tcp_dsack               tcp_stdurg
inet_peer_minttl                   tcp_ecn                 tcp_synack_retries
inet_peer_threshold                tcp_fack                tcp_syncookies
ip_autoconfig                      tcp_fin_timeout         tcp_syn_retries
ip_default_ttl                     tcp_frto                tcp_timestamps
ip_dynaddr                         tcp_keepalive_intvl     tcp_tso_win_divisor
ip_forward                         tcp_keepalive_probes    tcp_tw_recycle
ipfrag_high_thresh                 tcp_keepalive_time      tcp_tw_reuse
ipfrag_low_thresh                  tcp_low_latency         tcp_window_scaling
ipfrag_max_dist                    tcp_max_orphans         tcp_wmem
ipfrag_secret_interval             tcp_max_syn_backlog
ipfrag_time                        tcp_max_tw_buckets
Many of the variables with rate and max in their names are used to thwart denial of
service attacks. The conf subdirectory contains variables that are set per interface.
It contains subdirectories all and default and a subdirectory for each interface
(including the loopback). Each subdirectory contains the same set of files.
$ cd conf/default; ls -F
accept_redirects     bootp_relay         log_martians         rp_filter
accept_source_route  disable_policy      mc_forwarding        secure_redirects
arp_announce         disable_xfrm        medium_id            send_redirects
arp_filter           force_igmp_version  promote_secondaries  shared_media
arp_ignore           forwarding          proxy_arp            tag
If you change something in the all subdirectory, your change applies to all interfaces.
If you change the same variable in, say, the eth0 subdirectory, only that interface is
affected. The default subdirectory contains the default values as shipped.
The neigh directory contains a subdirectory for each interface. The files in each
subdirectory control ARP table management and IPv6 neighbor discovery for that
interface. Here is the list of variables; the ones starting with gc (for garbage
collection) determine how ARP table entries are timed out and discarded.
$ cd neigh/default; ls -F
anycast_delay           gc_interval    locktime       retrans_time_ms
app_solicit             gc_stale_time  mcast_solicit  ucast_solicit
base_reachable_time     gc_thresh1     proxy_delay    unres_qlen
base_reachable_time_ms  gc_thresh2     proxy_qlen
delay_first_probe_time  gc_thresh3     retrans_time
To see the value of a variable, use cat; to set it, use echo redirected to the proper
filename. For example, the command
$ cat icmp_echo_ignore_broadcasts
0
shows that this variable's value is 0, meaning that broadcast pings are not ignored.
To set it to 1 (and avoid falling prey to smurf-type denial of service attacks), run
$ sudo sh -c "echo 1 > icmp_echo_ignore_broadcasts"
from the /proc/sys/net/ipv4 directory.[21] You are typically logged in over the same
network you are tweaking as you adjust these variables, so be careful! You can mess
things up badly enough to require a reboot from the console to recover, which might
be inconvenient if the system happens to be in Point Barrow, Alaska, and it's January.
Test-tune these variables on your desktop system before you even think of attacking a
production machine.
To change any of these parameters permanently (or more accurately, to reset them
every time the system boots), add the appropriate variables to /etc/sysctl.conf,
which is read by the sysctl command at boot time. The format of the sysctl.conf file
is variable=value rather than echo value > variable as you would run from the shell
to change the variable by hand. Variable names are pathnames relative to /proc/sys;
you can also use dots instead of slashes if you prefer. For example, either of the lines
net.ipv4.ip_forward=0
net/ipv4/ip_forward=0
in the /etc/sysctl.conf file would cause IP forwarding to be turned off (for this host).
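The correspondence between sysctl.conf variable names and files under /proc/sys can be illustrated with a one-line shell function. The function name sysctl_name_to_path is made up for this sketch, and the sketch assumes that no component of the name itself contains a dot (an interface named eth0.100, for instance, would be mangled).

```shell
# Hypothetical helper: map a sysctl.conf variable name (dotted or slashed)
# to the corresponding file under /proc/sys.
sysctl_name_to_path() {
    # tr turns every dot into a slash; names already written with
    # slashes pass through unchanged.
    printf '/proc/sys/%s\n' "$(printf '%s' "$1" | tr '.' '/')"
}

# Both spellings name the same file:
#   sysctl_name_to_path net.ipv4.ip_forward   # prints /proc/sys/net/ipv4/ip_forward
#   sysctl_name_to_path net/ipv4/ip_forward   # prints /proc/sys/net/ipv4/ip_forward
```

Feeding the result to cat is a quick way to confirm that a line in sysctl.conf took effect after a reboot.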
The document /usr/src/linux/Documentation/proc.txt, written by the SUSE
folks, is a nice primer on kernel tuning with /proc.[22] It tells you what the variables
really mean and sometimes provides suggested values. The proc.txt file is a bit out
of date; the Linux coders seem to write faster than the documenters.
12.11 SECURITY ISSUES
We address the topic of security in a chapter of its own (Chapter 20), but several
security issues relevant to IP networking merit discussion here. In this section, we
briefly look at a few networking features that have acquired a reputation for causing
security problems, and we recommend ways to minimize their impact. The details of our
example Linux systems' default behavior on these issues (and appropriate methods for
changing them) are covered later in this section.
IP forwarding
A Linux box that has IP forwarding enabled can act as a router. Unless your system
has multiple network interfaces and is actually supposed to function as a router, it's
advisable to turn this feature off. Hosts that forward packets can sometimes be
coerced into compromising security by making external packets appear to have come
from inside your network. This subterfuge can help naughty packets evade network
scanners and packet filters.
21. If you try this command in the form sudo echo 1 > icmp_echo_ignore_broadcasts, you just generate
a "permission denied" message; your shell attempts to open the output file before it runs sudo. You
want the sudo to apply to both the echo command and the redirection. Ergo, you must create a root
subshell in which to execute the entire command.
22. To have a copy of proc.txt available, you must install the kernel source code.
ICMP redirects
ICMP redirects can be used maliciously to reroute traffic and mess with your routing
tables. Most operating systems listen to them and follow their instructions by default.
It would be bad if all your traffic were rerouted to a competitor's network for a few
hours, especially while backups were running! We recommend that you configure
your routers (and hosts acting as routers) to ignore and perhaps log ICMP redirects.
Source routing
IP's source routing mechanism lets you specify an explicit series of gateways for a
packet to transit on the way to its destination. Source routing bypasses the next-hop
routing algorithm that's normally run at each gateway to determine how a packet
should be forwarded.
Source routing was part of the original IP specification; it was intended primarily to
facilitate testing. It can create security problems because packets are often filtered
according to their origin. If someone can cleverly route a packet to make it appear to
have originated within your network instead of the Internet, it might slip through
your firewall. We recommend that you neither accept nor forward source-routed
packets.
Broadcast pings and other forms of directed broadcast
Ping packets addressed to a network's broadcast address (instead of to a particular
host address) are typically delivered to every host on the network. Such packets have
been used in denial of service attacks; for example, the so-called smurf attacks. Most
hosts have a way to disable broadcast pings; that is, the host can be configured not
to respond to or forward broadcast pings. Your Internet router can also filter out
broadcast pings before they reach your internal network. It's a good idea to use both
host and firewall-level security measures if you can.
Broadcast pings are a form of "directed broadcast" in that they are packets sent to
the broadcast address of a distant network. The default handling of such packets has
been gradually changing. For example, versions of Cisco's IOS up through 11.x
forwarded directed broadcast packets by default, but IOS releases since 12.0 do not. It
is usually possible to convince your TCP/IP stack to ignore broadcast packets that come
from afar, but since this behavior must be set on each interface, this can be a
nontrivial task at a large site.
IP spoofing
The source address on an IP packet is normally filled in by the kernel's TCP/IP
implementation and is the IP address of the host from which the packet was sent.
However, if the software creating the packet uses a raw socket, it can fill in any source
address it likes. This is called IP spoofing and is usually associated with some kind
of malicious network behavior. The machine identified by the spoofed source IP
address (if it is a real address) is often the victim in the scheme. Error and return
packets can disrupt or flood the victim's network connections.
You should deny IP spoofing at your border router by blocking outgoing packets
whose source address is not within your address space. This precaution is especially
important if your site is a university where students like to experiment and often feel
vindictive toward "jerks" on their favorite chat channels.
At the same time, if you are using private address space internally, you can filter to
catch any internal addresses escaping to the Internet. Such packets can never be
answered (owing to the lack of a backbone route) and usually indicate that your site
has an internal configuration error.
With Linux-based firewalls, described in the next section, you can implement such
filtering per host. However, most sites prefer to implement this type of filtering at
their border routers rather than at each host. This is the approach we recommend as
well. We describe host-based firewalls only for completeness and for use in special
situations.
You must also protect against a hacker forging the source address on external
packets to fool your firewall into thinking that they originated on your internal network.
The kernel parameter rp_filter (settable in the /proc/sys/net/ipv4/conf/ifname
directory) can help you detect such packets; the rp stands for reverse path. If you set
this variable to 1, the kernel discards packets that arrive on an interface that is
different from the one on which they would leave if the source address were the
destination. This behavior is turned on by default.
If your site has multiple connections to the Internet, it may be perfectly reasonable
for inbound and outbound routes to be different. In this situation, set rp_filter to 0
to make your routing protocol work properly. If your site has only one way out to the
Internet, then setting rp_filter to 1 is usually safe and appropriate.
Host-based firewalls
Linux includes packet filtering (aka "firewall") software. Although we describe this
software later in this chapter (page 319) and also in the Security chapter (page 701),
we don't really recommend using a workstation as a firewall. The security of Linux
(especially as shipped by our friendly vendors) is weak, and security on Windows is
even worse. We suggest that you buy a dedicated hardware solution to use as a
firewall. Even a sophisticated software solution like Check Point's FireWall-1 product
(which runs on a Solaris host) is not as good as a piece of dedicated hardware such as
Cisco's PIX box, and it's almost the same price!
A more thorough discussion of firewall-related issues begins on page 701.
Virtual private networks
Many organizations that have offices in several parts of the world would like to have
all those locations connected to one big private network. Unfortunately, the cost of
leasing a transoceanic or even a transcontinental data line can be prohibitive. Such
organizations can actually use the Internet as if it were a private data line by
establishing a series of secure, encrypted "tunnels" among their various locations. A
"private" network that includes such tunnels is known as a virtual private network or VPN.
See page 709 for more information about IPsec.
Some VPNs use the IPsec protocol, which was standardized by the IETF in 1998.
Others use proprietary solutions that don't usually interoperate with each other. If
you need VPN functionality, we suggest that you look at products like Cisco's 3660
router or the Watchguard Firebox, both of which can do tunneling and encryption.
The Watchguard device uses PPP to a serial port for management. A sysadmin can
dial into the box to configure it or to access the VPN for testing.
For a low-budget VPN solution, see the example on page 328 that uses PPP over an
ssh connection to implement a virtual private network.
Security-related kernel variables
Table 12.15 shows Linux's default behavior with regard to various touchy network
issues. For a brief description of the implications of these behaviors, see page 316.
We recommend that you change the values of these variables so that you do not
answer broadcast pings, do not listen to routing redirects, and do not accept
source-routed packets.
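Before changing anything, it helps to see the current values. The read-only sketch below prints the security-related variables named in this section. The function name show_net_security is made up, and the choice of the all subdirectory for the per-interface variables is an assumption; substitute a specific interface name if you prefer. The optional argument exists only so that the function can be pointed at a directory other than /proc/sys/net.

```shell
# Hypothetical read-only report of the security-related network variables.
# Nothing is modified; variables that do not exist are reported as such.
show_net_security() {
    base=${1:-/proc/sys/net}          # directory that holds the ipv4 subtree
    for f in ipv4/ip_forward \
             ipv4/icmp_echo_ignore_broadcasts \
             ipv4/conf/all/accept_redirects \
             ipv4/conf/all/accept_source_route
    do
        if [ -r "$base/$f" ]; then
            printf '%s = %s\n' "$f" "$(cat "$base/$f")"
        else
            printf '%s = (not available)\n' "$f"
        fi
    done
}
```

Following the recommendations above, a hardened host would show 1 for icmp_echo_ignore_broadcasts and 0 for accept_redirects and accept_source_route.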
12.12 LINUX NAT
Linux traditionally implements only a limited form of Network Address Translation
(NAT) that is more properly called Port Address Translation, or PAT. Instead of using
a range of IP addresses as a true NAT implementation would, PAT multiplexes all
connections onto a single address. To add to the confusion, many Linux documents
refer to the feature as neither NAT nor PAT but as "IP masquerading." The details and
differences aren't of much practical importance, so we refer to the Linux
implementation as NAT for the sake of consistency.
iptables implements not only NAT but also packet filtering. In earlier versions of
Linux this was a bit of a mess, but iptables makes a much cleaner separation between
the NAT and filtering features.
Table 12.15 Default security-related network behaviors in Linux
Feature          Host     Gateway  Control file (in /proc/sys/net)
IP forwarding    off      on       ipv4/ip_forward for the whole system
                                   ipv4/conf/interface/forwarding per interface (a)
ICMP redirects   obeys    ignores  ipv4/conf/interface/accept_redirects
Source routing   ignores  obeys    ipv4/conf/interface/accept_source_route
Broadcast ping   answers  answers  ipv4/icmp_echo_ignore_broadcasts
a. The interface can be either a specific interface name or all.
Packet filtering features are covered in more detail in the Security chapter starting
on page 701. If you use NAT to let local hosts access the Internet, you must use a full
complement of firewall filters when running NAT. The fact that NAT "isn't really IP
routing" doesn't make a Linux NAT gateway any more secure than a Linux router.
For brevity, we describe only the actual NAT configuration here; however, this is
only a small part of a full configuration.
To make NAT work, you must enable IP forwarding in the kernel by setting the
/proc/sys/net/ipv4/ip_forward kernel variable to 1. Additionally, you must insert
the appropriate kernel modules:
$ sudo /sbin/modprobe iptable_nat
$ sudo /sbin/modprobe ip_conntrack
$ sudo /sbin/modprobe ip_conntrack_ftp
The iptables command to route packets using NAT is of the form
$ sudo iptables -t nat -A POSTROUTING -o eth0 -j SNAT --to 63.173.189.1
In this example, eth0 is the interface connected to the Internet, and its IP address is
the one that appears as the argument to --to. The -o option names the interface on
which the rewritten packets leave the machine, which is why it refers to eth0. The eth1
interface is the one connected to the internal network.
To Internet hosts, it appears that all packets from hosts on the internal network have
eth0's IP address. The host performing NAT receives incoming packets, looks up
their true destinations, rewrites them with the appropriate internal network IP
address, and sends them on their merry way.
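The NAT steps described in this section can be collected into one place. The sketch below only prints the commands so that they can be reviewed before anything is changed; the function name print_nat_setup is hypothetical, and it assumes that the first argument is the Internet-facing interface and the second is that interface's public address. As noted above, a real gateway also needs a full set of firewall filters, which this sketch omits.

```shell
# Hypothetical dry-run helper: print the commands that enable NAT.
# Nothing is executed; review the output, then run it as root.
print_nat_setup() {
    ext_if=$1    # interface connected to the Internet
    ext_ip=$2    # public IP address of that interface
    cat <<EOF
echo 1 > /proc/sys/net/ipv4/ip_forward
/sbin/modprobe iptable_nat
/sbin/modprobe ip_conntrack
/sbin/modprobe ip_conntrack_ftp
iptables -t nat -A POSTROUTING -o $ext_if -j SNAT --to $ext_ip
EOF
}
```

For example, `print_nat_setup eth0 63.173.189.1 | sudo sh` would apply the settings for the addresses used in this section, assuming the printed commands look right for your site.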
12.13 PPP: THE POINT-TO-POINT PROTOCOL
PPP, the Point-to-Point Protocol, has the distinction of being used on both the
slowest and fastest Internet links. In its synchronous form, it is the encapsulation
protocol used on high-speed circuits that have fat routers at either end. In its
asynchronous form, it is a serial line encapsulation protocol that specifies how IP
packets must be encoded for transmission on a slow (and often unreliable) serial line.
Serial lines simply transmit streams of bits and have no concept of the beginning or
end of a packet. The PPP device driver takes care of encoding and decoding packets on
the serial line; it adds a link-level header and markers that separate packets.
PPP is sometimes used with the newer home technologies such as DSL and cable
modems, but this fact is usually hidden from you as an administrator. Encapsulation
is typically performed by the interface device, and the traffic is bridged to Ethernet.
You just see an Ethernet connection.
Designed by committee, PPP is the "everything and the kitchen sink" encapsulation
protocol. It was inspired by the SLIP (Serial Line IP) and CSLIP (Compressed SLIP)
protocols designed by Rick Adams and Van Jacobson, respectively. PPP differs from
these systems in that it allows the transmission of multiple protocols over a single
link. It is specified in RFC1331.
Addressing PPP performance issues
PPP provides all the functionality of Ethernet, but at much slower speeds. Normal
office LANs operate at 100 Mb/s or 1 Gb/s (that's 100,000 to 1,000,000 Kb/s). A
dial-up connection operates at about 28-56 Kb/s. To put these numbers in perspective, it
takes about 5 minutes to transfer a one-megabyte file across a dial-up PPP line. The
speed is OK for email or web browsing with images turned off, but glitzy web sites
will drive you crazy. To improve interactive performance, you can set the MTU of
the point-to-point link quite low. It usually defaults to 512 bytes; try 128 if you are
doing a lot of interactive work. If you are using PPP over Ethernet, use tcpdump to
see the sizes of the packets going over the network and set the MTU accordingly.
Ethernet's MTU is 1500, but the PPP encapsulation makes slightly smaller values
more efficient. For example, pppoe suggests 1412 bytes for hosts behind the PPP
connection and 1492 on the PPP link. You certainly don't want each packet to be
fragmented because you've set your default MTU too big.
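The transfer-time estimate above is easy to reproduce with shell arithmetic. This is a back-of-the-envelope sketch: the function name transfer_seconds is made up, a megabyte is taken to be 1,000,000 bytes, a kilobit is taken to be 1,000 bits, and protocol overhead and compression are ignored.

```shell
# Hypothetical helper: whole seconds needed to move a file over a link.
#   $1 = file size in bytes, $2 = link speed in bits per second
transfer_seconds() {
    bytes=$1
    bits_per_sec=$2
    echo $(( bytes * 8 / bits_per_sec ))   # integer division, rounds down
}

# One megabyte over a 28 Kb/s dial-up line:
#   transfer_seconds 1000000 28000    # prints 285 (just under 5 minutes)
# The same file at 56 Kb/s:
#   transfer_seconds 1000000 56000    # prints 142
```

At 100 Mb/s the same calculation yields 0 because the transfer takes less than a second, which is the contrast the paragraph above is drawing.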
See Chapter 16 for more information about NFS.
Running NFS over a PPP link can be painfully slow. You should consider it only if you
can run NFS over TCP instead of UDP.
The X Window System protocol uses TCP, so it's possible to run X applications over
a PPP link. Programs like xterm work fine, but avoid applications that use fancy
fonts or bitmapped graphics.
Connecting to a network with PPP
To connect a host to a network with PPP, you must satisfy three prerequisites:
· Your host's kernel must be able to send IP packets across a serial line as
  specified by the PPP protocol standard.
· You must have a user-level program that allows you to establish and
  maintain PPP connections.
· A host on the other end of the serial line must understand the protocol you
  are using.
Making your host speak PPP
See page 299 for more information about ifconfig.
For a PPP connection to be established, the host must be capable of sending and
receiving PPP packets. On Linux systems, PPP is a loadable kernel module that
places network packets in the serial device output queue, and vice versa. This
module usually pretends to be just another network interface, so it can be manipulated
with standard configuration tools such as ifconfig.
Controlling PPP links
The exact sequence of events involved in establishing a PPP connection depends on
your OS and on the type of server you are dialing in to. Connections can be initiated
either manually or dynamically.
To establish a PPP connection manually, you run a command that dials a modem,
logs in to a remote host, and starts the remote PPP protocol engine. If this procedure
succeeds, the serial port is then configured as a network interface. This option
normally leaves the link up for a long time, which makes it best suited for a phone line
dedicated to IP connectivity.
In a dynamic configuration, a daemon watches your serial "network" interfaces to
see when traffic has been queued for them. When someone tries to send a packet,
the daemon automatically dials a modem to establish the connection, transmits the
packet, and if the line goes back to being idle, disconnects the line after a reasonable
amount of time. Dynamic dial-up is often used if a phone line carries both voice and
data traffic or if the connection involves long distance or connect-time charges.
Software to implement both of these connection schemes is included with most
versions of PPP.
Assigning an address
See page 298 for more information about assigning IP addresses.
Just as you must assign an IP address to a new host on your Ethernet, you need to
assign an IP address to each PPP interface. There are a number of ways to assign
addresses to these links (including assigning no addresses at all). We discuss only
the simplest method here.
Think of a PPP link as a network of its own. That is, a network of exactly two hosts,
often called a "point to point" network. You need to assign a network number to the
link just as you would assign a network number to a new Ethernet segment, using
whatever rules are in effect at your site. You can pick any two host addresses on that
network and assign one to each end of the link. Follow other local customs, such as
subnetting standards, as well. Each host then becomes a "gateway" to the
point-to-point network as far as the rest of the world is concerned. (In the real world, you
usually do not control both ends of the link; your ISP gives you the IP address you
must use at your end.)
DHCP can also assign the IP address at the end of a PPP link. Some ISPs offer home
service that uses DHCP and business service that is more expensive but includes a set
of static addresses.
Routing
See Chapter 13 for more information about routing.
Since PPP requires the remote server to act as an IP router, you need to be as
concerned with IP routing as you would be for a "real" gateway, such as a machine that
connects two Ethernets. The purpose of routing is to direct packets through
gateways so that they can reach their ultimate destinations. Routing can be configured in
several different ways.
A run-of-the-mill PPP client host should have a default route that forwards packets
to the PPP server. Likewise, the server needs to be known to the other hosts on its
network as the gateway to the leaf machine.
Most PPP packages handle these routing chores automatically.
Ensuring security
See Chapter 20 for more information about security.
Security issues arise whenever you add a host to a network. Since a host connected
via PPP is a bona fide member of the network, you need to treat it as such: verify that
the system has no accounts without passwords or with insecure passwords, that all
appropriate vendor security fixes have been installed, and so on. See the Security
issues section on page 316 for some specifics on network security. PPP on Linux
supports two authentication protocols: PAP, the Password Authentication Protocol, and
CHAP, the Challenge Handshake Authentication Protocol.
Using chat scripts
The Linux serial line PPP implementation uses a "chat script" to talk to the modem
and also to log in to the remote machine and start up a PPP server. A chat script
consists of a sequence of strings to send and strings to expect in return, with a limited
form of conditional statement that can express concepts such as "expect the string
'Login', but if you don't get it, send a carriage return and wait for it again."
The idea of a chat script originated with the UUCP store-and-forward system of days
gone by. In the 1980s, machines would call each other up in the middle of the night,
log in through chat scripts, and exchange files. Despite popular demand, UUCP is not
quite completely dead yet: the uucp group owns the serial device files on
SUSE, and you must be a member of the uucp group to use a dial-out modem for PPP.
Most PPP implementations come with sample chat scripts that you can adapt to your
own environment. You need to edit the scripts to set parameters such as the
telephone number to call and the command to run after a successful login. Most chat
scripts contain a cleartext password; set the permissions accordingly.
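Restricting a chat script so that only its owner can read it might look like the sketch below. The function name lock_down_chat_script is hypothetical, and the sketch assumes that the file is already owned by the account that runs the dialer (typically root); if other users must start the connection, a group-readable mode may be more appropriate.

```shell
# Hypothetical helper: make a chat script readable and writable by its
# owner only, then print the resulting permission string as a check.
lock_down_chat_script() {
    script=$1
    chmod 600 "$script" || return 1
    # The first ten characters of ls -l output are the file type and mode bits.
    ls -l "$script" | cut -c1-10
}
```

After a successful run the function prints -rw-------, confirming that the cleartext password is no longer visible to other users on the system.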
Configuring Linux PPP
Modems (along with printers) have always been a thorn in the side of system
administrators. And it's no wonder, when the software to configure a PPP connection over a
random modem has over 125 possible options, far too many to weigh and configure
carefully.
All our distributions except Debian include Paul Mackerras's PPP package in the
default installation. It uses a daemon called pppd and keeps most of its
configuration files in /etc/ppp. Run the command pppd --version to see what version of the
PPP package has been installed on your particular distribution. Use apt-get install
ppp to install this package on Debian.
Our reference systems include a version of PPP from Roaring Penguin Software
that's designed for use over Ethernet (for example, on a DSL connection to a local
ISP). The reference systems also include PPP support for ISDN connections. The
configuration files for these additional media are co-located with those for PPP over
serial links in the directory /etc/ppp. Filenames are usually similar but with the
addition of oe for "over Ethernet" or i for ISDN. Table 12.16 on the next page shows the
locations of the relevant commands and config files.
See page 853 for more information about the names of serial ports.
In our configuration file examples, /dev/modem is our name for the serial port that
has a modem attached to it. Some distributions actually have a /dev/modem file
that is a link to one of the system's serial ports (usually /dev/ttyS0 or /dev/ttyS1),
but this practice is now deprecated. Substitute the device file appropriate for your
situation.
In addition to PPP software, each distribution includes the wvdial program to
actually dial the telephone and establish a connection.
We talked above about the modem ports and dialer software; now we talk about how
to set up pppd to use them. Global options are set in the file /etc/ppp/options, and
options for particular connections can be stored in the directories /etc/ppp/peers
and /etc/chatscripts (on Debian and Ubuntu). Red Hat, Fedora, and SUSE tend to
put chat scripts in the /etc/ppp directory with names like chat.remotehost.
Alternatively, on Red Hat, the file /etc/sysconfig/network-scripts/ifcfg-ttyname can
include connection-specific options for a particular PPP interface.
Table 12.16 PPP-related commands and config files by system
System          Commands or config files       Description
All             /usr/sbin/pppd                 PPP daemon program
                /usr/sbin/chat                 Talks to modem
                /usr/sbin/pppstats             Shows statistics of PPP link
                /usr/sbin/pppdump              Makes PPP packets readable ASCII
                /etc/ppp/options               Config file for pppd
Debian, Ubuntu  /usr/bin/pon                   Starts up a PPP connection
                /usr/bin/poff                  Shuts down a PPP connection
                /usr/bin/plog                  Shows the tail end of ppp.log
                /usr/sbin/pppconfig            Configures pppd
                /etc/ppp/peers/provider        Options for pon to contact your ISP
                /etc/chatscripts/provider      Chat script for pon to talk to the ISP
Red Hat (DSL)   /usr/sbin/pppoe                PPP-over-Ethernet client
                /usr/sbin/pppoe-server         PPP-over-Ethernet server
                /usr/sbin/pppoe-sniff          Sniffer that debugs provider's quirks
                /usr/sbin/adsl-connect         Script that manages link
                /usr/sbin/adsl-setup           Script that configures pppoe
                /usr/sbin/adsl-start           Script that brings up pppoe link
                /usr/sbin/adsl-stop            Script that shuts down pppoe link
                /usr/sbin/adsl-status          Shows the status of pppoe link
                /etc/ppp/pppoe.conf            Config file used by adsl-*
                /etc/ppp/pppoe-server-options  File for extra options to server
SUSE (DSL)      /usr/sbin/pppoed               PPP over Ethernet client
                /etc/pppoed.conf               Config file for pppoed
All (ISDN)      /usr/sbin/ipppd                PPP over ISDN daemon
                /usr/sbin/ipppstats            Shows ISDN PPP statistics
                /etc/ppp/ioptions              Options to ipppd
By default, pppd consults the options file fiist, then the usei's peisonal ~/.ppprc
staitup file, then the connection-specific options.ttyname file (if one exists), and fi-
nally, its command-line aiguments.
A handy trick suggested by Jonathan Corbet, a Linux old-timer, is to define more
than one PPP interface: one for home, one for hotels while traveling, etc. This setup
can make it easier to switch contexts.
wvdial is smarter than chat and has sensible default behavior if parameters are left
unspecified. wvdial gets its configuration information from /etc/wvdial.conf: modem
details, login name, password, telephone number, etc. You can provide information
for multiple destinations in the single configuration file. Use the wvdialconf
program to figure out your modem's characteristics and create an initial wvdial.conf
file for it.
The configuration files below are drawn from several different PPP setups. The first
file, /etc/ppp/options, sets global options for pppd. The active options for each
distribution as shipped are shown below:
Red Hat and Fedora /etc/ppp/options:
lock
SUSE /etc/ppp/options:
noipdefault
noauth
crtscts
lock
modem
asyncmap 0
nodetach
lcp-echo-interval 30
lcp-echo-failure 4
lcp-max-configure 60
lcp-restart 2
idle 600
noipx
file /etc/ppp/filters
Debian and Ubuntu /etc/ppp/options:
asyncmap 0
auth
crtscts
lock
hide-password
modem
proxyarp
lcp-echo-interval 30
lcp-echo-failure 4
noipx
We like to use the following options file:
# Global PPP options
lock                  # Always lock the device you're using
asyncmap 0x00000000   # By default, don't escape anything
crtscts               # Use hardware flow control
defaultroute          # Add default route thru the ppp interface
mru 552               # MRU/MTU 512 (data) + 40 (header)
mtu 552
The following /etc/sysconfig/network-scripts/ifcfg-ppp0 file comes from a Red
Hat system. This skeletal file was constructed by the linuxconf utility.
PERSIST=yes
DEFROUTE=yes
ONBOOT=no
INITSTRING=ATZ
MODEMPORT=/dev/modem
LINESPEED=115200
ESCAPECHARS=no
DEFABORT=yes
HARDFLOWCTL=yes
DEVICE=ppp0
PPPOPTIONS=
DEBUG=yes
PAPNAME=remote
REMIP=
IPADDR=
BOOTPROTO=none
MTU=
MRU=
DISCONNECTTIMEOUT=
RETRYTIMEOUT=
USERCTL=no
Here is a sample chat script (chat-ppp0) that corresponds to the ifcfg-ppp0 file
above (with all of its terse and slightly bizarre syntax):
'ABORT' 'BUSY'
'ABORT' 'ERROR'
'ABORT' 'NO CARRIER'
'ABORT' 'NO DIALTONE'
'ABORT' 'Invalid Login'
'ABORT' 'Login incorrect'
'' 'ATZ'
'OK' 'ATDT phonenumber'
'CONNECT' ''
'TIMEOUT' '120'
'ogin:' 'account'
'ord:' 'password'
'TIMEOUT' '5'
'~--' ''
Several lines in this chat script contain a null parameter, indicated by a pair of
single quotes ('').
You can usually adapt an existing chat script to your environment without worrying
too much about exactly how it works. Here, the first few lines set up some general
conditions on which the script should abort. The next lines initialize the modem and
dial the phone, and the remaining lines wait for a connection and enter the
appropriate username and password.
The timeout in the chat script sometimes needs to be adjusted to deal with
complicated dialing situations such as those in hotels or businesses with local telephone
switches, or to deal with the voice mail signal that some phone companies use before
they give you a real dial tone. On most modems, a comma in the phone number
indicates a pause in dialing. You may need several commas if you have to dial a
particular digit and then wait for a second dial tone before continuing.
PPP logins at our site are just usernames with a P in front of them. This convention
makes it easy to remember to whom a particular PPP machine belongs.
The association between ifcfg-ppp0 and chat-ppp0 is made by the ifup command,
which runs automatically during startup since the ifcfg file exists. You can also call
pppd explicitly with a connection-specific options file as an argument, provided
that file contains a connect line that lists the corresponding chat filename.
Our next dial-up example is from a Debian system. It uses the peers directory, puts
its chat script in the /etc/chatscripts directory, and uses the PAP authentication
mechanism instead of storing the password in the chat script. First, the options for
this connection, /etc/ppp/peers/my-isp:
/dev/modem          ### fill in the serial port of your modem
debug
crtscts
name username       ### username at my-isp
remotename my-isp
noauth
noipdefault
defaultroute
connect '/usr/sbin/chat -v -f /etc/chatscripts/my-isp'
/etc/chatscripts/my-isp contains the following entries:
'ABORT' 'BUSY'
'ABORT' 'ERROR'
'ABORT' 'NO CARRIER'
'ABORT' 'NO DIALTONE'
'' 'ATZ'
'OK' 'ATDT phonenumber'
'CONNECT' ''
'TIMEOUT' 15
'~--' ''
The authentication file used to connect to the ISP, /etc/ppp/pap-secrets, needs to
contain the line:
login-name my-isp password
where my-isp is the value of the remotename variable in the options above. To
bring up the connection in this scenario, use the command pppd call my-isp.
Here is an example that uses PPP over existing generic Internet connectivity but
teams up with ssh to create a secure connection through a virtual private network
(VPN). We show both the server and client configurations.
The server's /etc/ppp/options file:
noauth
logfile pppd.log
passive
silent
nodetach
Each connection also has an /etc/ppp/options.ttyname file that contains the IP
address assignments for the connection:
local-IP-address:remote-IP-address
The PPP user's shell is set to /usr/sbin/pppd on the server so that the server daemon
is started automatically. All the authentication keys have to be set up in advance with
ssh-agent so that no password is requested. On the client side, the configuration is
done in the /etc/ppp/peers directory with a file named for the server; let's call the
configuration "my-work". The client's /etc/ppp/peers/my-work file would contain
noauth
debug
logfile pppd.log
passive
silent
pty "ssh -t user@remotehost"
To log in to work from home on a secure PPP connection, the user would just type
pppd call my-work.
Finally, we include an example that uses the wvdial command and its easy
configuration to avoid all the chat script magic that seems to be necessary:
/etc/wvdial.conf:
[Dialer Defaults]
Phone = phonenumber
Username = loginname
Password = password
Modem = /dev/ttyS1
[Dialer creditcard]
Phone = longdistanceaccesscode,,,phonenumber,,ccnumber
If wvdial is invoked with no arguments, it uses the dialer defaults section of the
/etc/wvdial.conf file or your ~/.wvdialrc to make the call and start up PPP. If called
with a parameter (e.g., wvdial creditcard) it uses the appropriate section of the
config file to override any parameters specified in the defaults section.
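The override behavior can be pictured with a short sketch. It assumes that Python's
configparser is an adequate stand-in for wvdial's own parser, which is purely an
illustrative choice; the function name and all phone numbers are made up. Note that
configparser lowercases option names by default.

```python
import configparser

# Made-up sample in the layout of /etc/wvdial.conf.
SAMPLE = """
[Dialer Defaults]
Phone = 5551212
Username = loginname
Modem = /dev/ttyS1

[Dialer creditcard]
Phone = 18005550100,,,5551212
"""


def effective_settings(text, section=None):
    """Start from the defaults, then let the named section override them."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    settings = dict(parser["Dialer Defaults"])
    if section is not None:
        settings.update(parser["Dialer " + section])
    return settings


print(effective_settings(SAMPLE))                # like running: wvdial
print(effective_settings(SAMPLE, "creditcard"))  # like running: wvdial creditcard
```

Only Phone changes in the second call; Username and Modem are still inherited from
the defaults section.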
To take a PPP connection down, you're better off using ifdown than just killing the
pppd daemon. If you kill pppd directly, Linux will notice and restart it on you.
$ sudo ifdown ppp0
If your machine is portable and sometimes uses Ethernet instead of PPP, there may
be a default route through the Ethernet interface before pppd starts up.
Unfortunately, pppd is too polite to rip out that route and install its own, which is the
behavior you'd actually want. To fix the problem, simply run ifdown on the appropriate
interface to remove the route.
Here's what the PPP interface configuration and routing table look like after the PPP
connection has been brought up:
$ ifconfig ppp0
ppp0      Link encap:Point-to-Point Protocol
          inet addr:10.0.0.56 P-t-P:10.0.0.55 Mask:255.255.255.255
          UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
          RX packets:125 errors:0 dropped:0 overruns:0 frame:0
          TX packets:214 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:3
          RX bytes:11446 (11.1 Kb) TX bytes:105586 (103.1 Kb)
$ netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask           Flags  MSS  Window  irtt  Iface
10.0.0.55       0.0.0.0         255.255.255.255   UH     40   0       0     ppp0
0.0.0.0         10.0.0.55       0.0.0.0           UG     40   0       0     ppp0
You can obtain statistics about the PPP connection and the packets it has transferred
with the pppstats command:
$ pppstats
     IN   PACK  VJCOMP  VJUNC  VJERR  |     OUT   PACK  VJCOMP  VJUNC  NON-VJ
  11862    133       8     96      0  |  110446    226      27     89     110
The VJCOMP column counts packets that use Van Jacobson's TCP header
compression, and the VJUNC column counts those that don't. See RFC1144 for details.
Debugging a PPP connection can be a real pain because so many players are involved.
pppd submits log entries to the daemon facility through syslog on Red Hat and
Debian systems and to facility local2 on SUSE. You can increase the logging level by
using the debug flag on pppd's command line or by requesting more logging in the
options file. pppd also provides detailed exit codes on failure, so if you try to run
pppd and it balks, run echo $status (or echo $? in sh-style shells such as bash) before
you do anything else to recover the exit code, and then look up this value in the pppd
man page.
SUSE tends to include sample configuration files for each subsystem; the files are
mostly comments that explain the format and the meaning of available options. The
files in SUSE's /etc/ppp directory are no exception; they are well documented and
contain sensible suggested values for many parameters.
Debian also has well-documented sample configuration files for PPP. It has a
subdirectory, /etc/chatscripts, devoted to chat scripts. To bring up an interface with PPP,
you can include it in the /etc/network/interfaces file with the ppp method and the
provider option to tie the name of your provider (in our case, my-isp) to a filename
in the /etc/ppp/peers directory (/etc/ppp/peers/my-isp). For example:
iface eth0 inet ppp
    provider my-isp
In this case, the Debian-specific commands pon and poff manage the connection.
12.14 LINUX NETWORKING QUIRKS
Unlike most kernels, Linux pays attention to the type-of-service (TOS) bits in IP
packets and gives faster service to packets that are labeled as being interactive (low
latency). Cool! Unfortunately, brain damage on the part of Microsoft necessitates that
you turn off this perfectly reasonable behavior.
All packets originating on older Windows systems are labeled as being interactive,
no matter what their purpose. UNIX systems, on the other hand, usually do not
mark any packets as being interactive. If your Linux gateway serves a mixed network
of UNIX and Windows systems, the Windows packets will consistently get
preferential treatment. If you work in an environment with some older technologies, the
performance hit for UNIX can be quite noticeable.
You can turn off TOS-based packet sorting when you compile the Linux kernel. Just
say no to the option "IP: use TOS value as routing key."
When IP masquerading (NAT) is enabled, it tells the kernel to reassemble packet
fragments into a complete packet before forwarding them, even if the kernel must
immediately refragment the packet to send it on its way. This reassembly can cost
quite a few CPU cycles, but CPUs are fast enough now that it shouldn't really be an
issue on modern machines.
Linux lets you change the MAC-level addresses of certain types of network interfaces:
redhat$ ifconfig eth1
eth1      Link encap:Ethernet HWaddr 00:02:B3:19:C8:87
          BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:7 Base address:0xee80
redhat$ sudo ifconfig eth1 hw ether 00:02:B3:19:C8:21
redhat$ ifconfig eth1
eth1      Link encap:Ethernet HWaddr 00:02:B3:19:C8:21
          BROADCAST MULTICAST MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          Interrupt:7 Base address:0xee80
This is a dangerous feature that tends to break things. It can be handy, but use it
only as a last resort.
12.15 RECOMMENDED READING
STEVENS, W. RICHARD. TCP/IP Illustrated, Volume One: The Protocols. Reading, MA:
Addison-Wesley, 1994.
WRIGHT, GARY R., AND W. RICHARD STEVENS. TCP/IP Illustrated, Volume Two: The
Implementation. Reading, MA: Addison-Wesley, 1995.
These two books are an excellent and thorough guide to the TCP/IP protocol stack.
A bit dated, but still solid.
STEVENS, W. RICHARD. UNIX Network Programming. Upper Saddle River, NJ:
Prentice Hall, 1990.
STEVENS, W. RICHARD, BILL FENNER, AND ANDREW M. RUDOFF. UNIX Network
Programming, Volume 1, The Sockets Networking API (3rd Edition). Upper Saddle River,
NJ: Prentice Hall PTR, 2003.
STEVENS, W. RICHARD. UNIX Network Programming, Volume 2: Interprocess
Communications (2nd Edition). Upper Saddle River, NJ: Prentice Hall PTR, 1999.
These books are the student's bibles in networking classes that involve programming.
If you need only the Berkeley sockets interface, the original edition is a fine reference.
If you need the STREAMS interface too, then the third edition, which includes IPv6,
is a good bet. All three are clearly written in typical Rich Stevens style.
TANENBAUM, ANDREW. Computer Networks (4th Edition). Upper Saddle River, NJ:
Prentice Hall PTR, 2003.
This was the first networking text, and it is still a classic. It contains a thorough
description of all the nitty-gritty details going on at the physical and link layers of the
protocol stack. The latest edition includes coverage on wireless networks, gigabit
Ethernet, peer-to-peer networks, voice over IP, and more.
SALUS, PETER H. Casting the Net, From ARPANET to INTERNET and Beyond.
Reading, MA: Addison-Wesley Professional, 1995.
This is a lovely history of the ARPANET as it grew into the Internet, written by a
historian who has been hanging out with UNIX people long enough to sound like one
of them!
COMER, DOUGLAS. Internetworking with TCP/IP Volume 1: Principles, Protocols, and
Architectures (5th Edition). Upper Saddle River, NJ: Pearson Prentice Hall, 2006.
Doug Comer's Internetworking with TCP/IP series was for a long time the standard
reference for the TCP/IP protocols. The books are designed as undergraduate
textbooks and are a good introductory source of background material.
HEDRICK, CHARLES. "Introduction to the Internet Protocols." Rutgers University,
1987.
This document is a gentle introduction to TCP/IP. It does not seem to have a
permanent home, but it is widely distributed on the web; search for it.
HUNT, CRAIG. TCP/IP Network Administration (3rd Edition). Sebastopol, CA:
O'Reilly Media, 2002.
Like other books in the nutshell series, this book is directed at administrators of
UNIX systems. Half the book is about TCP/IP, and the rest deals with higher-level
UNIX facilities such as email and remote login.
An excellent collection of documents about the history of the Internet and its
various technologies can be found at www.isoc.org/internet/history.
12.16 EXERCISES
E12.1 How could listening to (i.e., obeying) ICMP redirects allow an unauthorized
user to compromise the network?
E12.2 What is the MTU of a network link? What happens if the MTU for a
given link is set too high? Too low?
E12.3 Explain the concept of subnetting and explain why it is useful. What are
netmasks? How do netmasks relate to the split between the network and
host sections of an IP address?
E12.4 The network 134.122.0.0/16 has been subdivided into /19 networks.
a) How many /19 networks are there? List them. What is their netmask?
b) How many hosts could there be on each network?
c) Determine which network the IP address 134.122.67.124 belongs to.
d) What is the broadcast address for each network?
E12.5 Host 128.138.2.4 on network 128.138.2.0/24 wants to send a packet to
host 128.138.129.12 on network 128.138.129.0/24. Assume the following:
· Host 128.138.2.4 has a default route through 128.138.2.1.
· Host 128.138.2.4 has just booted and has not sent or received any packets.
· All other machines on the network have been running for a long time.
· Router 128.138.2.1 has a direct link to 128.138.129.1, the gateway for
the 128.138.129.0/24 subnet.
a) List all the steps that are needed to send the packet. Show the source
and destination Ethernet and IP addresses of all packets transmitted.
b) If the network were 128.138.0.0/16, would your answer change? How
or why not?
c) If the 128.138.2.0 network were a /26 network instead of a /24, would
your answer change? How or why not?
E12.6 After installing a new Linux system, how would you address the security
issues mentioned in this chapter? Check to see if any of the security
problems have been dealt with on the Linux systems in your lab. (May
require root access.)
E12.7 What steps are needed to add a new machine to the network in your lab
environment? In answering, use parameters appropriate for your
network and local situation. Assume that the new machine already runs
Linux.
E12.8 Show the configuration file needed to set up a DHCP server that assigns
addresses in the range 128.138.192.[1-55]. Use a lease time of two hours
and make sure that the host with Ethernet address 00:10:5A:C7:4B:89
always receives IP address 128.138.192.55.
Chapter 13: Routing
Keeping track of where network traffic should flow next is no easy task. Chapter 12
briefly introduced IP packet forwarding. In this chapter, we examine the forwarding
process in more detail and investigate several network protocols that allow routers
to automatically discover efficient routes. Routing protocols not only lessen the
day-to-day administrative burden of maintaining routing information, but they also
allow network traffic to be redirected quickly if a router or network should fail.
It's important to distinguish between the process of actually forwarding IP packets
and the management of the routing table that drives this process, both of which are
commonly called "routing." Packet forwarding is simple, whereas route computation
is tricky; consequently, the second meaning is used more often in practice. This
chapter describes only unicast routing; multicast routing involves an array of very
different problems and is beyond the scope of this book.
For the vast majority of cases, the information covered in Chapter 12, TCP/IP
Networking, is all that you need to know about routing. If the appropriate network
infrastructure is already in place, you can set up a single static route (as described in
the Routing section starting on page 293) and voila, you have enough information to
reach just about anywhere on the Internet. If you must survive within a complex
network topology or if you are using a Linux system for part of the network
infrastructure, then this chapter's information about dynamic routing protocols and
tools can come in handy.
Conventional wisdom says that IP routing is exceptionally difficult, understood only
by a few long-haired hippies living in the steam tunnels under the Lawrence Berkeley
Laboratories campus in California. In reality, this is not the case, as long as you
understand the basic premise that IP routing is "next hop" routing. At any given point,
you only need to determine the next host or router in a packet's journey to its final
destination. This is a different approach from that of many legacy protocols that
determine the exact path a packet will travel before it leaves its originating host, a
scheme known as source routing.[1]
13.1 PACKET FORWARDING: A CLOSER LOOK
Before we jump into the management of routing tables, let's take a more detailed look
at how the tables are used. Consider the network shown in Exhibit A.
Exhibit A Example network
[Figure: host A (199.165.145.17) and router R1 (199.165.145.24) are attached to the
199.165.145 network. R1 (199.165.146.1), host B (199.165.146.4), and router R2
(199.165.146.3) are attached to the 199.165.146 network. R2 also connects to the
Internet through its interface 216.12.111.80.]
Router R1 connects the two networks, and router R2 connects one of the nets to the
outside world. (For now, we assume that R1 and R2 are Linux computers rather
than dedicated routers.) Let's look at some routing tables and some specific packet
forwarding scenarios. First, host A's routing table:
A$ netstat -rn
Kernel IP routing table
Destination     Gateway          Genmask         Flags  MSS  Window  irtt  Iface
199.165.145.0   0.0.0.0          255.255.255.0   U      0    0       0     eth0
127.0.0.0       0.0.0.0          255.0.0.0       U      0    0       0     lo
0.0.0.0         199.165.145.24   0.0.0.0         UG     0    0       0     eth0
See page 299 for more information about ifconfig.
Host A has the simplest routing configuration of the four machines. The first two
routes describe the machine's own network interfaces in standard routing terms.
These entries exist so that forwarding to directly connected networks need not be
handled as a special case. eth0 is host A's Ethernet interface, and lo is the loopback
interface, a virtual network interface emulated in software. Entries such as these are
normally added automatically by ifconfig when a network interface is configured.
[1] IP packets can also be source-routed, but this is almost never done. The feature is
not widely supported because of security considerations.
The default route on host A forwards all packets not addressed to the loopback
address or to the 199.165.145 network to the router R1, whose address on this network
is 199.165.145.24. The G flag indicates that this route goes to a gateway, not to one of
A's local interfaces. Gateways must be only one hop away.
See page 279 for more information about addressing.
Suppose a process on A sends a packet to B, whose address is 199.165.146.4. The IP
implementation looks for a route to the target network, 199.165.146, but none of the
routes match. The default route is invoked and the packet is forwarded to R1. Exhibit
B shows the packet that actually goes out on the Ethernet (the addresses in the
Ethernet header are the MAC addresses of A's and R1's interfaces on the 145 net).
Exhibit B Ethernet packet
The Ethernet destination hardware address is that of router R1, but the IP packet
hidden within the Ethernet frame does not mention R1 at all. When R1 inspects the
packet it has received, it will see from the IP destination address that it is not the
ultimate destination of the packet. It then uses its own routing table to forward the
packet to host B without rewriting the IP header, so that it still shows the packet
coming from A.
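The lookup that host A performs can be sketched in a few lines. The routes are copied
from host A's netstat output; the function name and the most-specific-match selection
are our own simplified model of a forwarding decision, not a description of the kernel's
actual code.

```python
import ipaddress

# Host A's routes: (destination, genmask, gateway, interface).
ROUTES_A = [
    ("199.165.145.0", "255.255.255.0", "0.0.0.0", "eth0"),
    ("127.0.0.0", "255.0.0.0", "0.0.0.0", "lo"),
    ("0.0.0.0", "0.0.0.0", "199.165.145.24", "eth0"),
]


def next_hop(routes, destination):
    """Return (next-hop address, interface) for a destination IP address."""
    dest = int(ipaddress.ip_address(destination))
    best = None
    for net, mask, gateway, iface in routes:
        mask_int = int(ipaddress.ip_address(mask))
        if (dest & mask_int) == int(ipaddress.ip_address(net)):
            prefix_len = bin(mask_int).count("1")
            # Prefer the most specific (longest-mask) matching route.
            if best is None or prefix_len > best[0]:
                best = (prefix_len, gateway, iface)
    if best is None:
        raise LookupError("no route to " + destination)
    _, gateway, iface = best
    # A gateway of 0.0.0.0 means the destination is directly connected.
    return (destination if gateway == "0.0.0.0" else gateway, iface)


print(next_hop(ROUTES_A, "199.165.146.4"))   # host B: falls through to the default route
print(next_hop(ROUTES_A, "199.165.145.30"))  # a host on A's own network
```

Only the next hop is chosen here. The IP destination in the packet itself stays
199.165.146.4 even though the frame is handed to R1.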
Here's the routing table for host R1:
R1$ netstat -rn
Kernel IP routing table
Destination     Gateway          Genmask         Flags  MSS  Window  irtt  Iface
127.0.0.0       0.0.0.0          255.0.0.0       U      0    0       0     lo
199.165.145.0   0.0.0.0          255.255.255.0   U      0    0       0     eth0
199.165.146.0   0.0.0.0          255.255.255.0   U      0    0       0     eth1
0.0.0.0         199.165.146.3    0.0.0.0         UG     0    0       0     eth1
This table is similar to that of host A, except that it shows two physical network
interfaces. The default route in this case points to R2, since that's the gateway through
which the Internet can be reached. Packets bound for either of the 199.165 networks
can be delivered directly.
[Exhibit B (figure): an Ethernet frame whose Ethernet header reads From: A, To: R1,
Type: IP. The frame carries an IP packet whose IP header reads From: 199.165.145.17,
To: 199.165.146.4, Type: UDP. The IP packet in turn carries a UDP packet (UDP header
and data).]
Like host A, host B has only one real network interface. However, B needs an
additional route to function correctly because it has direct connections to two different
routers. Traffic for the 199.165.145 net must travel through R1, while other traffic
should go out to the Internet through R2.
B$ netstat -rn
Kernel IP routing table
Destination     Gateway          Genmask         Flags  MSS  Window  irtt  Iface
127.0.0.0       0.0.0.0          255.0.0.0       U      0    0       0     lo
199.165.145.0   199.165.146.1    255.255.255.0   U      0    0       0     eth0
199.165.146.0   0.0.0.0          255.255.255.0   U      0    0       0     eth0
0.0.0.0         199.165.146.3    0.0.0.0         UG     0    0       0     eth0
See page 295 for an explanation of ICMP redirects.
You can configure host B with initial knowledge of only one gateway, thus relying
on the help of ICMP redirects to eliminate extra hops. For example, here is one
possible initial configuration for host B:
B$ netstat -rn
Kernel IP routing table
Destination     Gateway          Genmask         Flags  MSS  Window  irtt  Iface
127.0.0.0       0.0.0.0          255.0.0.0       U      0    0       0     lo
199.165.146.0   0.0.0.0          255.255.255.0   U      0    0       0     eth0
0.0.0.0         199.165.146.3    0.0.0.0         UG     0    0       0     eth0
If B then sends a packet to host A (199.165.145.17), no route matches and the packet
is forwarded to R2 for delivery. R2 (which, being a router, presumably has complete
information about the network) sends the packet on to R1. Since R1 and B are on the
same network, R2 also sends an ICMP redirect notice to B, and B enters a host route
for A into its routing table:
199.165.145.17  199.165.146.1    255.255.255.255 UGHD   0    0       0     eth0
This route sends all future traffic for A directly through R1. However, it does not
affect routing for other hosts on A's network, all of which have to be routed by
separate redirects from R2.
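A small sketch shows why the redirect helps only host A. The route list mirrors host
B's minimal configuration (the loopback route is omitted for brevity); the lookup helper
and the "direct" marker are hypothetical, and the redirect is modeled simply as appending
a /32 host route.

```python
import ipaddress


def lookup(routes, destination):
    """Return the gateway of the most specific route that matches."""
    dest = ipaddress.ip_address(destination)
    matches = [(net, gw) for net, gw in routes if dest in net]
    _, gateway = max(matches, key=lambda m: m[0].prefixlen)
    return gateway


# Host B's initial table: its own network plus a default route through R2.
routes_b = [
    (ipaddress.ip_network("199.165.146.0/24"), "direct"),
    (ipaddress.ip_network("0.0.0.0/0"), "199.165.146.3"),
]

before = lookup(routes_b, "199.165.145.17")   # traffic for A goes via R2

# Model the ICMP redirect: a host route for A that points at R1.
routes_b.append((ipaddress.ip_network("199.165.145.17/32"), "199.165.146.1"))

after = lookup(routes_b, "199.165.145.17")     # now via R1
neighbor = lookup(routes_b, "199.165.145.18")  # another host on A's net: still via R2
print(before, after, neighbor)
```

Every additional host on the 199.165.145 network would need its own redirect and its
own table entry, which is the clutter the text warns about.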
Some sites have chosen ICMP redirects as their primary routing "protocol," thinking
that this approach is dynamic. Unfortunately, once the kernel learns a route from a
redirect, either the route must be manually deleted or the machine must be rebooted
if that information changes. Because of this problem and several other
disadvantages of redirects (increased network load, increased load on R2, routing table
clutter, dependence on extra servers), we don't recommend the use of redirects for
configurations such as this. In a properly configured network, redirects should never
appear in the routing table.
13.2 ROUTING DAEMONS AND ROUTING PROTOCOLS
In simple networks such as the one shown in Exhibit A, it is perfectly reasonable to
configure routing by hand. At some point, however, networks become too
complicated to be managed this way (possibly because of their growth rate). Instead of
having to explicitly tell every computer on every network how to reach every other
computer and network, it would be nice if the computers could just put their heads
together and figure it all out. This is the job of routing protocols and the daemons
that implement them.
Routing protocols have a major advantage over static routing systems in that they
can react and adapt to changing network conditions. If a link goes down, the routing
daemons can quickly discover and propagate alternative routes to the networks
that link served, if any such routes exist.
Routing daemons collect information from three sources: configuration files, the
existing routing tables, and routing daemons on other systems. This information is
merged to compute an optimal set of routes, and the new routes are then fed back
into the system routing table (and possibly fed to other systems through a routing
protocol). Because network conditions change over time, routing daemons must
periodically check in with one another for reassurance that their routing information is
still current.
The exact way that routes are computed depends on the routing protocol. Two types
of protocols are in common use: distance-vector protocols and link-state protocols.
Distance-vector protocols
Distance-vector (aka "gossipy") protocols are based on the general idea, "If router X
is five hops away from network Y, and I'm adjacent to router X, then I must be six
hops away from network Y." You announce how far you think you are from the
networks you know about. If your neighbors don't know of a better way to get to each
network, they mark you as being the best gateway. If they already know a shorter
route, they ignore your advertisement.[2] Over time, everyone's routing tables are
supposed to converge to a steady state.
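The core of this idea fits in one small function. The sketch below is a deliberately
naive model with invented names (process_advertisement, the table layout, the router
and network labels): it only ever accepts shorter routes, so it leaves out the
provisions for lengthening routes that real protocols need.

```python
def process_advertisement(table, neighbor, advertised, link_cost=1):
    """Update our routing table from one neighbor's advertisement.

    table:      dict mapping network -> (cost, gateway)
    advertised: dict mapping network -> cost as claimed by the neighbor
    Returns True if any route changed.
    """
    changed = False
    for network, cost in advertised.items():
        candidate = cost + link_cost          # "X is five hops away, so I am six"
        current = table.get(network)
        if current is None or candidate < current[0]:
            table[network] = (candidate, neighbor)
            changed = True
    return changed


table = {"net-Y": (7, "router-W")}
process_advertisement(table, "router-X", {"net-Y": 5, "net-Z": 2})
print(table)

# A worse offer from another neighbor is ignored.
print(process_advertisement(table, "router-V", {"net-Y": 9}))
```

After the first call, router-X becomes the gateway for both networks; the second
advertisement changes nothing.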
This is really a very elegant idea. If it worked as advertised, routing would be
relatively simple. Unfortunately, this type of algorithm does not deal well with changes
in topology. In some cases, infinite loops (e.g., router X receives information from
router Y and sends it on to router Z, which sends it back to router Y) can prevent
routes from converging at all. Real-world distance-vector protocols must avoid such
problems by introducing complex heuristics or by enforcing arbitrary restrictions
such as the RIP (Routing Information Protocol) notion that any network more than
15 hops away is unreachable.
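RIP's restriction can be pictured as a cap on the metric. The sketch assumes the
conventional RIP encoding in which a metric of 16 stands for "unreachable"; the
constant and helper names are hypothetical.

```python
RIP_INFINITY = 16  # assumption: metric value that RIP treats as "unreachable"


def rip_metric(advertised_hops, link_cost=1):
    """Metric we would record for a route learned from a neighbor."""
    return min(advertised_hops + link_cost, RIP_INFINITY)


def is_reachable(metric):
    return metric < RIP_INFINITY


print(rip_metric(3), is_reachable(rip_metric(3)))
print(rip_metric(15), is_reachable(rip_metric(15)))
```

Because the metric can never grow past the cap, a route that loops among routers
counts up to "unreachable" and is then dropped rather than circulating forever.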
Even in nonpathological cases, it can take many update cycles for all routers to reach
a steady state. Therefore, to guarantee that routing will not jam for an extended
period, the cycle time must be made short, and for this reason distance-vector
protocols as a class tend to be talkative. For example, RIP requires that routers broadcast
all their routing information every 30 seconds. IGRP and EIGRP send updates
every 90 seconds.
On the other hand, BGP, the Border Gateway Protocol, transmits the entire table once
and then transmits changes as they occur. This optimization substantially reduces
the potential for "chatty" (and mostly unnecessary) traffic.
Table 13.1 lists the distance-vector protocols that are in common use today.
[2] Actually, it is not quite this simple, since there are provisions for handling changes in
topology that may lengthen existing routes. Some DV protocols such as EIGRP maintain
information about multiple possible routes so that they always have a fallback plan. The
exact details are not important.
Link-state protocols
Link-state protocols distribute information in a relatively unprocessed form. The
records traded among routers are of the form "Router X is adjacent to router Y, and the
link is up." A complete set of such records forms a connectivity map of the network
from which each router can compute its own routing table. The primary advantage
that link-state protocols offer over distance-vector protocols is the ability to quickly
converge on an operational routing solution after a catastrophe occurs. The tradeoff
is that maintaining a complete "map" of the network at each node requires memory
and CPU power that would not be needed by a distance-vector routing system.
Because the communications among routers in a link-state protocol are not part of
the actual route-computation algorithm, they can be implemented in such a way that
transmission loops do not occur. Updates to the topology database propagate across
the network efficiently, at a lower cost in network bandwidth and CPU time.
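To make the "map first, routes second" idea concrete, the sketch below builds a graph
from link records and runs Dijkstra's shortest-path algorithm from one router. Dijkstra
is the usual textbook choice for this computation; the record format, router names, and
costs are invented for the example.

```python
import heapq


def shortest_paths(links, source):
    """links: iterable of (router_a, router_b, cost) records for links that are up.

    Returns (dist, first_hop): the cost to each router and the neighbor to use.
    """
    graph = {}
    for a, b, cost in links:
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    dist = {source: 0}
    first_hop = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, cost in graph.get(node, []):
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                first_hop[neighbor] = neighbor if node == source else first_hop[node]
                heapq.heappush(heap, (nd, neighbor))
    return dist, first_hop


links = [("R1", "R2", 1), ("R2", "R3", 1), ("R1", "R3", 5)]
dist, first_hop = shortest_paths(links, "R1")
print(dist, first_hop)

# If the R2-R3 link goes down, its record disappears and R1 simply recomputes.
dist_down, first_hop_down = shortest_paths([("R1", "R2", 1), ("R1", "R3", 5)], "R1")
print(dist_down, first_hop_down)
```

Each router runs the same computation on the same map, so no router depends on a
neighbor's conclusions, only on the raw link records.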
Link-state protocols tend to be more complicated than distance-vector protocols,
but this complexity can be explained in part by the fact that link-state protocols
make it easier to implement advanced features such as type-of-service routing and
multiple routes to the same destination. Neither of these features is supported on
stock Linux systems; you must use dedicated routers to benefit from them.
The common link-state protocols are shown in Table 13.2.
Table 13.1 Common distance-vector routing protocols
Proto   Long name                                                 Application
RIP     Routing Information Protocol                              Internal LANs
IGRP    Interior Gateway Routing Protocol (deprecated)            Small WANs
EIGRP   Enhanced Interior Gateway Routing Protocol                WANs, corporate LANs
BGP     Border Gateway Protocol                                   Internet backbone routing
Table 13.2 Common link-state routing protocols
Proto   Long name                                                 Application
OSPF    Open Shortest Path First                                  Internal LANs, small WANs
IS-IS   Intermediate System to Intermediate System                Lab experiments, insane asylums
Cost metrics
For a routing protocol to determine which path to a network is shortest, it has to
define what is meant by "shortest".[3] Is it the path involving the fewest hops? The
path with the lowest latency? The largest minimal intermediate bandwidth? The
lowest financial cost?
For routing, the quality of a link is represented by a number called the cost metric. A
path cost is the sum of the costs of each link in the path. In the simplest systems,
every link has a cost of 1, leading to hop counts as a path metric. But any of the
considerations mentioned above can be converted to a numeric cost metric.
Networking mavens have labored long and hard to make the definition of cost
metrics flexible, and some modern protocols even allow different metrics to be used for
different kinds of network traffic. Nevertheless, in 99% of cases, all this hard work
can be safely ignored. The default metrics for most systems work just fine.
You may encounter situations in which the actual shortest path to a destination may
not be a good default route for political reasons. To handle these cases, you can
artificially boost the cost of the critical links to make them seem less appealing. Leave the
rest of the routing configuration alone.
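A toy calculation illustrates both the path-cost sum and the effect of boosting one
link. The topology, node names, and cost values are all invented for this example.

```python
def path_cost(path, link_costs):
    """Sum the costs of the consecutive links along a path."""
    return sum(link_costs[(a, b)] for a, b in zip(path, path[1:]))


def best_path(paths, link_costs):
    return min(paths, key=lambda p: path_cost(p, link_costs))


link_costs = {("A", "B"): 1, ("B", "D"): 1, ("A", "C"): 1, ("C", "D"): 2}
paths = [["A", "B", "D"], ["A", "C", "D"]]

print(best_path(paths, link_costs))   # the path through B costs 2, through C costs 3

# Politically undesirable link: make A-B look expensive and touch nothing else.
link_costs[("A", "B")] = 10
print(best_path(paths, link_costs))   # the path through C now wins, 3 versus 11
```

Only one number changed, yet traffic shifts to the other path; the rest of the
configuration stays as it was.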
Interior and exterior protocols
An "autonomous system" is a group of networks under the administrative and political
control of a single entity. The definition is vague; real-world autonomous systems
can be as large as a worldwide corporate network or as small as a building or a single
academic department. It all depends on how you want to manage routing. The general
tendency is to make autonomous systems as large as possible. This convention
simplifies administration and makes routing as efficient as possible.
Routing within an autonomous system is somewhat different from routing between
autonomous systems. Protocols for routing among ASes ("exterior" protocols) must
often handle routes for many networks, and they must deal gracefully with the fact
that neighboring routers are under other people's control. Exterior protocols do not
reveal the topology inside an autonomous system, so in a sense they can be thought
of as a second level of routing hierarchy that deals with collections of nets rather than
individual hosts or cables.
In practice, small- and medium-sized sites rarely need to run an exterior protocol
unless they are connected to more than one ISP. With multiple ISPs, the easy division
of networks into local and Internet domains collapses, and routers must decide
which route to the Internet is best for any particular address. (However, that is not to
say that every router must know this information. Most hosts can stay stupid and
route their default packets through an internal gateway that is better informed.)
While exterior protocols are not so different from their interior counterparts, this
chapter concentrates on the interior protocols and the daemons that support them.
If your site must use an external protocol as well, see the recommended reading list
on page 348 for some suggested references.
3. Fortunately, it does not have to define what the meaning of "is" is.
13.3 PROTOCOLS ON PARADE
Several interior routing protocols are in common use. In this section, we introduce
the major players and summarize their main advantages and weaknesses.
RIP: Routing Information Protocol
RIP, defined in RFC1058, is an old Xerox protocol that has been adapted for IP
networks. It is the protocol used by routed. RIP is a simple distance-vector protocol
that uses hop counts as a cost metric. Because RIP was designed in an era when a single
computer cost hundreds of thousands of dollars and networks were relatively small,
RIP considers any host more than fifteen hops away to be unreachable. Therefore, large
local networks that have more than fifteen routers along any single path cannot use
the RIP protocol.
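To make the hop-count limit concrete, the following Python sketch shows one
distance-vector update step of the kind RIP performs. It assumes the RFC 1058
convention that a metric of 16 means "unreachable"; the merge_update function, the
table layout, and the router and network names are hypothetical and do not mirror
the internals of routed.

```python
RIP_INFINITY = 16  # RFC 1058 treats a metric of 16 as "unreachable"

def merge_update(table, neighbor, advertised):
    """Merge a neighbor's advertised routes into our table (one distance-vector step).

    table:      dict mapping destination -> (metric, next_hop)
    neighbor:   name of the router that sent the update
    advertised: dict mapping destination -> metric as seen by that neighbor
    """
    for dest, metric in advertised.items():
        # Going through the neighbor costs one more hop, capped at "infinity".
        new_metric = min(metric + 1, RIP_INFINITY)
        current = table.get(dest)
        if current is None:
            if new_metric < RIP_INFINITY:
                table[dest] = (new_metric, neighbor)
        elif current[1] == neighbor or new_metric < current[0]:
            # Always believe the current next hop; otherwise accept only better routes.
            table[dest] = (new_metric, neighbor)
    return table

table = {"net1": (1, "direct")}
merge_update(table, "r2", {"net2": 3, "net3": 15})
print(table)  # net2 is learned at metric 4; net3 would be 16 hops away, so it is ignored
```

The cap at 16 is what keeps the protocol simple, and it is also why very long paths
cannot be represented.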
Although RIP is a resource hog because of its overuse of broadcasting, it does a good
job when a network is changing often or when the topology of remote networks is
not known. However, it can be slow to stabilize after a link goes down.
Many sites use routed in its -q ("quiet") mode, in which it manages the routing table
and listens for routing updates on the network but does not broadcast any
information of its own. At these sites, the actual route computations are usually
performed with a more efficient protocol such as OSPF (described below). The computed
routes are converted to RIP updates for consumption by nonrouter machines. routed is
lightweight (in -q mode) and universally supported, so most machines can enjoy the
benefits of dynamic routing without any special configuration.
RIP is widely implemented on non-Linux platforms. A variety of common devices
from printers to SNMP-manageable network components can listen to RIP
advertisements to learn about possible gateways. In addition, routed is available for
all versions of UNIX and Linux, so RIP is a de facto lowest common denominator
routing protocol. Often, RIP is used for LAN routing, and a more featureful protocol is
used for wide-area connectivity.
RIP-2: Routing Information Protocol, version 2
See page 287 for information about classless addressing, aka CIDR.
RIP-2 is a mild revision of RIP that adds support for a few features that were missing
from the original protocol. The most important change is that RIP-2 distributes
netmasks along with next-hop addresses, so its support for subnetted networks and
CIDR is better than RIP's. A vague gesture towards increasing the security of RIP
was also included, but the definition of a specific authentication system has been left
for future development.
RIP-2 provides several features that seem targeted for multiprotocol environments.
"Next hop" updates allow broadcasters to advertise routes for which they are
not the actual gateway, and "route tags" allow externally discovered routes to be
propagated through RIP.
RIP-2 can be run in a compatibility mode that preserves most of the new features of
RIP-2 without entirely abandoning vanilla RIP receivers. In most respects, RIP-2 is
identical to RIP and should be preferred over RIP if it is supported by the systems
you are using. However, Linux distributions generally don't support it out of the box.
OSPF: Open Shortest Path First
OSPF is defined in RFC2328. It's a link-state protocol. "Shortest path first" refers to
the mathematical algorithm used to calculate routes; "open" is used in the sense of
"nonproprietary."
OSPF was the first link-state routing protocol to be broadly used, and it is still the
most popular. Its widespread adoption was spurred in large part by its support in
gated, a popular multiprotocol routing daemon of which we have more to say later.
Unfortunately, the protocol itself is complex and hence only worthwhile at sites of
significant size (where routing protocol behavior really makes a difference).
The OSPF protocol specification does not mandate any particular cost metric. Cisco's
implementation uses hop counts by default and can also be configured to use
network bandwidth as a cost metric.
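The "shortest path first" computation is generally identified as Dijkstra's
algorithm: each router takes the full map of links and costs and works out the
cheapest path to every destination. The Python sketch below shows the idea on a
small, made-up topology. The shortest_paths function and the router names are
hypothetical, and the sketch ignores everything else OSPF does (areas, link-state
advertisements, equal-cost paths).

```python
import heapq

def shortest_paths(graph, source):
    """Dijkstra's algorithm: lowest total cost from source to every reachable node.

    graph: dict mapping node -> dict of neighbor -> link cost (non-negative)
    """
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if cost > dist.get(node, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbor, link_cost in graph.get(node, {}).items():
            new_cost = cost + link_cost
            if new_cost < dist.get(neighbor, float("inf")):
                dist[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return dist

# Made-up link-state database: every router knows every link and its cost.
topology = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
print(shortest_paths(topology, "A"))
```

Because the costs are just numbers attached to links, the same computation works
whether they represent hop counts or bandwidth.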
OSPF is an industrial-strength protocol that works well for large, complicated
topologies. It offers several advantages over RIP, including the ability to manage
several paths to a single destination and the ability to partition the network into
sections ("areas") that share only high-level routing information.
IGRP and EIGRP: Interior Gateway Routing Protocol
IGRP and its souped-up successor EIGRP are proprietary routing protocols that run
only on Cisco routers. IGRP was created to address some of the shortcomings of RIP
before robust standards like OSPF existed. It has now been deprecated in favor of
EIGRP, although it is still in use at many sites. EIGRP is configured similarly to IGRP,
though it is actually quite different in its underlying protocol design. IGRP handles
only route announcements that respect traditional IP address class boundaries,
whereas EIGRP understands arbitrary CIDR netmasks.
Both IGRP and EIGRP are distance-vector protocols, but they are designed to avoid
the looping and convergence problems found in other DV systems. EIGRP in
particular is widely regarded as the paragon of distance-vector routing. For most
purposes, EIGRP and OSPF are equally functional.
In our opinion, it is best to stick with an established, nonproprietary, and multiply
implemented routing protocol such as OSPF. More people are using and working on
OSPF than EIGRP, and several implementations are available.
IS-IS: the ISO "standard"
IS-IS, the Intra-domain Intermediate System to Intermediate System Routeing
Protocol, is the International Organization for Standardization's answer to OSPF. It
was originally designed to manage "routeing" for the OSI network protocols and was
later extended to handle IP routing.
Both IS-IS and OSPF were developed in the early 90s when ISO protocols were
politically in vogue. Early attention from the IETF helped to lend IS-IS a veneer of
legitimacy for IP, but it seems to be falling farther and farther behind OSPF in popularity.
Today, IS-IS use is extremely rare outside of vendor certification test environments.
The protocol itself is mired with lots of ISO baggage and generally should be avoided.
MOSPF, DVMRP, and PIM: multicast routing protocols
MOSPF (Multicast OSPF), DVMRP (Distance Vector Multicast Routing Protocol),
and PIM (Protocol Independent Multicast) are protocols that support IP
multicasting, a technology that is not yet widely deployed. You can find pointers to
more information about these protocols at www.mbone.com.
Router Discovery Protocol
Router Discovery Protocol uses ICMP messages sent to the IP multicast address
224.0.0.1 to announce and learn about other routers on a network. Unfortunately,
not all routers currently make these announcements, and not all hosts listen to them.
The hope is that someday this protocol will become more popular.
13.4 ROUTED: RIP YOURSELF A NEW HOLE
You may not be rich. You may not be good looking. But you'll always have routed.
routed was for a long time the standard UNIX routing daemon, and it's still supplied
with most versions of UNIX and Linux.
Linux's stock routed speaks only RIP. If you plan to use RIP-2, the Nexus Routing
Daemon available from sourceforge.net/projects/nx-routed is an easy-to-configure
choice. RIP-2 is essential only if you have subnets with masks not on a byte boundary.
routed can be run in server mode (-s) or in quiet mode (-q). Both modes listen for
broadcasts, but only servers distribute their own information. Generally, only
machines with multiple interfaces should be servers. If neither -s nor -q is specified,
routed runs in quiet mode on hosts with one interface and in server mode on hosts
with more than one interface.
See page 303 for more about route.
routed adds its discovered routes to the kernel's routing table. Routes must be
reheard at least every four minutes or they will be removed. However, routed knows
which routes it has added and does not remove static routes that were installed with
the route command.
routed -t can be used to debug routing. This option makes routed run in the
foreground and print out all packets it sends or receives.
routed normally discovers routing information dynamically and does not require
configuration. However, if your site contains gateways to the Internet or to other
autonomous systems, you may have to take some additional steps to make these
links work with routed.
If you have only a single outbound gateway, you can advertise it as a global default
route by running its routed with the -g flag. This is analogous to setting the default
route on a single machine, except that it is propagated throughout your network.
routed also supports a configuration file, /etc/gateways, which was designed to
provide static information about gateways to "preload" into the routed routing table.
13.5 GATED: GONE TO THE DARK SIDE
gated was a fantastic and at one time freely available routing framework by which
many different routing protocols could be used simultaneously. With gated, you
could precisely control advertised routes, broadcast addresses, trust policies, and
metrics. gated shared routes among several protocols, allowing routing gateways to
be constructed between areas that had standardized on different routing systems.
gated also had one of the nicest administrative interfaces and configuration file
designs of any Linux administrative software.
gated started out as freely distributable software, but in 1992 it was privatized and
turned over to the Merit GateD Consortium. Commercial versions of gated were
available only to Consortium members. The Consortium was eventually disbanded,
and the rights to gated were acquired by NextHop, an embedded network software
developer. This transfer effectively ended gated's life in the open source world,
leaving a trail of bitter stories behind.
A promising new project, XORP (the eXtensible Open Router Platform), has sprung
up to help fill the void created when gated was sucked under. Although XORP is just
now in beta test, it's being readied for production use and we're hoping that it will
grow to fill gated's former niche. Check out the latest progress at www.xorp.org.
In the meantime, sites needing multiprotocol routing can consider using GNU Zebra
(www.zebra.org), a nuts-and-bolts routing package that runs on most Linux
platforms. Unfortunately, it lacks most of the features, creature comforts, and
detailed documentation required to manage dynamic routing in a production
environment. This may be one case in which buying a dedicated router (such as those
made by Juniper or Cisco) is the best use of your resources.
13.6 ROUTING STRATEGY SELECTION CRITERIA
Routing for a network can be managed at essentially four levels of complexity:
· No routing
· Static routes only
· Mostly static routes, but clients listen for RIP updates
· Dynamic routing everywhere
The topology of the overall network has a dramatic effect on each individual
segment's routing requirements. Different nets may need very different levels of routing
support. The following rules of thumb can help you choose a strategy:
· A stand-alone network requires no routing.
· If a network has only one way out, clients (nongateway machines) on that
  network should have a static default route to the lone gateway. No other
  configuration is necessary, except on the gateway itself.
· A gateway with a small number of networks on one side and a gateway to
  "the world" on the other side can have explicit static routes pointing to the
  former and a default route to the latter. However, dynamic routing is
  advisable if both sides have more than one routing choice.
· If you use RIP and are concerned about the network and system load this
  entails, avoid using routed in active mode; it broadcasts everything it
  knows (correct or not) at short intervals. To have clients listen passively
  for routing updates without sending out their own information, use
  routed -q.
· Many people will tell you that RIP is a horrible, terrible protocol and that
  routed is the spawn of Satan. It isn't necessarily so. If it works for you and
  you are happy with the performance, go ahead and use it. You get no points
  for spending time on an overengineered routing strategy.
· routed listens to everyone and believes everything it hears. Even if your
  site uses RIP, you might want to manage the exchange of routing data with
  a dedicated router (such as a Cisco) and run routed only on client hosts.
· Use dynamic routing at points where networks cross political or
  administrative boundaries.
· On dynamically routed networks that contain loops or redundant paths,
  use OSPF if possible.
· Routers connected to the Internet through multiple upstream providers
  must use BGP. However, most routers connected to the Internet have only
  one upstream path and can therefore use a simple static default route.
A good routing strategy for a medium-sized site with a relatively stable local
structure and a connection to someone else's net is to use a combination of static and
dynamic routing. Machines within the local structure that do not have a gateway to
external networks can use static routing, forwarding all unknown packets to a default
machine that understands the outside world and does dynamic routing.
A network that is too complicated to be managed with this scheme should rely on
dynamic routing. Default static routes can still be used on leaf networks, but
machines on networks with more than one router should run routed in passive mode.
13.7 CISCO ROUTERS
Routers made by Cisco Systems, Inc., are the de facto standard for Internet routing
today. Cisco has captured over 70% of the router market, so its products are well
known, and staff that know how to operate them are relatively easy to find. Before
Cisco, UNIX boxes with multiple network interfaces were often used as routers.
Today, dedicated routers are the favored gear to put in datacom closets and above
ceiling tiles where network cables come together. They're cheaper, faster, and more
secure than their UNIX or Linux counterparts.
Most of Cisco's router products run an operating system called Cisco IOS, which is
proprietary and unrelated to Linux. Its command set is rather large; the full
documentation set fills up about 4.5 feet of shelf space. We could never fully cover
Cisco IOS here, but knowing a few basics can get you a long way.
IOS defines two levels of access (user and privileged), both of which are password
protected. By default, you can simply telnet to a Cisco router to enter user mode.
4
You are prompted for the user-level access password:
    $ telnet acme-gw.acme.com
    Connected to acme-gw.acme.com.
    Escape character is '^]'.
    User Access Verification
    Password:
Upon entering the correct password, you receive a prompt from Cisco's EXEC
command interpreter.
    acme-gw.acme.com>
At this prompt, you can enter commands such as show interfaces to see the router's
network interfaces or show ? to get help about the other things you can see.
To enter privileged mode, type enable and enter the privileged password when it is
requested. Once you have reached the privileged level, your prompt ends in a #:
    acme-gw.acme.com#
BE CAREFUL: you can do anything from this prompt, including erasing the router's
configuration information and its operating system. When in doubt, consult Cisco's
manuals or one of the comprehensive books published by Cisco Press.
You can type show running to see the current running configuration of the router
and show config to see the current nonvolatile configuration. Most of the time, these
are the same.
4. A variety of access methods can be configured. If your site already uses Cisco routers, contact your
network administrator to find out which methods have been enabled.
Here's a typical configuration:
    acme-gw.acme.com# show running
    Current configuration:
    version 12.1
    hostname acme-gw
    enable secret xxxxxxxx
    ip subnet-zero
    interface Ethernet0
     description Acme internal network
     ip address 192.108.21.254 255.255.255.0
     no ip directed-broadcast
    interface Ethernet1
     description Acme backbone network
     ip address 192.225.33.254 255.255.255.0
     no ip directed-broadcast
    ip classless
    line con 0
     transport input none
    line aux 0
     transport input telnet
    line vty 0 4
     password xxxxxxxx
     login
    end
The router configuration can be modified in a variety of ways. Cisco offers graphical
tools that run under some versions of UNIX/Linux and Windows. Real network
administrators never use these; the command prompt is always the "sure bet." It is also
possible to tftp a config file to or from a router so that you can edit it with your
favorite editor.
5
To modify the configuration from the command prompt, type config term:
    acme-gw.acme.com# config term
    Enter configuration commands, one per line. End with CNTL/Z.
    acme-gw(config)#
You can then type new configuration commands exactly as you want them to appear
in the show running output. For example, if we wanted to change the IP address of
the Ethernet0 interface in the example above, we could enter
    interface Ethernet0
    ip address 192.225.40.253 255.255.255.0
5. Hot tip: Microsoft Word isn't the best choice for this application.
When you've finished entering configuration commands, press <Control-Z> to
return to the regular command prompt. If you're happy with the new configuration,
enter write mem to save the configuration to nonvolatile memory.
Here are some tips for a successful Cisco router experience:
· Name the router with the hostname command. This precaution helps
  prevent accidents caused by configuration changes to the wrong router. The
  hostname always appears in the command prompt.
· Always keep a backup router configuration on hand. You can write a short
  expect script that tftps the running configuration over to a Linux box every
  night for safekeeping.
· Control access to the router command line by putting access lists on the
  router's VTYs (VTYs are like PTYs on a Linux box). This precaution
  prevents unwanted parties from trying to break into your router.
6
· Control the traffic flowing among your networks (and possibly to the
  outside world) with access lists on each interface. See Packet-filtering firewalls
  on page 701 for more information about how to set up access lists.
· Keep routers physically secure. It's easy to reset the privileged password if
  you have physical access to a Cisco box.
13.8 RECOMMENDED READING
PERLMAN, RADIA. Interconnections: Bridges, Routers, Switches, and Internetworking
Protocols (2nd Edition). Reading, MA: Addison-Wesley, 2000.
This is the definitive work in this topic area. If you buy just one book about
networking fundamentals, this should be it. Also, don't ever pass up a chance to hang
out with Radia; she's a lot of fun and holds a shocking amount of knowledge in her brain.
HUITEMA, CHRISTIAN. Routing in the Internet (2nd Edition). Upper Saddle River, NJ:
Prentice Hall PTR, 2000.
This book is a clear and well-written introduction to routing from the ground up. It
covers most of the protocols in common use and also some advanced topics such as
multicasting.
MOY, JOHN T. OSPF: Anatomy of an Internet Routing Protocol. Reading, MA:
Addison-Wesley, 1998.
A thorough exposition of OSPF by the author of the OSPF protocol standard.
STEWART, JOHN W. BGP4: Inter-domain Routing in the Internet. Reading, MA:
Addison-Wesley, 1999.
There are many routing-related RFCs. The main ones are shown in Table 13.3.
6. Modern versions of IOS support the SSH protocol. You should use that instead of the standard TELNET
interface if it's available in your environment.
13.9 EXERCISES
E13.1 Investigate the Linux route command and write a short description of
      what it does. Using route, how would you:
      a) Add a default route to 128.138.129.1 using interface eth1?
      b) Delete a route to 128.138.129.1?
      c) Determine whether a route was added by a program such as routed
         or an ICMP redirect? (Note that this method works with the output
         of netstat -rn as well.)
E13.2 Compare static and dynamic routing, listing several advantages and
      disadvantages of each. Describe situations in which each would be
      appropriate and explain why.
E13.3 Consider the following netstat -rn output. Describe the routes and
      figure out the network setup. Which network, 10.0.0.0 or 10.1.1.0, is closer
      to the Internet? Which process added each route?
      Destination  Gateway   Genmask        Flags  MSS  Window  irtt  Iface
      10.0.0.0     0.0.0.0   255.255.255.0  U      40   0       0     eth1
      10.1.1.0     0.0.0.0   255.255.255.0  U      40   0       0     eth0
      0.0.0.0      10.0.0.1  0.0.0.0        UG     40   0       0     eth1
E13.4 Figure out the routing scheme that is used at your site. What protocols
      are in use? Which machines directly connect to the Internet? You can
      use tcpdump to look for routing update packets on the local network
      and traceroute to explore beyond the local net. (Requires root access.)
E13.5 If you were a medium-sized ISP that provided dial-in accounts and
      virtual hosting, what sort of routing setup would you use? Make sure that
      you consider not only the gateway router(s) between the Internet
      backbone and your own network but also any interior routers that may be in
      use. Draw a network diagram that outlines your routing architecture.
Table 13.3 Routing-related RFCs
RFC    Title                                                  Authors
2328   OSPF Version 2                                         John T. Moy
1058   Routing Information Protocol                           C. Hedrick
2453   RIP Version 2                                          Gary Scott Malkin
1256   ICMP Router Discovery Messages                         Stephen E. Deering
1142   OSI IS-IS Intra-domain Routing Protocol                David R. Oran
1075   Distance Vector Multicast Routing Protocol             D. Waitzman et al.
4632   CIDR: an Address Assignment and Aggregation Strategy   Vince Fuller et al.
4271   A Border Gateway Protocol 4 (BGP-4)                    Yakov Rekhter et al.
14 Network Hardware
Whether it's video images from around the globe or the sound of your son's voice
from down the hall, just about everything in the world we live in is handled in digital
form. Moving data quickly from one place to another is on everyone's mind. Behind
all this craziness is fancy network hardware and (you guessed it) a whole bunch of
stuff that originated in the deep, dark caves of UNIX. If there's one area in which
UNIX technology has touched human lives, it's the practical realization of large-scale
packetized data transport.
Keeping up with all these fast-moving bits is a challenge. Of course the speed and
reliability of your network have a direct effect on your organization's productivity, but
today networking is so pervasive that the state of the network can affect our ability
to perform many basic human interactions, such as placing a telephone call. A poorly
designed network is a personal and professional embarrassment that can have
catastrophic social effects. It can also be very expensive to fix.
At least four major factors contribute to a successful installation:
· Development of a reasonable network design
· Selection of high-quality hardware
· Proper installation and documentation
· Competent ongoing operations and maintenance
The first sections of this chapter discuss the media that are commonly used for local
area and wide area networking, including Ethernet, ATM, frame relay, wireless, and
DSL. We then cover design issues you are likely to face on any network, whether new
or old.
14.1 LAN, WAN, OR MAN?
We're lucky, in a sense, that TCP/IP can easily be transported over a variety of
media. In reality, however, the network hardware market is split into a variety of
confusing classifications.
Networks that exist within a building or group of buildings are generally referred to
as Local Area Networks or LANs. High-speed, low-cost connections prevail. Wide
Area Networks (WANs) are networks in which the endpoints are geographically
dispersed, perhaps separated by thousands of kilometers. In these networks, high
speed usually comes at high cost, but there are virtually no bounds to the sites you
can include on the network (Brugge, Belgium to Sitka, Alaska!). MAN is a telecom
marketing term for Metropolitan Area Network, meaning a high-speed,
moderate-cost access medium used within a city or cluster of cities. In this chapter, we
explore some of the technologies used to implement these beasts.
14.2 ETHERNET: THE COMMON LAN
Having captured over 90% of the world-wide LAN market, Ethernet can be found just
about everywhere in its many forms. It started as Bob Metcalfe's Ph.D. thesis at MIT.
Bob graduated and went to Xerox PARC; together with DEC and Intel, Xerox
eventually developed Ethernet into a product. It was one of the first instances in which
competing computer companies joined forces on a technical project.
1
Ethernet was originally specified at 3 Mb/s (megabits per second), but it moved to
10 Mb/s almost immediately. In 1994, Ethernet caught attention as it was
standardized at 100 Mb/s. Just after turning 19 years old in 1998, it was ready to fight a new
war at 1 Gb/s. Now an adult in its late 20s, Ethernet is available over fiber at 10 Gb/s,
having eclipsed all of its rivals. A 10 Gb/s standard for copper wire (802.3an) was
approved by the IEEE in July 2006. Table 14.1 on the next page highlights the
evolution of the various Ethernet standards.
2
How Ethernet works
Ethernet can be described as a polite dinner party at which guests (computers) don't
interrupt each other but rather wait for a lull in the conversation (no traffic on the
network cable) before speaking. If two guests start to talk at once (a collision) they
both stop, excuse themselves, wait a bit, and then one of them starts talking again.
The technical term for this scheme is CSMA/CD:
· Carrier Sense: you can tell whether anyone is talking.
· Multiple Access: everyone can talk.
· Collision Detection: you know when you interrupt someone else.
1. Bob Metcalfe also articulated "Metcalfe's Law," which states that the value of the network grows in
proportion to the square of the number of users.
2. We have omitted a few goofy Ethernet standards that have not proved popular, such as 100BaseT4 and
100BaseVG-AnyLAN.
The actual delay upon collision detection is somewhat random. This convention
avoids the scenario in which two hosts simultaneously transmit to the network,
detect the collision, wait the same amount of time, and then start transmitting again,
thus flooding the network with collisions. This was not always true!
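The text does not say how the random delay is chosen. Standard Ethernet uses what
is usually called truncated binary exponential backoff, and the Python sketch below
illustrates it. The constants follow the commonly cited IEEE 802.3 values for
10 and 100 Mb/s Ethernet; the backoff_slots function and its parameters are
hypothetical names used only for this illustration.

```python
import random

SLOT_TIME_BITS = 512   # slot time for 10/100 Mb/s Ethernet, in bit times
MAX_EXPONENT = 10      # the backoff range stops growing after 10 collisions
MAX_ATTEMPTS = 16      # the frame is given up on after 16 failed attempts

def backoff_slots(collisions, rng=random):
    """Pick a random wait, in slot times, after the given number of collisions."""
    if collisions >= MAX_ATTEMPTS:
        raise RuntimeError("too many collisions; frame is discarded")
    k = min(collisions, MAX_EXPONENT)
    return rng.randrange(2 ** k)  # uniform integer in the range 0 .. 2**k - 1

for n in (1, 2, 3):
    slots = backoff_slots(n)
    print(f"collision {n}: wait {slots} slots ({slots * SLOT_TIME_BITS} bit times)")
```

Because each host draws its own random number, two hosts that collide are unlikely
to pick the same wait, and the range doubles with each repeated collision to spread
retries out further.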
Ethernet topology
The Ethernet topology is a branching bus with no loops; there is only one way for a
packet to travel between any two hosts on the same network. Three types of packets
can be exchanged on a segment: unicast, multicast, and broadcast. Unicast packets
are addressed to only one host. Multicast packets are addressed to a group of hosts.
Broadcast packets are delivered to all hosts on a segment.
A "broadcast domain" is the set of hosts that receive packets destined for the
hardware broadcast address, and there is exactly one broadcast domain for each logical
Ethernet segment. Under the early Ethernet standards and media (such as 10Base5),
physical segments and logical segments were exactly the same since all the packets
traveled on one big cable with host interfaces strapped onto the side of it.
3
Table 14.1 The evolution of Ethernet
Year      Speed      Common name                  IEEE#     Dist    Media (a)
1973      3 Mb/s     Xerox Ethernet               -         ?       Coax
1976      10 Mb/s    Ethernet 1                   -         500m    RG-11 coax
1982      10 Mb/s    DIX Ethernet (Ethernet II)   -         500m    RG-11 coax
1985      10 Mb/s    10Base5 ("Thicknet")         802.3     500m    RG-11 coax
1985      10 Mb/s    10Base2 ("Thinnet")          802.3     180m    RG-58 coax
1989      10 Mb/s    10BaseT                      802.3     100m    Category 3 UTP copper
1993      10 Mb/s    10BaseF                      802.3     2km     MM fiber
                                                            25km    SM fiber
1994      100 Mb/s   100BaseTX ("100 meg")        802.3u    100m    Category 5 UTP copper
1994      100 Mb/s   100BaseFX                    802.3u    2km     MM fiber
                                                            20km    SM fiber
1998      1 Gb/s     1000BaseSX                   802.3z    260m    62.5-µm MM fiber
                                                            550m    50-µm MM fiber
1998      1 Gb/s     1000BaseLX                   802.3z    440m    62.5-µm MM fiber
                                                            550m    50-µm MM fiber
                                                            3km     SM fiber
1998      1 Gb/s     1000BaseCX                   802.3z    25m     Twinax
1999      1 Gb/s     1000BaseT ("Gigabit")        802.3ab   100m    Cat 5E and 6 UTP copper
2002      10 Gb/s    10GBase-SR                   802.3ae   300m    MM fiber
                     10GBase-LR                             10km    SM fiber
2006      10 Gb/s    10GBase-T                    802.3an   100m    Category 7 UTP copper
2006 (b)  100 Gb/s   TBD                          TBD       TBD     Fiber
2008 (b)  1 Tb/s     TBD                          TBD       TBD     CWDM fiber
2010 (b)  10 Tb/s    TBD                          TBD       TBD     DWDM fiber
a. MM = Multimode, SM = Single-mode, UTP = Unshielded twisted pair,
   CWDM = Coarse wavelength division multiplexing, DWDM = Dense wavelength division multiplexing
b. Industry projection
Exhibit A  A polite Ethernet dinner party
With the advent of switches, today's logical segments usually consist of many
(possibly dozens or hundreds) physical segments (or, in some cases, wireless segments) to
which only two devices are connected: the switch port and the host. The switches are
responsible for escorting multicast and unicast packets to the physical (or wireless)
segments on which the intended recipients reside; broadcast traffic is forwarded to
all ports in a logical segment.
A single logical segment may consist of physical (or wireless) segments operating at
different speeds (10 Mb/s, 100 Mb/s, 1 Gb/s, or 10 Gb/s); hence, switches must have
buffering and timing capabilities to eliminate potential conflicts.
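The forwarding behavior described above can be sketched as a toy model in Python.
The text does not explain how a switch knows where a recipient resides; the sketch
assumes the common approach of learning each sender's port from the frames it
transmits. The LearningSwitch class, the port numbers, and the MAC addresses are
hypothetical, and multicast handling, buffering, and speed matching are left out.

```python
BROADCAST = "ff:ff:ff:ff:ff:ff"

class LearningSwitch:
    """Toy model of a switch that learns which port each MAC address lives on."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port on which it was last seen

    def handle_frame(self, in_port, src_mac, dst_mac):
        """Return the set of ports the frame should be sent out on."""
        self.mac_table[src_mac] = in_port  # learn the sender's location
        if dst_mac != BROADCAST and dst_mac in self.mac_table:
            out_port = self.mac_table[dst_mac]
            return set() if out_port == in_port else {out_port}
        # Broadcasts (and frames for addresses not yet learned) go to every other port.
        return self.ports - {in_port}

switch = LearningSwitch(ports=[1, 2, 3])
print(sorted(switch.handle_frame(1, "aa:aa:aa:aa:aa:01", BROADCAST)))            # [2, 3]
print(sorted(switch.handle_frame(2, "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01")))  # [1]
```

The second frame goes out on a single port because the switch saw the first host
transmit on port 1, which is what keeps unicast traffic off the other physical
segments.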
Unshielded twisted pair
Unshielded twisted pair (UTP) is the preferred cable medium for Ethernet. It is based
on a star topology and has several advantages over other media:
· It uses inexpensive, readily available copper wire. (Sometimes, existing
  phone wiring can be used.)
· UTP wire is much easier to install and debug than coax or fiber. Custom
  lengths are easily made.
· UTP uses RJ-45 connectors, which are cheap, reliable, and easy to install.
· The link to each machine is independent (and private!), so a cabling
  problem on one link is unlikely to affect other hosts on the network.
3. No kidding! Attaching a new computer involved boring a hole into the outer sheath of the cable with a
special drill to reach the center conductor. A "vampire tap" that bit into the outer conductor was then
clamped on with screws.
Chapter 14 - Network Hardware
The general "shape" of a UTP network is illustrated in Exhibit B.

Exhibit B  A UTP installation
UTP wire suitable for use in modern LANs is commonly broken down into eight
classifications. The performance rating system was first introduced by Anixter, a large
cable supplier. These standards were formalized by the Telecommunications Industry
Association (TIA) and are known today as Category 1 through Category 7, with
a special Category 5E in the middle.
The International Organization for Standardization (ISO) has also jumped into the
exciting and highly profitable world of cable classification and promotes standards
that are exactly or approximately equivalent to the higher-numbered TIA categories.
For example, TIA Category 5 cable is equivalent to ISO Class D cable. For the geeks
in the audience, Table 14.2 illustrates the major differences among the various
modern-day classifications. This is good information to memorize so you can impress
your friends at parties.
[Exhibit B figure: a UTP switch with a link to the backbone, connected to two
workstations and an Ethernet printer]
Table 14.2  UTP cable characteristics

Parameter (a)        Category 5     Category 5E    Category 6     Category 7
                     Class D (b)                   Class E        Class F
Frequency range      100 MHz        100 MHz        250 MHz        600 MHz
Attenuation          24 dB          24 dB          21.7 dB        20.8 dB
NEXT                 27.1 dB        30.1 dB        39.9 dB        62.1 dB
ACR                  3.1 dB         6.1 dB         18.2 dB        41.3 dB
ELFEXT               17 dB          17.4 dB        23.2 dB        ? (c)
Return loss          8 dB           10 dB          12 dB          14.1 dB
Propagation delay    548 ns         548 ns         548 ns         504 ns

a. NEXT = Near-end crosstalk, ACR = Attenuation-to-crosstalk ratio, ELFEXT = Equal level far-end xtalk
b. Includes additional TIA and ISO requirements TSB95 and FDAM 2, respectively
c. Currently unspecified pending further study.
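
The ACR row of Table 14.2 is not independent of the rows above it: the attenuation-to-crosstalk ratio is the NEXT figure minus the attenuation figure, both in dB. The following Python sketch checks that relationship. The Category 5 NEXT (27.1 dB) and Category 6 attenuation (21.7 dB) used here are assumptions, chosen because they are the values that agree with the printed ACR row; the dictionary layout and function name are purely illustrative.

```python
# Values in dB, per cable class. The Category 5 NEXT (27.1) and the
# Category 6 attenuation (21.7) are assumed values that agree with the ACR row.
CABLES = {
    "Category 5":  {"attenuation": 24.0, "next": 27.1, "acr": 3.1},
    "Category 5E": {"attenuation": 24.0, "next": 30.1, "acr": 6.1},
    "Category 6":  {"attenuation": 21.7, "next": 39.9, "acr": 18.2},
    "Category 7":  {"attenuation": 20.8, "next": 62.1, "acr": 41.3},
}


def acr_db(next_db, attenuation_db):
    """Attenuation-to-crosstalk ratio: NEXT minus attenuation, both in dB."""
    return round(next_db - attenuation_db, 1)


if __name__ == "__main__":
    for name, row in CABLES.items():
        computed = acr_db(row["next"], row["attenuation"])
        print(f"{name:12s} computed ACR {computed} dB, table ACR {row['acr']} dB")
```

A larger ACR means more headroom between the wanted signal and the crosstalk noise, which is why the figure climbs so steeply from Category 5 to Category 7.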
In practice, Category 1 and Category 2 cables are suitable only for voice applications
(if that). Category 3 cable is as low as you can go for a LAN; it is the standard for 10
Mb/s 10BaseT. Category 4 cable is something of an orphan, not exactly suited for any
particular application. It is occasionally used for 16 Mb/s UTP token ring or for fancy
10BaseT installations. Category 5 cable can support 100 Mb/s and is the most common
standard currently in use for data cabling. Category 5E and Category 6 cabling
support 1 Gb/s. Category 7 is intended for 10 Gb/s use once the 10 Gb/s
Ethernet-over-copper standard is ready.
(See page 366 for more information about wiring.)
10BaseT connections require two pairs of Category 3 wire, and each link is limited to
a length of 100 meters; 100BaseTX has the same length limitation but requires two
pairs of Category 5 wire. Both PVC-coated and Teflon-coated wire are available. Your
choice of jacketing should be based on the environment in which the cable will be
installed. Enclosed areas that feed into the building's ventilation system ("return air
plenums") typically require Teflon.[4] PVC is less expensive and easier to work with.
RJ-45 connectors wired with pins 1, 2, 3, and 6 are used to make the connections.
Although only two pairs of wire are needed for a working 10 Mb/s or 100 Mb/s
connection, we recommend that when installing a new network, you use four-pair
Category 5E wire and connect all eight pins of the RJ-45 jack.
(See page 844 for more information about the RS-232 standard.)
For terminating the four-pair UTP cable at patch panels and RJ-45 wall jacks, we
suggest that you use the TIA/EIA-568A RJ-45 wiring standard. This standard, which is
compatible with other uses of RJ-45 (e.g., RS-232), is a convenient way to keep the
wiring at both ends of the connection consistent, regardless of whether you can
easily access the cable pairs themselves. The 568A standard is detailed in Table 14.3.
Existing building wiring may or may not be suitable for network use, depending on
how and when it was installed. Many old buildings were retrofitted with new cable in
the 1950s and 1960s. Unfortunately, this cable usually won't support even 10 Mb/s.
Connecting and expanding Ethernets
Ethernets can be logically connected at several points in the seven-layer ISO network
model. At layer 1, the physical layer, you can use either hardware connectors or
repeaters (commonly called hubs in modern times). They transfer the signal directly,
much like two tin cans connected by string.
4. Check with your fire marshall or local fire department to determine the requirements in your area.
Table 14.3  TIA/EIA-568A standard for wiring four-pair UTP to an RJ-45 jack

Pair  Colors        Wired to     Pair  Colors        Wired to
1     White/Blue    Pins 5/4     3     White/Green   Pins 1/2
2     White/Orange  Pins 3/6     4     White/Brown   Pins 7/8
At layer 2, the data link layer, switches are used. Switches transfer frames in
accordance with the hardware source and destination addresses, much like delivering a
message in a bottle by reading only the label on the outside of the bottle.
At layer 3, the network layer, routers are used. Routers transfer messages to the next
hop according to the location of the final recipient, rather like looking at the message
in a bottle to see who it's really addressed to.
Hubs
Hubs (which are also referred to as concentrators) are active devices that connect
physical segments in UTP Ethernet networks. They require external power. Acting
as a repeater, a hub retimes and reconstitutes Ethernet frames but does not interpret
them; it has no idea where packets are going or what protocol they are using.
The two farthest points on the network must never be more than four hubs apart.
Ethernet versions 1 and 2 specified at most two hubs in series per network. The IEEE
802.3 standard extended the limit to four for 10 Mb/s Ethernets. 100 Mb/s Ethernets
allow two repeaters, 1000BaseT Ethernets allow only one, and 10 Gb/s networks do
not allow them at all. Exhibit C shows both a legal and an illegal configuration for a
10 Mb/s network.
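
The repeater limits quoted above lend themselves to a simple lookup. The Python sketch below encodes them and checks whether a path between two hosts is within the rules; the table and function names are hypothetical, and the check considers only the hub count, not cable lengths or other timing constraints.

```python
# Maximum number of hubs (repeaters) in series between any two hosts,
# keyed by Ethernet speed in Mb/s, as listed in the text above.
MAX_HUBS_IN_SERIES = {10: 4, 100: 2, 1000: 1, 10000: 0}


def path_is_legal(speed_mbps, hubs_between_hosts):
    """Return True if the hub count on a path is within the limit for that speed."""
    return hubs_between_hosts <= MAX_HUBS_IN_SERIES[speed_mbps]


if __name__ == "__main__":
    print(path_is_legal(10, 4))     # four hubs at 10 Mb/s: within the limit
    print(path_is_legal(10, 5))     # five hubs at 10 Mb/s: over the limit
    print(path_is_legal(10000, 1))  # 10 Gb/s allows no repeaters at all
```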
Exhibit C  Count the hubs

Hubs occasionally require attention from a system administrator, so they should not
be kept in obscure or hard-to-reach locations. Power cycling usually allows them to
recover from a wedged state.
Switches
Switches connect Ethernets at the data link layer (layer 2) of the ISO model. Their
purpose is to join two physical networks in a way that makes them seem like one big
physical network. Switches are the industry standard for connecting Ethernet
devices today.
[Exhibit C figure: a legal configuration ("Just fine") and an illegal configuration
("You must be punished"). Legend: H = hub, A = host A, B = host B.]
Switches receive, regenerate, and retransmit packets in hardware.[5] Most switches use
a dynamic learning algorithm. They notice which source addresses come from one
port and which from another. They forward packets between ports only when
necessary. At first all packets are forwarded, but in a few seconds the switch has learned
the locations of most hosts and can be more selective.
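
The learning behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation: the class name, the method name, and the use of a plain dictionary for the address table are all assumptions, and real switches also age out table entries and do this work in hardware.

```python
class LearningSwitch:
    """Toy model of a switch that learns which port each source address lives on."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.table = {}  # hardware (MAC) address -> port where it was last seen

    def handle_frame(self, in_port, src, dst):
        """Learn the sender's port, then return the list of ports to forward to."""
        self.table[src] = in_port
        out_port = self.table.get(dst)
        if out_port is None:
            # Destination not learned yet (or a broadcast): flood to all other ports.
            return sorted(self.ports - {in_port})
        if out_port == in_port:
            # Sender and recipient share a port: no forwarding is needed.
            return []
        return [out_port]


if __name__ == "__main__":
    switch = LearningSwitch(ports=[1, 2, 3])
    print(switch.handle_frame(1, "aa:aa", "bb:bb"))  # unknown destination: flooded
    print(switch.handle_frame(2, "bb:bb", "aa:aa"))  # aa:aa was learned on port 1
    print(switch.handle_frame(1, "aa:aa", "bb:bb"))  # bb:bb is now known on port 2
```

Because a broadcast address never appears as a source address, it is never learned, so broadcast frames fall through to the flooding case every time.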
Since not all packets are forwarded between networks, each segment of cable is less
saturated with traffic than it would be if all machines were on the same cable. Given
that most communication tends to be localized, the increase in apparent bandwidth
can be dramatic. And since the logical model of the network is not affected by the
presence of a switch, few administrative consequences result from installing one.
Switches can sometimes become confused if your network contains loops. The
confusion arises because packets from a single host appear to be on two (or more) ports
of the switch. A single Ethernet cannot have loops, but as you connect several
Ethernets with routers and switches, the topology can include multiple paths to a host.
Some switches can handle this situation by holding alternative routes in reserve in
case the primary route goes down. They perform a pruning operation on the
network they see until the remaining sections present only one path to each node on the
network. Some switches can also handle duplicate links between the same two
networks and route traffic in a round robin fashion.
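
To illustrate the pruning idea, the Python sketch below walks a topology breadth-first from a chosen root and keeps only the links needed to reach every node once; the remaining links are held in reserve. This is a simplification for illustration, not the distributed protocol that real switches run among themselves. The function name and link representation are hypothetical, and the sketch assumes at most one link between any pair of nodes.

```python
from collections import deque


def prune_loops(links, root):
    """Split links into (active, standby) so active links form a loop-free tree.

    links: list of (node_a, node_b) pairs; assumes at most one link per pair.
    root:  the node from which the breadth-first walk starts.
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, []).append(b)
        neighbors.setdefault(b, []).append(a)

    visited = {root}
    active = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for peer in neighbors.get(node, []):
            if peer not in visited:
                # First time we reach this peer: keep the link that got us there.
                visited.add(peer)
                active.append((node, peer))
                queue.append(peer)

    active_pairs = {frozenset(link) for link in active}
    standby = [link for link in links if frozenset(link) not in active_pairs]
    return active, standby


if __name__ == "__main__":
    triangle = [("s1", "s2"), ("s2", "s3"), ("s3", "s1")]
    active, standby = prune_loops(triangle, root="s1")
    print("active:", active)
    print("standby:", standby)
```

With three switches wired in a triangle, two links stay active and the third is held in reserve, so every node has exactly one path to every other node.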
Switches keep getting smarter as more functionality is built into their firmware.
Some can be used to monitor security on the network. They record any foreign
Ethernet addresses they see, thereby detecting and reporting newly connected machines.
Since they operate at the Ethernet layer, switches are protocol independent and can
handle any mix of high-level packet types (for example, IP, AppleTalk, or NetBEUI).
Switches must scan every packet to determine if it should be forwarded. Their
performance is usually measured by both the packet scanning rate and the packet
forwarding rate. Many vendors do not mention packet sizes in the performance figures
they quote; therefore, actual performance may be less than advertised.
Although Ethernet switching hardware is getting faster all the time, it is still not a
reasonable technology for connecting more than a hundred hosts in a single logical
segment. Problems such as "broadcast storms" often plague large switched networks
since broadcast traffic must be forwarded to all ports in a switched segment. To solve
this problem, use a router to isolate broadcast traffic between switched segments,
thereby creating more than one logical Ethernet.
Large sites can benefit from switches that can partition their ports (through
software configuration) into subgroups called Virtual Local Area Networks or VLANs.
A VLAN is a group of ports that belong to the same logical segment, as if the ports
were connected to their own dedicated switch. Such partitioning increases the ability
of the switch to isolate traffic, and that capability has beneficial effects on both
security and performance.

5. Because packets are regenerated and retimed, fully switched networks do not suffer from the "repeater
count" limitations shown in Exhibit C.
Traffic between VLANs is handled by a router, or in some cases, by a routing module
or routing software layer within the switch. An extension of this system known as
"VLAN trunking" (such as is specified by the IEEE 802.1Q protocol) allows
physically separate switches to service ports on the same logical VLAN.
Choosing a switch can be difficult. The switch market is a highly competitive
segment of the computer industry, and it's plagued with marketing claims that aren't
even partially true. When selecting a switch vendor, you should rely on independent
evaluations ("bakeoffs" such as those that appear in magazine comparisons) rather
than any data supplied by vendors themselves. In recent years, it has been common
for one vendor to have the "best" product for a few months, but then completely
destroy its performance or reliability when trying to make improvements, thus
elevating another manufacturer to the top of the heap.
In all cases, make sure that the backplane speed of the switch is adequate; that's
the number that really counts at the end of a very long day. A well-designed switch
should have a backplane speed that exceeds the sum of the speeds of all its ports.
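
The backplane rule of thumb is simple arithmetic, sketched here in Python. The port mix and backplane figures in the example are made-up illustrations, not the specifications of any real switch, and the function name is hypothetical.

```python
def backplane_is_adequate(backplane_gbps, port_speeds_gbps):
    """Rule of thumb from the text: backplane speed should exceed the sum of port speeds."""
    return backplane_gbps > sum(port_speeds_gbps)


if __name__ == "__main__":
    # Hypothetical switch: 24 ports at 1 Gb/s plus 2 uplinks at 10 Gb/s.
    ports = [1] * 24 + [10] * 2
    print("sum of port speeds:", sum(ports), "Gb/s")
    print("48 Gb/s backplane adequate?", backplane_is_adequate(48, ports))
    print("32 Gb/s backplane adequate?", backplane_is_adequate(32, ports))
```

When comparing a data sheet against this rule, it is worth checking whether the vendor counts each port once or counts both directions of a full-duplex port.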
Routers
Routers are dedicated computers-in-a-box that contain two or more network
interfaces; they direct traffic at layer 3 of the ISO protocol stack (the network layer). They
shuttle packets to their final destinations in accordance with the information in the
TCP/IP protocol headers. In addition to simply moving the packets from one place to
another, routers can also perform other functions such as packet filtering (for
security), prioritization (for quality of service), and big-picture network topology
discovery. See all the gory details of how routing really works in Chapter 13.
Hardware interfaces of many different types (e.g., SONET, Ethernet, and ATM) can
be found on a single router. On the software side, some routers can also handle
non-IP traffic such as IPX or AppleTalk. In these configurations, the router and its
interfaces must be configured for each protocol you want it to handle. These days, it's
generally a good idea to migrate away from these legacy protocols and just support
TCP/IP really well instead.
Routers take one of two forms: fixed configuration and modular. Fixed configuration
routers have specific network interfaces permanently installed at the factory. They
are usually suitable for small, specialized applications. For example, a router with a
T1 interface and an Ethernet interface might be a good choice to connect a small
company to the Internet.
Modular routers have a slot or bus architecture to which interfaces can be added by
the end user. Although this approach is usually more expensive, it ensures greater
flexibility down the road.
Depending on your reliability needs and expected traffic load, a dedicated router may
or may not be cheaper than a Linux system configured to act as a router. However,
the dedicated router usually results in superior performance and reliability. This is
one area of network design in which it's usually advisable to spend the extra money
up front to avoid headaches later.
14.3 WIRELESS: NOMAD'S LAN
Wireless networking is a hot growth area, and production-grade products are
available at affordable prices. Given the recent advances in wired network technology, the
speeds of wireless networks (usually ranging from 2 Mb/s to 54 Mb/s) may seem a
bit inadequate for corporate use. In fact, these speeds are perfectly fine for many
purposes. An 11 Mb/s wireless network in a home or small business environment
can be a system administrator's dream. At 54 Mb/s, wireless can be acceptable in a
corporate environment. In addition, wireless access for trade shows, coffee shops,
marinas, airports, and other public places can really turn an out-of-touch day into a
hyperconnected day for many people.
The most promising wireless standards today are the IEEE 802.11g and 802.11a
specifications. 802.11g operates in the 2.4 GHz frequency band and provides LAN-like
access at up to 54 Mb/s. Operating range varies from 100 meters to 40 kilometers,
depending on equipment and terrain. 802.11a provides up to 54 Mb/s of bandwidth
as well, but uses the 5.4 GHz frequency band. Some current equipment can
aggregate two channels to provide 108 Mb/s of bandwidth. The up-and-coming 802.11n
standard is expected to provide more than 200 Mb/s of bandwidth in late 2006.
Although both 802.11g and 802.11a are advertised to operate at 54 Mb/s, their
intents and realized bandwidths can be quite different. 802.11g is primarily aimed at
the consumer marketplace. It is typically less expensive than 802.11a and provides
three nonoverlapping data channels versus 802.11a's twelve. Channels are much like
the lanes on a highway: the more channels available, the greater the number of
clients that can realize their full bandwidth potential.
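
The highway analogy can be made concrete with a rough, idealized calculation. The Python sketch below assumes that clients are spread evenly across the available nonoverlapping channels and that the clients on one channel split its advertised 54 Mb/s equally; both are simplifying assumptions, real-world throughput is lower than the advertised rate, and the names used are illustrative only.

```python
# Nonoverlapping channel counts quoted in the text.
NONOVERLAPPING_CHANNELS = {"802.11g": 3, "802.11a": 12}
ADVERTISED_MBPS = 54


def per_client_mbps(standard, active_clients):
    """Idealized share of the advertised rate for a client on the busiest channel.

    Assumes active_clients >= 1, an even spread of clients across channels,
    and an equal split of each channel's rate among its clients.
    """
    channels = NONOVERLAPPING_CHANNELS[standard]
    clients_on_busiest_channel = -(-active_clients // channels)  # ceiling division
    return ADVERTISED_MBPS / clients_on_busiest_channel


if __name__ == "__main__":
    for standard in ("802.11g", "802.11a"):
        print(standard, per_client_mbps(standard, active_clients=12), "Mb/s per client")
```

Under these assumptions, twelve active clients each get a channel to themselves on 802.11a but must share four to a channel on 802.11g.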
For small sites, either standard is probably acceptable. Larger sites or campus-wide
deployments may want to consider 802.11a because of its greater amount of
spectrum. In reality, most current wireless radios can be used with either type of network.
Today, 802.11b (11 Mb/s), 802.11g, and 802.11a networks are all quite
commonplace. The cards are inexpensive and available for (or built into) most laptop and
desktop PCs. As with wired Ethernet, the most common architecture for an 802.11
network uses a hub (called an "access point" in wireless parlance) as the connection
point for multiple clients. Access points can be connected to traditional wired
networks or wirelessly connected to other access points, a configuration known as a
"wireless mesh".
You can configure a Linux box to act as an 802.11a/b/g access point if you have the
right hardware and driver. We are aware of at least one chipset that supports this
configuration, the Intersil Prism II. An excellent standalone 802.11b/g wireless base
station for the home or small office is Apple's AirPort Express, a wall-wart-like
product that is inexpensive (around $150) and highly functional.[6] Buy one today!
Literally dozens of vendors are hawking wireless access points. You can buy them at
Home Depot and even at the grocery store. Predictably, the adage that "you get what
you pay for" applies. El cheapo access points (those in the $50 range) are likely to
perform poorly when handling large file transfers or more than one active client.
Debugging a wireless network is something of a black art. You must consider a wide
range of variables when dealing with problems. If you are deploying a wireless
network at an enterprise scale, you'll probably need to invest in a wireless network
analyzer. We highly recommend the analysis products made by AirMagnet.
Wireless security
The security of wireless networks has traditionally been very poor. Wired Equivalent
Privacy (WEP) is a protocol used in conjunction with 802.11b networks to
enable 40-bit, 104-bit, or 128-bit encryption for packets traveling over the airwaves.
Unfortunately, this standard contains a fatal design flaw that renders it little more
than a speed bump for snoopers. Someone sitting outside your building or house
can access your network directly and undetectably.
More recently, the Wi-Fi Protected Access (WPA) security standards have
engendered new confidence in wireless security. Today, WPA should be used instead of
WEP in all new installations. Without WPA, wireless networks (both with and
without WEP) should be considered completely insecure.
802.11i, aka WPA2, is a more recent alternative to WPA. It adds more authentication
mechanisms for enterprise wireless networks.
Wireless switches
In much the same way that Ethernet hubs grew up to become Ethernet switches,
wireless products are undergoing a gradual makeover for use in large enterprises. A
number of vendors (such as Airespace) are now producing "wireless switches" that
work in conjunction with a fleet of access points deployed throughout a campus. The
theory is that hordes of inexpensive access points can be deployed and then centrally
managed by an "intelligent" switch. The switch maintains the WAPs' configuration
information and smoothly supports authentication and roaming.
If you need to provide ubiquitous wireless service throughout a
medium-to-large-sized organization, it's definitely worth the time to evaluate this category of
products. Not only do they decrease management time but most also include a means to
monitor and manage the quality of service delivered to users.
6. In fact, it will also connect to your stereo to play music wirelessly from your PC or laptop.
One particularly neat trick is to deploy an 802.11a/b/g network throughout your
facility and use it to support hand-held VoIP phones for staff. It's like a cellular
network for free!
14.4 FDDI: THE DISAPPOINTING, EXPENSIVE, AND OUTDATED LAN
At 10 Mb/s, the Ethernet of the 1980s didn't offer enough bandwidth for some
networking needs, such as connecting workgroups through a corporate (or campus)
backbone. In an effort to offer higher-bandwidth options, the ANSI X3T9.5
committee produced the Fiber Distributed Data Interface (FDDI) standard as an alternative
to Ethernet.[7] Designed and marketed as a 100 Mb/s token ring, FDDI once looked
like it would be the easy solution to many organizations' bandwidth needs.
Unfortunately, FDDI has been a disappointment in absolutely every way, and at this
time shouldn't be considered for any production use. We include information about
FDDI here for historical perspective only (and in case you happen to find it still in
use in some dark corner of your network).
(See page 278 for more about MTUs.)
For good performance, FDDI needs a much higher MTU than the default, which is
tuned for Ethernet. An MTU value of 4,352 (set with ifconfig) is about right.
The FDDI standard specifies a 100 Mb/s token-passing, dual-ring, all-singing,
all-dancing LAN using a fiber optic transmission medium, as shown in Exhibit D. The
dual-ring architecture includes a primary ring that's used for data transmission and
a secondary ring that's used as a backup in the event the ring is cut (either physically
or electronically).

Exhibit D  FDDI dual token ring
Hosts can be connected either to both rings (they are then referred to as class A or
"dual attached" hosts) or to just the primary ring (class B or "single attached" hosts).
Most commonly, backbone routers and concentrators are dual attached, and
workstations are single attached, usually through a "concentrator," a sort of fiber hub.
7. FDDI has also been accepted as an ISO standard.
[Exhibit D figure: three views of a dual ring connecting hosts A, B, C, and D,
labeled "Normal operation", "Host C is down", and "Gas leak"]
One advantage of token ring systems is that access to the network is controlled by a
deterministic protocol. There are no collisions, so the performance of the network
does not degrade under high load, as it can with Ethernet. Many token ring systems
can operate at 90% to 95% of their rated capacity when serving multiple clients.
For physical media, the FDDI standard suggests two types of fiber: single-mode and
multimode. "Modes" are essentially bundles of light rays that enter the fiber at a
particular angle. Single-mode fiber allows exactly one frequency of light to travel its path
and thus requires a laser as an emitting source.[8] Multimode fiber allows for multiple
paths and is usually driven by less expensive and less dangerous LEDs. Single-mode
fiber can be used over much longer distances than multimode. In practice, 62.5 µm
multimode fiber is most commonly used for FDDI.
Several fiber connector standards are used with FDDI, and they vary from vendor to
vendor. Regardless of what connectors you use, keep in mind that a clean fiber
connection is essential for reliable operation. Although self-service fiber termination
kits are available, we suggest that whenever possible you have a professional wiring
firm install the ends on fiber segments.
14.5 ATM: THE PROMISED (BUT SORELY DEFEATED) LAN
ATM stands for Asynchronous Transfer Mode, but some folks insist on Another
Technical Mistake. One datacomm industry spokesman describes it as "an attempt
by the phone company to turn your networking problem into something they know
how to tariff."
ATM is technically "special" because it promotes the philosophy that small,
fixed-size packets (called "cells") are the most efficient way to implement gigabit networks.
ATM also promises capabilities that haven't traditionally been promised by other
media, including bandwidth reservation and quality-of-service guarantees.
On top of ATM's 53-byte cells, five ATM Adaptation Layers (AALs) are described for
cell transport. The purpose of each adaptation layer is summarized in Table 14.4.
8. Most (but not all) lasers used in fiber optic networking are Class 1 devices, which can mean either "safe
to shine in your eyes" or "not safe to shine in your eyes, but the device has been designed to prevent
exposure during normal use." Unfortunately, "normal use" probably doesn't include messing around
with severed cables, so there is really no guarantee of safety. Don't shine the laser in your eye, even if all
the cool kids seem to be doing it.
Table 14.4  ATM adaptation layers

AAL  Application
1    Constant bit-rate applications, like voice (requires bounded delay)
2    Variable bit-rate applications requiring bounded delay
3    Connection-oriented data applications
4    Connectionless data applications
5    General data transport (especially IP traffic, replaces 3 and 4)
It is unclear how AAL 2 would ever be used in real life. Currently, there is no defined
standard for it. AALs 3 and 4 turned out to be very similar and were combined. A
group of vendors that had to implement ATM were unhappy with AALs 3 and 4
because of their high overhead. They developed their own solution, the Simple and
Efficient Adaptation Layer (SEAL), which soon became AAL 5.
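
Even with AAL 5, small fixed-size cells carry a real cost. The Python sketch below estimates how many 53-byte cells an IP packet occupies and what fraction of the bytes on the wire is actual packet data. The text gives only the 53-byte cell size; the 5-byte header / 48-byte payload split and the 8-byte AAL 5 trailer (with padding to a whole number of cells) come from the ATM specifications rather than from this chapter, and the function names are illustrative.

```python
import math

CELL_BYTES = 53         # total cell size (from the text)
CELL_PAYLOAD_BYTES = 48  # payload per cell (53 minus a 5-byte header; from the ATM spec)
AAL5_TRAILER_BYTES = 8   # AAL 5 trailer appended to each packet (from the ATM spec)


def aal5_cells(packet_bytes):
    """Number of cells needed to carry one packet over AAL 5, including padding."""
    return math.ceil((packet_bytes + AAL5_TRAILER_BYTES) / CELL_PAYLOAD_BYTES)


def wire_efficiency(packet_bytes):
    """Fraction of transmitted bytes that belong to the original packet."""
    return packet_bytes / (aal5_cells(packet_bytes) * CELL_BYTES)


if __name__ == "__main__":
    for size in (40, 41, 576, 1500):
        cells = aal5_cells(size)
        print(f"{size:5d}-byte packet: {cells:3d} cells, "
              f"{wire_efficiency(size):.1%} efficient")
```

Note the cliff between 40 and 41 bytes: one extra byte of packet data spills into a second cell and roughly halves the efficiency.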
ATM was widely marketed in the 1990s as an all-in-one switched network medium
that could be used for LAN, WAN, and MAN needs. Today, ATM is mostly dead,
preserved only in WAN environments in which large telco corporations are still
trying to leverage their misguided investments in ATM hardware.
ATM switch vendors continue to aggressively market their products, and it is
possible to order an ATM circuit in many locales. However, it is probably a good idea to
consider technologies other than ATM for new network deployments.
14.6 FRAME RELAY: THE SACRIFICIAL WAN
Frame relay is a WAN technology that offers packet-switched data service, usually for
a reasonable cost. Although the claim is not 100% accurate, frame relay is often said
to be remarketed X.25, a scary packet-switched technology from the mid-1970s.
Fortunately, it's in such widespread use that the equipment, software, and staff that
support it have evolved to be robust and to perform well.
Traditionally, users who wished to connect to remote sites would purchase a
dedicated circuit from the phone company, such as a 56 Kb/s DDS line or a T1 line. These
are point-to-point data circuits that are connected 24 hours a day. Unfortunately, this
type of connection is often expensive since it requires that the phone company
dedicate equipment and bandwidth to the link.
In contrast, frame relay is an "economy of scale" approach. The phone company
creates a network (often referred to as a "cloud"[9]) that connects its central offices. Users
submit data in small packets for remote sites. The phone company switches the
packets through the appropriate central offices, ultimately delivering them to their
destinations. In this model, you and the phone company are gambling that at any
given second, the total amount of traffic won't exceed the bandwidth of the network
(a condition known euphemistically as "being oversubscribed").
A router encapsulates IP traffic over frame relay connections. Packets are switched
over invisible "permanent virtual circuits" (PVCs), which allow your packets to travel
only to the sites you've paid for them to reach. These PVCs afford some degree of
privacy protection from the other sites connected to the frame relay network.
The biggest advantage of frame relay is that it is usually inexpensive. But in the world
of "you get what you pay for," you may find that frame relay's performance is
sometimes poor. Frame relay connections have some packet switching overhead, and link
speed may degrade during periods of heavy use.
9. An all-too-appropriate name: it's never quite clear what the weather forecast will be in a frame relay
network. Storm? Rain? Sleet? Hail?
14.7 ISDN: THE INDIGENOUS WAN
Integrated Services Digital Network (ISDN) is a phone company offering that takes
many forms. In its most common and usable form, called Basic Rate Interface (BRI)
ISDN, it is essentially an all-digital phone line that provides two dial-up 64 Kb/s "B"
channels and a single 16 Kb/s signaling "D" channel. Each B channel can be used for
either voice or data (a voice line can be carried on a single 64 Kb/s channel).
ISDN offers a relatively high-speed digital line at a reasonable cost ($30-$150 per
month, depending on where you live). Devices called terminal adaptors convert the
phone line into a more familiar interface such as RS-232. They are used (and priced)
much like modems. Most adaptors can aggregate the two B channels, yielding a
128 Kb/s data channel.
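
The BRI channel arithmetic above is easy to verify. The Python sketch below adds up the channels and estimates transfer times; it assumes that 1 Kb/s means 1,000 bits per second and ignores protocol overhead, and the constant and function names are illustrative.

```python
B_CHANNEL_KBPS = 64   # each bearer ("B") channel
D_CHANNEL_KBPS = 16   # the signaling ("D") channel
B_CHANNELS = 2        # a BRI line provides two B channels


def bonded_data_rate_kbps():
    """Data rate when a terminal adaptor aggregates both B channels."""
    return B_CHANNELS * B_CHANNEL_KBPS


def total_line_rate_kbps():
    """All channels on the line, including signaling."""
    return bonded_data_rate_kbps() + D_CHANNEL_KBPS


def transfer_seconds(size_bytes, rate_kbps):
    """Idealized transfer time, assuming 1 Kb/s = 1,000 bits/s and no overhead."""
    return size_bytes * 8 / (rate_kbps * 1000)


if __name__ == "__main__":
    print("bonded data rate:", bonded_data_rate_kbps(), "Kb/s")
    print("total line rate: ", total_line_rate_kbps(), "Kb/s")
    print("1 MB over one B channel:", transfer_seconds(1_000_000, B_CHANNEL_KBPS), "s")
    print("1 MB over both:         ", transfer_seconds(1_000_000, bonded_data_rate_kbps()), "s")
```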
ISDN can be used in place of normal dial-up networking and also as a wide-area
technology that uses a router or bridge to connect remote sites across the line.
Although many U.S. phone companies have installed switches that are compatible
with ISDN, they still haven't figured out how to market or support them.[10] Only in a
few areas can you just call up the phone company and order an ISDN line. Some tips:
make sure you deal with the branch of the phone company that handles business
services, since that is how ISDN is usually classified. In many regions, you will have
to argue your way past several waves of drones before you reach someone who has
heard of ISDN before, even if the service really is available.
14.8 DSL AND CABLE MODEMS: THE PEOPLE'S WAN
It's easy to move large amounts of data among businesses and other large data
facilities. Carrier-provided technologies such as T1, T3, SONET, ATM, and frame relay
provide relatively simple conduits for moving bits from place to place. However,
these technologies are not realistic options for connecting individual houses and
home offices. They cost too much, and the infrastructure they require is not
universally available.
Digital Subscriber Line (DSL) uses ordinary copper telephone wire to transmit data
at speeds of up to 7 Mb/s (although typical DSL connections yield between 256 Kb/s
and 3 Mb/s). Since most homes already have existing telephone wiring, DSL is a
viable way to complete the "last mile" of connectivity from the telephone company
to the home. DSL connections are usually terminated in a box that acts as a TCP/IP
router and connects to other devices within the home over an Ethernet. DSL is
typically both cheaper and faster than ISDN, so it is now the preferred technology for
home users.
Unlike regular POTS (Plain Old Telephone Service) and ISDN connections, which
require you to "dial up" an endpoint, most DSL implementations supply a dedicated
service that is always connected. This feature makes DSL even more attractive
because there is no setup or connection delay when a user wants to transfer data.

10. Hence the interpretation: It Still Does Nothing.
DSL comes in several forms, and as a result it's often referred to as xDSL, with the x
representing a specific subtechnology such as A for asymmetric, S for symmetric, H
for high speed, RA for rate adaptive, and I for DSL-over-ISDN (useful for locations
too far from the central office to support faster forms of DSL). The exact technology
variants and data transfer speeds available in your area depend on the central office
equipment that your telephone company or carrier has chosen to deploy.
The race for "last mile" connectivity to hundreds of millions of homes is a hot one.
It's also highly politicized, well capitalized, and overpublicized. The DSL approach
leverages the copper infrastructure that is common among the Incumbent Local
Exchange Carriers (ILECs), who favored higher profit margins over investments in
infrastructure as the networking revolution of the 1980s and 90s passed them by.
Cable television companies, which already have fiber infrastructure in most
neighborhoods, are promoting their own "last mile" solutions, which yield similar (though
asymmetric) high-bandwidth connections to the home. The cable modem industry
has recently become enlightened about data standards and is currently promoting
the Data Over Cable Service Interface Specification (DOCSIS) standard. This
standard defines the technical specs for both the cable modems and the equipment used
at the cable company, and it allows various brands of equipment to interoperate.
All in all, the fight between cable modem and DSL technologies largely boils down
to "my marketing budget is bigger than yours." DSL has something of an inherent
advantage in that, in most cases, each connection is private to the particular
customer; cable modems in a neighborhood share bandwidth and can sometimes
eavesdrop on each other's traffic.
14.9 WHERE IS THE NETWORK GOING?
When you look closely at the technologies described above, you'll see one thing in
common: the simple, inexpensive ones are succeeding, whereas the complex and
expensive ones are dying quickly. Where does this put us down the road?
Ethernet has pummeled its rivals because it is incredibly inexpensive. It's so simple
to implement that today you can even buy microwave ovens with Ethernet interfaces.
Ethernet has scaled well: in many organizations, 10 Mb/s Ethernet infrastructure
from the early 1980s is still in production use, connected into 100 Mb/s and 1 Gb/s
segments. 10 Gb/s Ethernet over fiber is already available and we'll soon see it in
widespread use on copper cables. We expect to see this trend continue, with faster
and faster switching hardware to connect it all.
On the "connectivity to the home" front, DSL offers new life to the tired old Ma Bell
copper plant. The proliferation of cable modems has brought high-speed access (and
real security problems) within reach of millions of homes.
What's great about all of these new developments is that regardless of the medium or
its speed, TCP/IP is compatible with it.
14.10 NETWORK TESTING AND DEBUGGING
One major advantage of the large-scale migration to Ethernet (and other UTP-based
technologies) is the ease of network debugging. Since these networks can be analyzed
link by link, hardware problems can often be isolated in seconds rather than days.
The key to debugging a network is to break it down into its component parts and test
each piece until you've isolated the offending device or cable. The "idiot lights" on
switches and hubs (such as "link status" and "packet traffic") often hold immediate
clues to the source of the problem. Top-notch documentation of your wiring
scheme is essential for making these indicator lights work in your favor.
As with most tasks, having the right tools for the job is a big part of being able to get
the job done right and without delay. The market offers two major types of network
debugging tools (although they are quickly growing together).
The first is the hand-held cable analyzer. This device can measure the electrical
characteristics of a given cable, including its length (with a groovy technology called
"time domain reflectometry"). Usually, these analyzers can also point out simple
faults such as a broken or miswired cable. Our favorite product for LAN cable
analysis is the Fluke LanMeter. It's an all-in-one analyzer that can even perform IP pings
across the network. High-end versions have their own web server that can show you
historical statistics. For WAN (telco) circuits, the T-BERD line analyzer is the cat's
meow. The T-BERD and its high-end LAN-testing companion, the FIREBERD
series, are made by Acterna (www.acterna.com).
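As a rough illustration of how time domain reflectometry turns a timing measurement into a cable length: the analyzer sends a pulse, times the reflection from the far end, and multiplies half of that round trip by the signal's propagation speed. The sketch below is not taken from any analyzer's firmware. The function name is hypothetical, and the default velocity factor of 0.69 is an assumed, typical figure for UTP cable rather than a value given in this chapter; real instruments let you set it per cable type.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458

def cable_length_m(round_trip_ns, velocity_factor=0.69):
    """Estimate cable length in meters from a TDR round-trip time.

    round_trip_ns   -- time for the pulse to reach the far end and return
    velocity_factor -- signal speed as a fraction of c (assumed value)
    """
    one_way_seconds = (round_trip_ns * 1e-9) / 2
    return one_way_seconds * velocity_factor * SPEED_OF_LIGHT_M_PER_S

# A reflection arriving after roughly 967 ns suggests a run of about 100 m.
print(round(cable_length_m(967), 1))
```

A reflection that arrives much earlier than the expected length implies usually points to a break or a bad termination partway along the run.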
The second type of debugging tool is the network sniffer. This device disassembles
network packets to look for protocol errors, misconfigurations, and general snafus.
Commercial sniffers are available, but we find that the freely available program
Wireshark (www.wireshark.org) running on a fat laptop is usually the best option.
11
14.11 BUILDING WIRING
Whether you're running gigabit Ethernet or just serial cables, we recommend that
you use the highest possible quality of wire. It will increase the chances that you can
still use the same wire ten years down the road. It's cheapest to wire an entire
building at once rather than to wire it one connection at a time.
UTP cabling options
Category 5E wire typically offers the best price vs. performance tradeoff in today's
market. Its normal format is four pairs per sheath, which is just right for a variety of
data connections from RS-232 to gigabit Ethernet.
11. Like so many popular programs, Wireshark is often the subject of attack by hackers. Make sure you
stay up to date with the current version.
Category 5E specifications require that the twist be maintained to within half an inch
of the connection to the punchdown block. This implies that any wire with more than
four pairs per sheath will have to be taped or secured to maintain the twist, since it
feeds more than one connection.
You must use Category 5E termination parts in addition to Category 5E wire. We've
had the best luck using parts manufactured by Siemon.
Connections to offices
One connection per office is clearly not enough. But should you use two or four? We
recommend four, for several reasons:
· They can be used for serial connections (modem, printer, etc.).
· They can be used with voice telephones.
· They can be used to accommodate visitors or demo machines.
· The cost of the materials is typically only 5%-10% of the total cost.
· Your best guess doubled is often a good estimate.
· It's much cheaper to do it once rather than adding wires later.
· When ports run low, people add 10 Mb/s hubs purchased from the nearest
  office supply store, then complain to the help desk about connection speed.
If you're in the process of wiring your entire building, you might consider installing
a few outlets in the hallways, conference rooms, lunch rooms, bathrooms, and of
course, ceilings for wireless access points. Don't forget to keep security in mind,
however, and place publicly accessible ports on a "guest" VLAN that doesn't have
access to your internal network resources.
Wiring standards
Modern buildings often require a large and complex wiring infrastructure to support
all the various activities that take place inside. Walking into the average
telecommunications closet is usually a shocking experience for the weak of stomach, as
identically colored, unlabeled wires often cover the walls.
In an effort to increase traceability and standardize building wiring, the
Telecommunications Industry Association released the TIA/EIA-606 Administration
Standard for the telecommunication infrastructure of commercial buildings in February,
1993. EIA-606 specifies requirements and guidelines for the identification and
documentation of telecommunications infrastructure.
Items covered by EIA-606 include
· Termination hardware
· Cables
· Cable pathways
· Equipment spaces
· Infrastructure color coding
· Symbols for standard components
In particular, it specifies standard colors to be used for wiring. The occult details are
revealed in Table 14.5.
Pantone now sells software to map between the Pantone systems for ink-on-paper,
textile dyes, and colored plastic. Hey, you could color-coordinate the wiring, the
uniforms of the installers, and the wiring documentation! On second thought...
14.12 NETWORK DESIGN ISSUES
This section addresses the logical and physical design of the network. It's targeted at
medium-sized installations. The ideas presented here will scale up to a few hundred
hosts but are overkill for three machines and inadequate for thousands. We also
assume that you have an adequate budget and are starting from scratch, which is
probably only partially true.
Most of network design consists of the specification of
· The types of media that will be used
· The topology and routing of cables
· The use of switches and routers
Another key issue in network design is congestion control. For example, file sharing
protocols such as NFS and CIFS tax the network quite heavily, and so file serving on
a backbone cable is undesirable.
The issues presented in the following sections are typical of those that must be
considered in any network design.
Network architecture vs. building architecture
The network architecture is usually more flexible than the building architecture, but
the two must coexist. If you are lucky enough to be able to specify the network before
the building is constructed, be lavish. For most of us, both the building and a
facilities management department already exist and are somewhat rigid.

Table 14.5 EIA-606 color chart

Termination type         Color    Code (a)  Comments
Demarcation point        Orange   150C      Central office terminations
Network connections      Green    353C      Also used for aux circuit terminations
Common equipment (b)     Purple   264C      Major switching/data eqpt. terminations
First-level backbone     White    -         Cable terminations
Second-level backbone    Gray     422C      Cable terminations
Station                  Blue     291C      Horizontal cable terminations
Interbuilding backbone   Brown    465C      Campus cable terminations
Miscellaneous            Yellow   101C      Maintenance, alarms, etc.
Key telephone systems    Red      184C      -

a. According to the Pantone Matching System
b. PBXes, hosts, LANs, muxes, etc.

In existing buildings, the network must use the building architecture, not fight it.
Modern buildings often contain utility raceways for data and telephone cables in
addition to high-voltage electrical wiring and water or gas pipes. They often use drop
ceilings, a boon to network installers. Many campuses and organizations have
underground utility tunnels that facilitate network installation.
The integrity of fire walls
12
must be maintained; if you route a cable through a fire
wall, the hole must be snug and filled in with a noncombustible substance. Respect
return air plenums in your choice of cable. If you are caught violating fire codes, you
may be fined and will be required to fix the problems you have created, even if that
means tearing down the entire network and rebuilding it correctly.
Your network's logical design must fit into the physical constraints of the buildings it
serves. As you specify the network, keep in mind that it is easy to draw a logically
good solution and then find that it is physically difficult or impossible to implement.
Existing networks
Computer networks are the focus of this discussion, but many organizations already
have CATV networks and telephone networks capable of transmitting data. Often,
these include fiber links. If your organization is ready to install a new telephone
system, buy lots of extra fiber and have it installed at the same time.
We had that opportunity several years ago and asked the contractors if they would
string some fiber for us. They said, "Sure, no charge" and were a bit miffed when we
showed up with a truckload of fiber for them to install.
Expansion
It is very difficult to predict needs ten years into the future, especially in the
computer and networking fields. It is important, therefore, to design the network with
expansion and increased bandwidth in mind. As cable is being installed, especially
in out-of-the-way, hard-to-reach places, pull three to four times the number of pairs
you actually need. Remember: the majority of installation cost is labor, not materials.
Even if you have no plans to use fiber, it's wise to install some when wiring your
building, especially if it is hard to install cables later. Run both multimode and
single-mode fiber; the kind you need in the future is always the kind you didn't install.
Congestion
A network is like a chain: it is only as good as its weakest or slowest link. The
performance of Ethernet, like that of many other network architectures, degrades
nonlinearly as the network gets loaded.
12. This type of fire wall is a concrete, brick, or flame-retardant wall that prevents flames from spreading
and burning down a building. While much different from a network security firewall, it's probably just
as important.
Overtaxed switches, mismatched interfaces, and low-speed links can all lead to
congestion. It is helpful to isolate local traffic by creating subnets and by using
interconnection devices such as routers. Subnets can also be used to cordon off machines
that are used for experimentation. It's difficult to run an experiment that involves
several machines if there is no easy way to isolate those machines both physically
and logically from the rest of the network.
Maintenance and documentation
We have found that the maintainability of a network correlates highly with the
quality of its documentation. Accurate, complete, up-to-date documentation is absolutely
indispensable.
Cables should be labeled at all termination points and also every few feet so that they
can easily be identified when discovered in a ceiling or wall.
13
It's a good idea to post
copies of local cable maps inside communications closets so that the maps can be
updated on the spot when changes are made. Once every few weeks, someone should
copy down the changes for entry into a wiring database.
Joints between major population centers in the form of switches or routers can
facilitate debugging by allowing parts of the network to be isolated and debugged
separately. It's also helpful to put joints between political and administrative domains.
14.13 MANAGEMENT ISSUES
If the network is to work correctly, some things need to be centralized, some
distributed, and some local. Reasonable ground rules and "good citizen" guidelines need
to be formulated and agreed on.
A typical environment includes
· A backbone network among buildings
· Departmental subnets connected to the backbone
· Group subnets within a department
· Connections to the outside world (e.g., Internet or field offices)
Several facets of network design and implementation must have site-wide control,
responsibility, maintenance, and financing. Networks with charge-back algorithms
for each connection grow in very bizarre but predictable ways as departments try to
minimize their own local costs. Prime targets for central control are
· The network design, including the use of subnets, routers, switches, etc.
· The backbone network itself, including the connections to it
· Host IP addresses, hostnames, and subdomain names
· Protocols, mostly to ensure that they interoperate
· Routing policy to the Internet
13. Some cable manufacturers will prelabel spools of cable every few feet for you.
Domain names, IP addresses, and network names are in some sense already
controlled centrally by authorities such as ARIN and ICANN. However, your site's use of
these items must be coordinated locally as well.
A central authority has an overall view of the network: its design, capacity, and
expected growth. It can afford to own monitoring equipment (and the staff to run it)
and to keep the backbone network healthy. It can insist on correct network design,
even when that means telling a department to buy a router and build a subnet to
connect to the campus backbone network. Such a decision might be necessary so
that a new connection does not adversely impact the existing network.
If a network serves many types of machines, operating systems, and protocols, it is
almost essential to have a very smart router (e.g., Cisco) as a gateway between nets.
14.14 RECOMMENDED VENDORS
In the past 15+ years of installing networks around the world, we've gotten burned
more than a few times by products that didn't quite meet specs or were
misrepresented, overpriced, or otherwise failed to meet expectations. Below is a list of
vendors in the United States that we still trust, recommend, and use ourselves today.
Cables and connectors

AMP (now part of Tyco)      Black Box Corporation
(800) 522-6752              (724) 746-5500
www.amp.com                 www.blackbox.com

Anixter                     Newark Electronics
(800) 264-9837              (800) 463-9275
www.anixter.com             www.newark.com

Belden Cable                Siemon
(800) 235-3361              (860) 945-4395
(765) 983-5200              www.siemon.com
www.belden.com
Test equipment

Fluke                       Acterna
(800) 443-5853              (866) 228-3762
www.fluke.com               www.acterna.com

Siemon
(860) 945-4395
www.siemon.com
Routers/switches

Cisco Systems
(415) 326-1941
www.cisco.com
14.15 RECOMMENDED READING
BARNETT, DAVID, DAVID GROTH, AND JIM MCBEE. Cabling: The Complete Guide to
Network Wiring (3rd Edition). San Francisco: Sybex, 2004.
SEIFERT, RICH. Gigabit Ethernet: Technology and Applications for High Speed LANs.
Reading, MA: Addison-Wesley, 1998.
ANSI/TIA/EIA-568-A, Commercial Building Telecommunications Cabling Standard,
and ANSI/TIA/EIA-606, Administration Standard for the Telecommunications
Infrastructure of Commercial Buildings, are the telecommunication industry's standards
for building wiring. Unfortunately, they are not free. See www.tiaonline.org.
SPURGEON, CHARLES. "Guide to Ethernet." www.ethermanage.com/ethernet
14.16 EXERCISES
E14.1 Today, most office buildings house computer networks and are wired
with UTP Ethernet. Some combination of hubs and switches is needed
to support these networks. In many cases, the two types of equipment
are interchangeable. List the advantages and disadvantages of each.
E14.2 Draw a simple, imaginary network diagram that connects a machine in
your computer lab to Amazon.com. Include LAN, MAN, and WAN
components. Show what technology is used for each component. Show
some hubs, switches, and routers.
E14.3 Research WPA's Temporal Key Integrity Protocol. Detail what advantages
this has over WEP, and what types of attacks it prevents.
E14.4 TTCP is a tool that measures TCP and UDP performance (look for it at
www.rpmfind.net). Install TTCP on two networked machines and
measure the performance of the link between them. What happens to the
bandwidth if you adjust buffer sizes up or down? How do your observed
numbers compare with the theoretical capacity of the physical medium?
DNS: The Domain Name System
Zillions of hosts are connected to the Internet. How do we keep track of them all when
they belong to so many different countries, networks, and administrative groups?
Two key pieces of infrastructure hold everything together: the Domain Name System
(DNS), which keeps track of who the hosts are, and the Internet routing system,
which keeps track of how they are connected.
This chapter (mini-book, some would say) is about DNS. Although DNS has come
to serve several different purposes, its primary job is to map between hostnames
and IP addresses. Users and user-level programs like to refer to machines by name,
but low-level network software understands only numbers. DNS provides the glue
that keeps everyone happy. It has also come to play an important role in the routing
of email, web server access, and many other services.
DNS is a distributed database. "Distributed" means that my site stores the data about
my computers, your site stores the data about your computers, and somehow, our
sites automatically cooperate and share data when one site needs to look up some of
the other's data.
Our DNS coverage can be divided into three major sections:
· The DNS system in general: its history, data, and protocols
· BIND, a specific implementation of the DNS system
· The operation and maintenance of BIND servers, including related topics
  such as security
If you need to set up or maintain your site's DNS servers and already have a general
idea how DNS works, feel free to skip ahead.
Before we start in on the general background of DNS, let's first take a brief detour to
address everyone's most frequently asked question: how do I add a new machine to a
network that's already using BIND? What follows is a cookbook-style recipe that does
not define or explain any terminology and that probably does not fit exactly with
your local sysadmin policies and procedures. Use with caution.
15.1 DNS FOR THE IMPATIENT: ADDING A NEW MACHINE
If your network is set up to use the Dynamic Host Configuration Protocol (DHCP)
you may not need to perform any manual configuration for DNS. When a new
computer is connected, the DHCP server informs it of the DNS servers it should use for
queries. Hostname-to-IP-address mappings for use by the outside world were most
likely set up when the DHCP server was configured and are automatically entered
through DNS's dynamic update facility.
For networks that do not use DHCP, the following recipe shows how to update the
DNS configuration by copying and modifying the records for a similar computer.
Step 1: Choose a hostname and IP address for the new machine in conjunction with
local sysadmins or your upstream ISP.
Step 2: Identify a similar machine on the same subnet. You'll use that machine's
records as a model for the new ones. In this example, we'll use a machine called
templatehost.my.domain as the model.
Step 3: Log in to the master name server machine.
Step 4: Look through the name server configuration file, usually /etc/named.conf:
· Within the options statement, find the directory line that tells where zone
  data files are kept at your site (see page 424). The zone files contain the
  actual host and IP address data.
· From the zone statements, find the filenames for the forward zone file and
  reverse zone file appropriate for your new IP address (page 432).
· Verify from the zone statements that this server is in fact the master server
  for the domain. The forward zone statement should look like this:
    zone "my.domain" {
        type master;
        ...
Step 5: Go to the zone file directory and edit the forward zone file. Find the records
for the template host you identified earlier. They'll look something like this:
    templatehost    IN  A   128.138.243.100
                    IN  MX  10 mail-hub
                    IN  MX  20 templatehost
Your version might not include the MX lines, which are used for mail routing.
Step 6: Duplicate those records and change them appropriately for your new host.
The zone file might be sorted by hostname; follow the existing convention. Be sure
to also change the serial number in the SOA record at the beginning of the file (it's
the first of the five numbers in the SOA record). The serial number should only
increase; add 1 if your site uses an arbitrary serial number, or set the field to the
current date if your site uses that convention.
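The serial number rule in Step 6 can be sketched as a small helper. This is only an illustration of the "must only increase" rule, not part of BIND. The function name is hypothetical, and the date-based branch assumes the widely used YYYYMMDDnn layout (date plus a two-digit revision counter), which your site may or may not follow.

```python
from datetime import date

def next_serial(current, today=None, date_based=False):
    """Return a new SOA serial that is strictly greater than `current`."""
    if not date_based:
        # Arbitrary-serial convention: just add 1.
        return current + 1
    today = today or date.today()
    # Date convention (assumed layout): YYYYMMDD followed by revision 00.
    candidate = int(today.strftime("%Y%m%d")) * 100
    # If the zone was already edited today, bump the revision instead
    # of reusing (or going below) the existing serial.
    return max(candidate, current + 1)

print(next_serial(41))
print(next_serial(2006010100, today=date(2006, 5, 4), date_based=True))
```

Taking the maximum of the two candidates keeps the serial from going backward, which matters because slave servers ignore a zone whose serial has not increased.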
Step 7: Edit the reverse zone file,
1
duplicate the record for the template host, and
update it. It should look something like this:
    100    IN  PTR  templatehost.my.domain.
Note that there is a trailing dot after the hostname; don't omit it. You must also
update the serial number in the SOA record of the reverse zone file.
If your reverse zone file shows more than just the last byte of each host's IP address,
you must enter the bytes in reverse order. For example, the record
    100.243    IN  PTR  templatehost.my.domain.
corresponds to the IP address 128.138.243.100 (here, the reverse zone is relative to
138.128.in-addr.arpa rather than 243.138.128.in-addr.arpa).
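The byte-reversal rule above can be checked mechanically. The sketch below uses Python's standard ipaddress module to build the full in-addr.arpa name and then trims it to the part that belongs in the zone file. The helper name and its zone_octets parameter (how many leading address bytes the reverse zone itself covers) are illustrative assumptions, and the sketch handles IPv4 only.

```python
import ipaddress

def ptr_owner_name(ip, zone_octets):
    """Return the left-hand side of a PTR record for `ip`.

    zone_octets is the number of leading address bytes covered by the
    reverse zone: 3 for 243.138.128.in-addr.arpa, 2 for
    138.128.in-addr.arpa.
    """
    full = ipaddress.IPv4Address(ip).reverse_pointer
    labels = full.split(".")[:4]          # the four reversed address bytes
    return ".".join(labels[: 4 - zone_octets])

print(ipaddress.IPv4Address("128.138.243.100").reverse_pointer)
print(ptr_owner_name("128.138.243.100", zone_octets=3))
print(ptr_owner_name("128.138.243.100", zone_octets=2))
```

With zone_octets=3 the owner name is just the last byte; with zone_octets=2 it is the last two bytes in reverse order, matching the example record in the text.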
Step 8: While still logged in to the master name server machine, run rndc reload, or
if it's a busy server, just reload the domains (or views) that you changed:
    # rndc reload forward-zone-name
    # rndc reload reverse-zone-name
Step 9: Check the configuration with dig; see page 473. You can also try to ping or
traceroute to your new host's name, even if the new host has not yet been set up. A
"host unknown" message means you goofed; "host not responding" means that
everything is probably OK.
The most common errors are
· Forgetting to update the serial number and reload the name server, and
· Forgetting to add a dot at the end of the hostname in the PTR reverse entry.
15.2 THE HISTORY OF DNS
DNS was formally specified by Paul Mockapetris in RFCs 882 and 883 (1983) and
updated in RFCs 1034 and 1035 (1987). It contained two key concepts: hierarchical
hostnames and distributed responsibility.
1. The reverse zone might be maintained elsewhere (e.g., at your ISP's site). If so, the reverse entry will
have to be entered there.
BIND implementations
The original UNIX implementation was done by four graduate students at Berkeley
(Douglas Terry, Mark Painter, David Riggle, and Songnian Zhou) in 1984. It was then
added to the Berkeley UNIX distribution by Kevin Dunlap in the mid-1980s and
became known as BIND, the Berkeley Internet Name Domain system. Paul Vixie
and ISC, the Internet Systems Consortium (www.isc.org, known as the Internet
Software Consortium before 2004) currently maintain BIND. It is an open source
project. In 2000 and 2001, ISC developed a totally new version of BIND, BIND 9,
with funding from several vendors, government agencies, and other organizations.
ISC also provides various types of support for these products, including help with
configuration, classes on BIND and DNS, and even custom programming. These
services are a boon for sites that must have a support contract before they can use
open source software. Several companies use service contracts as a way to
contribute to the ISC: they buy expensive contracts but never call for help.
Thanks to a port by Nortel, BIND is available for Windows as well as UNIX/Linux.
Since the DNS protocol is standardized, UNIX and non-UNIX DNS implementations
can all interoperate and share data. Many sites run UNIX servers to provide
DNS service to their Windows desktops; this combination works well.
RFCs 1034 and 1035 are still considered the baseline specification for DNS, but more
than 40 other RFCs have superseded and elaborated on various aspects of the
protocol and data records over the last decade (see the list at the end of this chapter).
Currently, no single standard or RFC brings all the pieces together in one place.
Historically, DNS has more or less been defined as "what BIND implements," although
this is becoming less accurate as other DNS servers emerge.
Other implementations of DNS
In the beginning, BIND was the only DNS implementation in widespread use. Today
there are several, both open source and commercial. Many do not implement all the
specifications defined by the many DNS RFCs that are winding their way through
the standardization process. Table 15.1 lists the more popular DNS implementations
and shows where to go for more information.
Table 15.1 Some popular implementations of DNS

Name            Author          Source             Comments
BIND            ISC             isc.org            Authoritative or caching
NSD             NLnet Labs      www.nlnetlabs.nl   Authoritative only
PowerDNS        PowerDNS BV     www.powerdns.com   Authoritative only
djbdns (a)      Dan Bernstein   tinydns.org        Violates some RFCs
Microsoft DNS   Microsoft       microsoft.com      Guilty of a myriad of sins
ANS, CNS        Nominum         www.nominum.com    Authoritative or caching

a. Also known as tinydns, which is the server component of the djbdns package
ISC's ongoing domain survey keeps track of the various DNS implementations and
the number of name servers using each. To see the current population demographics,
go to www.isc.org, click ISC Internet Domain Survey, click Latest Survey Results,
and finally click Domain Server Software Distribution. DNS appliances such as
Infoblox (www.infoblox.com) are used by some large sites but do not yet show up in
the survey's fingerprinting.
In this book we discuss only BIND, which is considered the reference implementation
of DNS. It is by far the most widely used implementation and is an appropriate
choice for most sites. BIND tracks all the standards and proposed standards of the
IETF, often implementing features before their specifications are complete. This is
good because some standards-track features turn out to be flawed; their inclusion
in BIND allows problems to be recognized and fixed before being written into "law."
NSD, the name server daemon, was developed in 2003 by NLnet Labs in Amsterdam.
It provides a fast, secure, authoritative name server appropriate for use by root and
top-level domain servers. (An authoritative server is appropriate for providing the
answers to queries about hosts in your domain, but it cannot answer users' queries
about other domains.)
PowerDNS is an open source authoritative name server that provides a uniquely
flexible back-end system. The DNS data can come from files or from a long list of
other sources: MySQL, Oracle (8i and 9i), IBM's DB2, PostgreSQL, Microsoft's SQL
Server, LDAP, ODBC, XDB, or even a UNIX pipe.
djbdns is an alternative name server package that consists of an authoritative server
called tinydns and a caching server called dnscache. It claims to be secure and very
fast, although some of the measurement data we are aware of is inconclusive. Its
main drawback is that it violates the DNS standards frequently and intentionally,
making interoperation with other DNS servers difficult.
Microsoft provides a DNS server for Windows, but the Microsoft implementation
has its own special quirks and differences. It interoperates with BIND but also tends
to clutter the net with unnecessary and malformed packets.
Nominum, the contractor that wrote the initial version of BIND 9 for ISC, sells its
own name servers and network management tools. The Nominum servers are
blindingly fast and include most of the currently proposed standards.
15.3 WHO NEEDS DNS?
DNS defines
· A hierarchical namespace for hosts and IP addresses
· A distributed database of hostname and address information
· A "resolver" to query this database
· Improved routing for email
· A mechanism for finding services on a network
· A protocol for exchanging naming information
To be full citizens of the Internet, sites need DNS. Maintaining a local /etc/hosts file
with mappings for every host you might ever want to contact is not possible.
Each site maintains one or more pieces of the distributed database that makes up the
world-wide DNS system. Your piece of the database consists of text files that contain
records for each of your hosts. Each record is a single line consisting of a name
(usually a hostname), a record type, and some data values. The name field can be
omitted if its value is the same as that of the previous line.
For example, the lines
    bark    IN  A   206.168.198.209
            IN  MX  10 mailserver.atrust.com.
in the "forward" file, and
    209     IN  PTR bark.atrust.com.
in the "reverse" file associate "bark.atrust.com" with the IP address 206.168.198.209
and reroute email addressed to this machine to the host mailserver.atrust.com.
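The record layout just described (name, record type, data, with the name inherited from the previous line when omitted) can be illustrated with a few lines of Python. This is a deliberately simplified sketch, not a real zone file parser: it assumes every line carries the IN class and ignores comments, TTLs, $-directives, and multi-line records. The function name is hypothetical.

```python
def parse_records(text):
    """Parse simple 'name IN type data' lines into (name, type, data) tuples.

    A line that begins with whitespace has no name field and inherits
    the name of the previous record.
    """
    records = []
    last_name = None
    for line in text.splitlines():
        if not line.strip():
            continue
        fields = line.split()
        if line[0].isspace():
            name = last_name          # name omitted: reuse the previous one
        else:
            name = fields.pop(0)
            last_name = name
        _rclass, rtype, *data = fields
        records.append((name, rtype, " ".join(data)))
    return records

zone = """bark    IN  A   206.168.198.209
        IN  MX  10 mailserver.atrust.com.
"""
for record in parse_records(zone):
    print(record)
```

Both tuples come back with the name "bark", which is the inheritance rule at work: the MX line has no name of its own.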
DNS is a client/server system. Servers ("name servers") load the data from your DNS
files into memory and use it to answer queries both from internal clients and from
clients and other servers out on the Internet. All of your hosts should be DNS clients,
but relatively few need to be DNS servers.
If your organization is small (a few hosts on a single network), you can run a server
on one host or ask your ISP to supply DNS service on your behalf. A medium-sized
site with several subnets should run multiple DNS servers to reduce query latency
and improve reliability. A very large site can divide its DNS domain into subdomains
and run several servers for each subdomain.
15.4 THE DNS NAMESPACE
The DNS namespace is organized into what mathematicians call a tree; each domain
name corresponds to a node in the tree. One branch of the DNS naming tree maps
hostnames to IP addresses, and a second branch maps IP addresses back to
hostnames. The former branch is called the "forward mapping," and the BIND data files
associated with it are called "forward zone files." The address-to-hostname branch
is the "reverse mapping," and its data files are called "reverse zone files." Sadly, many
sites do not maintain their reverse mappings.
Each domain represents a distinct chunk of the namespace and is loosely managed
by a single administrative entity. The root of the tree is called "." or dot, and beneath
it are the top-level (or root-level) domains.
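To make the tree structure concrete, the sketch below lists the nodes visited when walking from the root down to a given name. It is only an illustration of how a dotted name encodes a path through the tree; the function name is hypothetical, and the example hostname is borrowed from earlier in the chapter.

```python
def path_from_root(domain):
    """Return the tree nodes from the root (".") down to `domain`.

    Each node is written as a fully qualified name with a trailing dot.
    """
    stripped = domain.rstrip(".")
    if not stripped:
        return ["."]                  # the root itself
    labels = stripped.split(".")
    path = ["."]
    # The rightmost label is closest to the root, so build names
    # from the right-hand end of the list toward the left.
    for i in range(len(labels) - 1, -1, -1):
        path.append(".".join(labels[i:]) + ".")
    return path

print(path_from_root("bark.atrust.com"))
```

Reading a domain name right to left therefore corresponds to descending the tree one level per label.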
For historical reasons, two types of top-level domain names are in current use. In the
United States, top-level domains originally described organizational and political
structure and were given three-letter names such as com and edu. Some of these
domains (primarily com, org, and net) are used outside the United States as well; they
are called the generic top-level domains or gTLDs for short.
The top-level domains were relatively fixed in the past, but ICANN approved seven
new ones in late 2000: biz, info, name, pro, museum, aero, and coop.
2
More recently,
"jobs" was added to the gTLD list. These domains are now operational and available
for use. The biz, info, and name domains are called "unsponsored" gTLDs and are
open to anyone; museum, aero, jobs, pro, and coop are "sponsored" TLDs that are
limited to specific types of registrants.
Table 15.2 lists the most important gTLDs along with their original purposes. When
good names in the com domain became scarce, the registries began to offer names
in org and net without regard to those domains' original restrictions. The domains
in the left column of Table 15.2 are the originals, dating from about 1988; the right
column includes the new domains added since 2001.
Foi most domains outside the United States, two-lettei ISO countiy codes aie used.
These domains aie known as ccTIIs, oi "countiy code top-level domains." Both the
geogiaphical and the oiganizational TIIs coexist within the same global
namespace. Table 15.? shows some common countiy codes.
Some countiies outside the United States build an oiganizational hieiaichy with sec-
ond-level domains. Naming conventions vaiy. Foi example, an academic institution
might be in edu in the United States and in ac.jp in Japan.
2. ICANN is the Internet Corporation for Assigned Names and Numbers. See page 273 for more
information about ICANN.
Table 15.2 Generic top-level domains
Domain  What it's for                   Domain  What it's for
com     Commercial companies            aero    Air transport industry
edu     U.S. educational institutions   biz     Businesses
gov     U.S. Government agencies        coop    Cooperatives
mil     U.S. military agencies          info    Unrestricted use
net     Network providers               jobs    Human resources folks
org     Nonprofit organizations         museum  Museums
int     International organizations     name    Individuals
arpa    Anchor for IP address tree      pro     Accountants, lawyers, etc.
Table 15.3 Common country codes
Code  Country     Code  Country   Code  Country
au    Australia   fi    Finland   hk    Hong Kong
ca    Canada      fr    France    ch    Switzerland
br    Brazil      jp    Japan     mx    Mexico
de    Germany     se    Sweden    hu    Hungary
The top-level domain "us" is also sometimes used in the United States, primarily
with locality domains; for example, bvsd.k12.co.us, the Boulder Valley School
District in Colorado. The "us" domain is never combined with an organizational
domain; there is no "edu.us" (yet). The advantage of "us" domain names is that they
are inexpensive to register; see www.nic.us for more details. The restrictions on
second-level domains beneath "us" (which were formerly limited to U.S. states) have
been relaxed, and domain names like evi-nemeth.us are possible.
Domain mercenaries have in some cases bought an entire country's namespace. For
example, the domain for Moldova, "md", is now being marketed to doctors and
residents of the state of Maryland (MD) in the United States. Another example is Tuvalu,
for which the country code is "tv". The first such sale was Tonga ("to"), the most
active is currently Niue ("nu"), and perhaps the most attractive is "tm" from
Turkmenistan. These deals have sometimes been fair to the country with the desirable
two-letter code and sometimes not.
Domain squatting is also widely practiced: folks register names they think will be
requested in the future and then resell them to the businesses whose names they
have snitched. Years ago, domain names for all the Colorado ski areas were
registered to the same individual, who made quite a bit of money reselling them to
individual ski areas as they became web-aware.
The going rate for a good name in the com domain is between several thousand and
a few million dollars. We were offered $50,000 for the name admin.com, which we
obtained years ago when sysadmin.com had already been taken by Sys Admin
magazine. The highest price so far (or at least, the highest on public record) was the $7.5
million paid for business.com during the heyday of the tech stock boom. $20,000 to
$100,000 is a more common range these days, but multimillion dollar sales are still
occurring, an example being the July 2004 sale of CreditCards.com for $2.75 million.
Internet entrepreneur Dan Parisi was expected to receive several million dollars for
former porn site whitehouse.com, which was placed on the block in early 2004. A
series of different businesses have used the name since then, but the exact terms and
financial details were never made public.
Currently, valid domain names consist only of letters, numbers, and dashes. Each
component of a name can be no longer than 63 characters, and names must be
shorter than 256 characters overall. Internationalization of the DNS system and
support for non-ASCII character sets will eventually change all the naming rules, but
for now names from other character sets are mapped back to ASCII; see page 388.
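The naming rules above are simple enough to check mechanically. The sketch below is one possible validator in Python; the function name is_valid_domain_name is hypothetical, and the code enforces only the rules stated in this section (allowed characters, 63-character components, fewer than 256 characters overall), not every restriction a registry might add.

```python
import string

# Letters, numbers, and dashes are the only characters allowed in a component.
ALLOWED = set(string.ascii_letters + string.digits + "-")
MAX_COMPONENT = 63   # longest permitted component
MAX_TOTAL = 255      # names must be shorter than 256 characters


def is_valid_domain_name(name: str) -> bool:
    """Check a domain name against the length and character rules above."""
    if name.endswith("."):
        name = name[:-1]          # ignore the final dot of a fully qualified name
    if not name or len(name) > MAX_TOTAL:
        return False
    for component in name.split("."):
        if not 1 <= len(component) <= MAX_COMPONENT:
            return False          # empty or overlong component
        if not set(component) <= ALLOWED:
            return False          # character outside letters, digits, dash
    return True


if __name__ == "__main__":
    for candidate in ("evi-nemeth.us", "under_score.com", "a" * 64 + ".com"):
        print(candidate[:20], is_valid_domain_name(candidate))
```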
Domain names are case insensitive. "Colorado" is the same as "colorado", which is
the same as "COLORADO" as far as DNS is concerned. Current DNS implementations
must ignore case when making comparisons but propagate case when it is
supplied. In the past it was common to use capital letters for top-level domains and an
initial capital for second-level domains. These days, fingers are weary from typing
and all-lowercase is the norm.³
An Internet host's fully qualified name is formed by appending its domain name to
its hostname. For example, boulder.colorado.edu is the fully qualified name for the
host boulder at the University of Colorado. Other sites can use the hostname
boulder without colliding, because the fully qualified names will be different.
Within the DNS system, fully qualified names are terminated by a dot; for example,
"boulder.colorado.edu.". The lack of a final dot may indicate a relative address.
Depending on the context in which a relative address is used, additional components
might be added. The final dot convention is generally hidden from everyday users of
DNS. In fact, some systems (such as mail) will break if you supply the dot yourself.
It's common for a host to have more than one name. The host boulder.colorado.edu
could also be known as www.colorado.edu or ftp.colorado.edu if we wanted to make
its name reflect the services it provides. In fact, it's a good practice to make service
hostnames such as www be "mobile," so that you can move servers from one
machine to another without changing any machine's primary name. You can assign
extra names by using the CNAME construct; see page 399.
When we were issued the name colorado.edu, we were guaranteed that colorado was
unique within the edu domain. We have further divided that domain into
subdomains along department lines. For example, the host anchor in the computer science
department is called anchor.cs.colorado.edu on the Internet.
The creation of each new subdomain must be coordinated with the administrators
of the domain above to guarantee uniqueness. Entries in the configuration files for
the parent domain delegate authority for the namespace to the subdomain.
Masters of their domains
Management of the top-level domains com, org, net, and edu was formerly
coordinated by Network Solutions, Inc., under contract with the National Science
Foundation. This monopoly situation has now changed, and other organizations are allowed
to register domain names in those gTLDs. Other top-level domains, such as those for
individual countries, are maintained by regional organizations.
There have been various proposals to allow private companies to operate their own
top-level domains, and it is likely that additional top-level domains will be available
in the near future. Consult www.icann.org for up-to-date information.
Most ISPs offer fee-based domain name registration services. They deal with the
top-level domain authority on your behalf and configure their DNS servers to
handle name lookups within your domain. The disadvantage of relying on an ISP's
servers is that you lose direct control over the administration of your domain.
3. BIND preserves case, but some implementations (e.g., Microsoft's and djbdns) change case according to
their own preference. So much for tight standards.
See page 287 for more information about CIDR.
To manage your own DNS services, you must still coordinate with your ISP. Most
ISPs supply reverse DNS mappings for IP addresses within their CIDR blocks. If you
take over DNS management of your addresses, make sure that your ISP disables its
service for those addresses and delegates that responsibility to you.
A domain's forward and reverse mappings should be managed in the same place
whenever possible. Some ISPs are happy to let you manage the forward files but are
reluctant to relinquish control of the reverse mappings. Such split management can
lead to synchronization problems. See page 400 for an elegant (?) hack that makes
delegation work even for tiny pieces of address space.
DNS domains should (must, in fact; see RFC1219) be served by at least two servers.
One common arrangement is for a site to operate its own master server and to let the
ISP's servers act as slaves. Once the system has been configured, the ISP's servers
automatically download host data from the master server. Changes made to the DNS
configuration are reflected on the slave servers without any explicit work on the part
of either site's administrator.
Don't put all of your DNS servers on the same network. When DNS stops working,
the network effectively stops for your users. Spread your DNS servers around so that
you don't end up with a fragile system with a single point of failure. DNS is quite
robust if configured carefully.
Selecting a domain name
Our advice used to be that names should be short and easy to type and that they
should identify the organization that used them. These days, the reality is that all the
good, short names have been taken, at least in the com domain. It's tempting to
blame this state of affairs on squatters, but in fact most of the good names are in
actual use. In 2004, over 60% of the registered names were in use; historically, less
than half of registered names were actually used.
Domain bloat
DNS was designed to map an organization's domain name to a name server for that
organization. In that mode it needs to scale to the number of organizations in the
world. Now that the Internet has become a conduit of mass culture, however,
domain names are being applied to every product, movie, sporting event, English noun,
etc. Domain names such as twinkies.com are not (directly) related to the company
that makes the product; they're simply being used as advertisements. It's not clear
that DNS can continue to scale in this way. The real problem here is that the DNS
naming tree is a more efficient data structure when it has some hierarchy and is not
totally flat. With each organization naming hundreds or thousands of products at
the top level of the tree, hierarchy is doomed.
What we really need is a directory service that maps brand and marketing names to
organizations, leaving DNS free to deal with IP addresses. Another possible solution
is to enforce hierarchy in the system; for example, twinkies.hostess-foods.com. But
this will never happen; we've already gone too far down the marketing path.
Sony does things the right way from DNS's perspective: all of its products are
subdomains of sony.com. It might take an extra click or two to find the products you
want, but DNS appreciates the hierarchy.
Registering a second-level domain name
To obtain a second-level domain name, you must apply to a registrar for the
appropriate top-level domain. ICANN accredits various agencies to be part of its shared
registry project for registering names in the gTLDs. As of this writing, you have
something like 500 choices of registrar. Check www.icann.org for the definitive list.
To register for a ccTLD name in Europe, contact the Council of European National
Top-level Domain Registries at www.centr.org to identify your local registry and
apply for a domain name. For the Asia-Pacific region, the appropriate body is the
Asia-Pacific Network Information Center, www.apnic.net.
To complete the domain registration forms, you must identify a technical contact
person, an administrative contact person, and at least two hosts that will be servers
for your domain.
Creating your own subdomains
The procedure for creating a subdomain is similar to that for creating a second-level
domain, except that the central authority is now local (or more accurately, within
your own organization). Specifically, the steps are as follows.
· Choose a name that is unique in the local context.
· Identify two or more hosts to be servers for your new domain.
· Coordinate with the administrator of the parent domain.
Parent domains should check to be sure that a child domain's name servers are up
and running before performing the delegation. If the servers are not working, a "lame
delegation" results, and you might receive nasty email asking you to clean up your
DNS act. Page 475 covers lame delegations in more detail.
15.5 HOW DNS WORKS
Each host that uses DNS is either a client of the system or simultaneously a client and
a server. If you do not plan to run any DNS servers, it's not essential that you read the
next few sections (skip ahead to Resolver configuration on page 418), although they
will help you develop a more solid understanding of the architecture of DNS.
Delegation
All name servers read the identities of the root servers from a local config file. The
root servers in turn know about com, net, fi, de, and other top-level domains.
Farther down the chain, edu knows about colorado.edu, com knows about admin.com,
and so on. Each zone can delegate authority for its subdomains to other servers.
Let's inspect a real example. Suppose we want to look up the address for the machine
vangogh.cs.berkeley.edu from the machine lair.cs.colorado.edu. The host lair asks its
local name server, ns.cs.colorado.edu, to figure out the answer. Exhibit A illustrates
the subsequent events.
Exhibit A DNS query process for vangogh.cs.berkeley.edu
The numbers on the arrows between servers show the order of events, and a letter
indicates the type of transaction (query, referral, or answer). We assume that none
of the required information was cached before the query, except for the names and
IP addresses of the servers of the root domain.
The local name server doesn't know the address; furthermore, it doesn't know
anything about cs.berkeley.edu or berkeley.edu or even edu. It does know some servers
for the root domain, however, and since it is a recursive server, it queries a root server
about vangogh.cs.berkeley.edu and receives a referral to the servers for edu.
The local name server then sends its query to an edu server (asking, as always, about
vangogh.cs.berkeley.edu) and gets back a referral to the servers for berkeley.edu. It
then repeats the query in the berkeley.edu domain. If the Berkeley server doesn't
have the answer cached, it returns a referral to cs.berkeley.edu. The cs.berkeley.edu
server is authoritative for the requested information and returns vangogh's address.
When the dust settles, ns.cs.colorado.edu has cached vangogh's address. It has also
cached data on the servers for edu, berkeley.edu, and cs.berkeley.edu.
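The chain of referrals can be sketched as a toy simulation in Python. Nothing here touches the network: the DELEGATIONS and ADDRESSES tables, the function names, and the address 192.0.2.10 (taken from a range reserved for documentation) are all assumptions made for illustration, and real servers hand back NS records rather than bare zone names.

```python
# Toy delegation data: each zone lists the child zones it delegates to.
DELEGATIONS = {
    ".": ["edu."],
    "edu.": ["berkeley.edu."],
    "berkeley.edu.": ["cs.berkeley.edu."],
    "cs.berkeley.edu.": [],
}

# Data for which a zone is authoritative (hypothetical address).
ADDRESSES = {
    "cs.berkeley.edu.": {"vangogh.cs.berkeley.edu.": "192.0.2.10"},
}


def query(zone, name):
    """Ask the servers for `zone` about `name`, non-recursively."""
    if name in ADDRESSES.get(zone, {}):
        return ("answer", ADDRESSES[zone][name])
    for child in DELEGATIONS[zone]:
        if name == child or name.endswith("." + child):
            return ("referral", child)      # "ask those servers instead"
    return ("nxdomain", None)


def resolve(name):
    """Play the part of the local recursive server: follow referrals from the root."""
    zone, trace = ".", []
    while True:
        kind, value = query(zone, name)
        trace.append((zone, kind, value))
        if kind != "referral":
            return value, trace
        zone = value                         # repeat the same question one level down


if __name__ == "__main__":
    address, trace = resolve("vangogh.cs.berkeley.edu.")
    for step in trace:
        print(step)
    print("address:", address)
```

The zones named in the trace are exactly the ones whose servers the local name server ends up caching.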
Caching and efficiency
Caching increases the efficiency of lookups: a cached answer is almost free and is
usually correct because hostname-to-address mappings typically change
infrequently. An answer is saved for a period of time called the "time to live" (TTL), which
is specified by the owner of the data record in question. Most queries are for local
hosts and can be resolved quickly. Users also inadvertently help with efficiency
because they repeat many queries; after the first instance of a query, the rest are "free."
[Exhibit A figure. Recursive: lair -> ns.cs.colorado.edu (1-Q), answer back to lair (10-A). Non-recursive: ns.cs.colorado.edu <-> root "." (2-Q, 3-R), edu (4-Q, 5-R), berkeley.edu (6-Q, 7-R), cs.berkeley.edu (8-Q, 9-A). Q = query, A = answer, R = referral.]
For a long time, caching was only applied to positive answers. If a host's name or
address could not be found, that fact was not saved. A scheme for negative DNS
caching was described in RFC1034, but it was incomplete and was not widely
implemented. A better scheme was outlined in RFC2308 in 1998. This scheme was
implemented in BIND 8.2 as an optional feature and is now mandatory in BIND 9.
One measurement at the RIPE root server in Europe showed that 60% of DNS
queries were for nonexistent data (many queries were for 127.in-addr.arpa or for
Microsoft services as hostnames). Caching this information farther down the DNS
tree should dramatically reduce the load on the root servers.
Negative caching saves answers of the following types:
· No host or domain matches the name queried.
· The type of data requested does not exist for this host.
· The server to ask is not responding.
· The server is unreachable because of network problems.
The first two types of negative data are cached for 10 minutes by default. You can
increase this duration to three hours with a parameter in the SOA record discussed
on page 392 and to one week with a BIND option.
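To make the interplay of TTLs and negative answers concrete, here is a minimal cache sketch in Python. It is not how BIND stores its data; the DnsCache class, its method names, and the injected clock are assumptions chosen to keep the example small and testable. The 600-second default mirrors the 10-minute figure quoted above.

```python
class DnsCache:
    """A toy cache holding positive and negative answers until their TTLs expire."""

    NEGATIVE_TTL = 600  # 10 minutes, the default cited in the text

    def __init__(self, clock):
        self._clock = clock      # callable returning the current time in seconds
        self._entries = {}       # (name, type) -> (data or None, expiry time)

    def store(self, name, rtype, data, ttl):
        # Names are compared without regard to case, so normalize the key.
        self._entries[(name.lower(), rtype)] = (data, self._clock() + ttl)

    def store_negative(self, name, rtype, ttl=NEGATIVE_TTL):
        # Remember that the name or record type does not exist.
        self.store(name, rtype, None, ttl)

    def lookup(self, name, rtype):
        key = (name.lower(), rtype)
        entry = self._entries.get(key)
        if entry is None:
            return ("miss", None)
        data, expires = entry
        if self._clock() >= expires:
            del self._entries[key]           # TTL ran out; forget the answer
            return ("miss", None)
        return ("negative", None) if data is None else ("hit", data)


if __name__ == "__main__":
    import time
    cache = DnsCache(time.time)
    cache.store("boulder.colorado.edu", "A", "192.0.2.1", ttl=300)
    cache.store_negative("nosuchhost.colorado.edu", "A")
    print(cache.lookup("BOULDER.colorado.edu", "A"))
    print(cache.lookup("nosuchhost.colorado.edu", "A"))
```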
Most implementations do not perform the last two types of negative caching.
However, BIND does penalize unresponsive servers and will not query them as long as
other choices are available. If all of a zone's servers fail to respond, BIND does not
cache that fact.
Nonauthoritative answers may be cached; authoritative negative answers must be
cached. BIND follows these guidelines from the RFCs, but Windows machines seem
to implement the TTLs selectively, at least for negative caching. They use the correct
default value (the minimum from the SOA record) the first time a query returns
NXDOMAIN (no such domain), then reset the TTL to 15 minutes and let it time out
normally from there.
A name server often receives multiple DNS records in response to a query. For
example, a query for the name servers of the root domain would receive a response
that listed all 13 root servers. Which one should your server query?
When the BIND name server must choose among several remote servers, all of which
are authoritative for a domain, it first determines the network round trip time (RTT)
to each server. It then sorts the servers into "buckets" according to their RTTs and
selects a server from the fastest bucket. Servers within a bucket are treated as equals
and are used in a round robin fashion.
You can achieve a primitive but effective form of load balancing by assigning a single
hostname to several IP addresses (which in reality are different machines):
    www     IN  A   192.168.0.1
            IN  A   192.168.0.2
            IN  A   192.168.0.3
Busy web servers such as Yahoo! or Google are not really a single machine; they're
just a single name in the DNS.⁴ A name server that has multiple records for the same
name and record type returns all of them to the client, but in round robin order. For
example, round robin order for the A records above would be 1, 2, 3 for the first
query; 2, 3, 1 for the next; 3, 1, 2 for the third, and so on.
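The rotation is easy to picture in code. The Python sketch below keeps the records in a deque and rotates it by one position after each answer; the class name RoundRobinRRset is hypothetical, and a real name server's bookkeeping is certainly more involved.

```python
from collections import deque


class RoundRobinRRset:
    """Return every record on each query, rotating the starting point by one."""

    def __init__(self, records):
        self._records = deque(records)

    def answer(self):
        current = list(self._records)   # the full set, in the current order
        self._records.rotate(-1)        # the next query starts one record later
        return current


if __name__ == "__main__":
    www = RoundRobinRRset(["192.168.0.1", "192.168.0.2", "192.168.0.3"])
    for _ in range(3):
        print(www.answer())
```

Clients that simply use the first address they receive are thereby spread across the three machines.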
The extended DNS protocol
The original DNS protocol definition dates from the late 1980s and uses both UDP
and TCP on port 53. UDP is typically used for queries and responses, and TCP for
zone transfers between master servers and slave servers. Unfortunately, the
maximum packet size that's guaranteed to work in all UDP implementations is 512 bytes,
which is much too small for some of the new DNS features (e.g., DNSSEC) that must
include digital signatures in each packet.
The 512-byte constraint also affects the number and names of the root servers. So
that all root server data will fit in a 512-byte UDP packet, the number of root servers
is limited to 13, and each server is named with a single letter of the alphabet.
Many resolvers issue a UDP query first; then, if they receive a truncated response,
they reissue the query over TCP. This procedure gets around the 512-byte limit, but
it is inefficient. You might think that DNS should just bail on UDP and use TCP all
the time, but TCP connections are much more expensive. A UDP name server
exchange can be as short as two packets: one query and one response. A TCP exchange
involves at least seven packets: a three-way handshake to initiate the conversation, a
query, a response, and a final handshake to close the connection.
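For a feel of how small a query is and how a resolver notices truncation, the following Python sketch builds a query in the classic wire format and tests the "truncated" (TC) bit of a response header. It sends nothing over the network. The function names and the query ID are our own; the header layout and the TC bit position follow the original protocol definition.

```python
import struct

MAX_UDP_PAYLOAD = 512   # the classic limit on a UDP DNS message
TC_FLAG = 0x0200        # "truncated" bit in the 16-bit header flags word
RD_FLAG = 0x0100        # "recursion desired" bit


def build_query(name, qtype=1, query_id=0x1234):
    """Build a DNS query (type A, class IN by default) in wire format."""
    # Header: ID, flags, then counts of questions, answers, authority, additional.
    header = struct.pack("!HHHHHH", query_id, RD_FLAG, 1, 0, 0, 0)
    # The name is sent as length-prefixed components ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def is_truncated(response):
    """True if the responder set the TC bit, meaning 'retry over TCP'."""
    (flags,) = struct.unpack("!H", response[2:4])
    return bool(flags & TC_FLAG)


if __name__ == "__main__":
    packet = build_query("vangogh.cs.berkeley.edu")
    print(len(packet), "bytes; fits in UDP:", len(packet) <= MAX_UDP_PAYLOAD)
```

A resolver following the procedure above would send this packet over UDP, call something like is_truncated on the reply, and open a TCP connection only when it returns true.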
15.6 WHAT'S NEW IN DNS
The latest developments on the DNS front fall in the political domain rather than the
technical domain. VeriSign, the registry company that used to have a monopoly on
registering domain names and that is currently responsible for the com and net
zones, added a wild card address record to those zones. This caused every user who
mistyped a domain name to be directed to a site maintained by one of VeriSign's
advertisers. The service was known as Site Finder.
The Internet community screamed about the unfairness of it all, so ISC added a
delegation-only option to BIND that recognized these wild card results and
returned a more accurate "no such domain" response instead of the addresses blessed
by VeriSign. This correction was fine for most top-level domains, but not all, so an
exceptions clause was later added to provide finer control. We cover these new BIND
options on page 429. After about a month of complaints, VeriSign removed the wild
card record and turned off the service. After the lawyers sort things out, they will
probably turn it back on again. The IETF may eventually tighten the specifications
to allow no wild card records at all.
4. Last time we checked, Google was more than 400,000 Linux machines (they won't tell, but we googled
for an estimate), and Yahoo! consisted of more than 100,000 FreeBSD machines.
Several significant technical changes have been made to DNS over the last few years.
In particular, the DNS-related standards for IPv6 and DNS security have been
radically altered by the IETF, rendering the coverage of these topics in earlier editions of
this book totally wrong. Table 15.4 lists the major changes and provides a road map
to the pages where they are covered in more detail.
Some of these new features are enormous projects that the IETF has not yet finished
standardizing. The working groups that are writing the standards have good writers
but lack vigilant code warriors; some of the more recent specifications may be
difficult or even impossible to implement. The current release of BIND (9.4) includes
most of the new features.
IPv6 is described in more detail in Chapter 12.
Two massive new features, IPv6 support and DNSSEC, warrant a bit of commentary.
IPv6 increases the length of IP addresses from 32 bits to 128 bits. If ever fully
implemented, it will have an enormous impact on the Internet. BIND 9 supports the pieces
of IPv6 that have been standardized so far, but it appears unlikely that IPv6 will be
widely deployed during the lifetime of this book. Therefore, our coverage of BIND 9's
IPv6 support is brief. There's enough in this chapter to give you the general flavor, but
not enough to let you migrate your site to IPv6 and configure DNS for it.
The DNSSEC standard adds authentication data to the DNS database and its servers.
It uses public key cryptography to verify the source and integrity of DNS data and
uses DNS to distribute keys as well as host data.
Simpler authentication mechanisms have also been introduced, such as support for
authentication through the use of a "shared secret." However, the shared secret must
be distributed to each pair of servers that wants to perform mutual authentication.
Table 15.4 Recent developments in DNS and BIND
Page  RFCs              Description
388   3492              Internationalized domain names via Punycode
389   2671              EDNS0, protocol changes and extensions
394   1996              Asynchronous notification of zone changes
400   2317              Classless in-addr delegation (the CNAME hack)
402   2782, 3958        SRV records for the location of services
404   -                 AAAA records for IPv6 addresses (A6 is obsolete)
405   2672-3            DNAME records abandoned
405   -                 ip6.arpa for reverse IPv6 mappings, ip6.int abandoned
-     3596, 3646        IPv6 support
441   1995              Incremental zone transfers
448   2136              Dynamic update (for sites that use DHCP)
453   2845, 2930, 3645  TSIG/TKEY transaction signatures and key exchange
456   3225-6, 4033-5    DNSSEC, authentication, and security for zone data (a)
a. Totally redone in 2004
Although that's fine for a local site with a handful of servers, it doesn't scale to the
level of the Internet. BIND 9 implements both the DNSSEC public key system and the
TSIG (transaction signatures) shared-secret system.
BIND releases starting with 9.3 have included the new specifications for DNSSEC.
However, as people started to experiment with signed zones a couple of years ago,
they realized that the original DNSSEC system was impractical. Under the original
system, a parent zone signed the key of a child zone, and copies of the signed key
were kept in both zones. If the child wanted to change its key, it had to negotiate with
the parent and request that its new key be signed. Fine. However, if the parent wanted
to change its key, it had to update all the child keys stored both within its own zone
and in all its child zones. This operation proved to be unmanageable for large zones
such as com because some child zones would invariably be unreachable during an
update. Their keys would go out of sync and leave DNS unable to verify signatures.
The current solution is to have each child's signed key live only in the child zone, but
to introduce a new resource record in the parent: DS, the delegation signer. We cover
DNSSEC in detail beginning on page 456.
The introduction of internationalized domain names, which allow the use of
non-English characters, is proceeding by way of a hack that maps Unicode characters
back to ASCII. A system called Punycode performs the mapping uniquely and
reversibly by using an algorithm known as Bootstring; see RFC3492 for details. As of
2005, registrars have begun publishing the Punycode names and most browsers have
implemented some form of the system. Unfortunately, a few Punycode-related
spoofing and security issues have also manifested themselves. In addition,
internationalized domain names effectively reduce the maximum length (both
per-component and total) allowed for DNS names.
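Python's standard library happens to include an "idna" codec that applies Punycode to each component of a name, which makes the mapping easy to try out. The sketch below is only an illustration: the helper names are ours, the domain is made up, and to our understanding the built-in codec follows the original 2003 IDNA rules rather than any later revision.

```python
def to_ascii(name: str) -> str:
    """Map an internationalized name to the ASCII form stored in the DNS."""
    return name.encode("idna").decode("ascii")


def to_unicode(name: str) -> str:
    """Map the ASCII form back to the name a user would recognize."""
    return name.encode("ascii").decode("idna")


if __name__ == "__main__":
    encoded = to_ascii("bücher.example")
    print(encoded)                 # xn--bcher-kva.example
    print(to_unicode(encoded))     # bücher.example
    # The encoded component is longer than the original, which is why
    # internationalized names hit the length limits sooner.
    print(len("bücher"), len("xn--bcher-kva"))
```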
The current internationalization scheme has skirted a key issue, antialiasing,
because it is very difficult to address. Antialiasing involves resolving inconsistencies in
the mapping between Asian language characters and the Punycode-encoded
Unicode that represents them in the DNS. If a character can mean one of 3 or 4 or 10
different things in Unicode, then language experts must agree on translation
standards, and characters displayed on a computer screen must be designed to
differentiate among the various meanings.
Each of these three big issues (IPv6, DNSSEC, and internationalization) significantly
increases the size of DNS data records, thereby making it more likely that DNS will
bump into limits on UDP packet sizes.
In the mid-1990s, the DNS protocol was amended to include incremental zone
transfers (like a diff between old and new zone files, inspired by Larry Wall's patch
program), asynchronous notifications (to tell slaves when the master's data files have
been updated), and dynamic updates (for DHCP hosts). These changes added
features but did not really address the fundamental transport problem.
In the late 1990s, EDNS0 (Extended DNS, version 0) addressed some of the
shortcomings of the DNS protocol in today's Internet. It lets speakers advertise their
reassembly buffer size, supported options, and protocol versions spoken. If the receiving
name server responds with an error message, the sender drops back to the original
DNS protocol. BIND 9 implements EDNS0 in both the server and the resolver.
15.7 THE DNS DATABASE
A domain's DNS database is a set of text files maintained by the system
administrator on the domain's master name server. These text files are often called zone files.
They contain two types of entries: parser commands (things like $ORIGIN and
$TTL) and "resource records," or RRs as they are sometimes called. Only the
resource records are really part of the database; the parser commands just provide
some shorthand ways to enter records.
We start this section by describing the DNS resource records, which are defined in
RFCs 1035, 1183, 1876, 2230, 2782, 2930, 3596, and 3658. We defer discussion of the
parser commands until page 405.
Resource records
Each zone of the DNS hierarchy has a set of resource records associated with it. The
basic format of a resource record is
    [name] [ttl] [class] type data
Fields are separated by whitespace (tabs or spaces) and can contain the special
characters shown in Table 15.5.
The name field identifies the entity (usually a host or domain) that the record
describes. If several consecutive records refer to the same entity, the name can be
omitted after the first record as long as the subsequent records begin with whitespace. If
it is present, the name field must begin in column one.
A name can be either relative or absolute. Absolute names end with a dot and are
complete. Internally, the software deals only with absolute names; it appends the
current domain and a dot to any name that does not already end in a dot. This feature
allows names to be shorter, but it also invites mistakes.
Table 15.5 Special characters used in RRs
Character  Meaning
;          Introduces a comment
@          The current zone name
( )        Allows data to span lines
*          Wild card (a) (name field only)
a. See page 399 for some cautionary statements.
For example, in the cs.colorado.edu domain, the name "anchor" would be interpreted
as "anchor.cs.colorado.edu.". If the name were entered as "anchor.cs.colorado.edu",
the lack of a final dot would still imply a relative name, and the default domain would
be appended, resulting in the name "anchor.cs.colorado.edu.cs.colorado.edu.". This
is a very common mistake.
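The expansion rule fits in a few lines of Python. This is a sketch of the rule as described here, not of BIND's parser; the function name qualify and the origin argument are our own.

```python
def qualify(name: str, origin: str) -> str:
    """Expand a zone-file name to an absolute name, given the current domain."""
    if name == "@":
        return origin               # "@" stands for the current zone name
    if name.endswith("."):
        return name                 # already absolute; leave it alone
    return f"{name}.{origin}"       # relative; append the current domain


if __name__ == "__main__":
    origin = "cs.colorado.edu."
    print(qualify("anchor", origin))                   # anchor.cs.colorado.edu.
    print(qualify("anchor.cs.colorado.edu.", origin))  # unchanged
    print(qualify("anchor.cs.colorado.edu", origin))   # the classic mistake
```

The last call produces anchor.cs.colorado.edu.cs.colorado.edu., which is syntactically fine and therefore goes unnoticed until lookups for the host start failing.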
The ttl (time to live) field specifies the length of time, in seconds, that the data item
can be cached and still be considered valid. It is often omitted, except in the root
server hints file. It defaults to the value set by the $TTL directive at the top of the
data file for the zone. In BIND 9, the $TTL directive is required. If there is no $TTL
directive in BIND 8, the ttl defaults to a per-zone value set in the zone's SOA record.
See Chapter 17 for more information about NIS.
Increasing the value of the ttl parameter to about a week substantially reduces
network traffic and DNS load. However, once records have been cached outside your
local network, you cannot force them to be discarded. If you plan a massive
renumbering, set the $TTL value low (e.g., an hour) so that stale records that have been
cached elsewhere on the Internet expire quickly.
Some sites set the TTL on the records for Internet-facing servers to a low value so
that if a server experiences problems (network failure, hardware failure, denial of
service attack, etc.), the administrators can respond by changing the server's
name-to-IP-address mapping. Because the original TTLs were low, the new values will
propagate quickly. For example, the name google.com has a five-minute TTL, but
Google's name servers have a TTL of four days (345,600 seconds):
    google.com.      300     IN  A   216.239.37.99
    google.com.      345600  IN  NS  ns1.google.com.
    ns1.google.com.  345600  IN  A   216.239.32.10
We used the dig command (dig @ns1.google.com google.com) to recover this
data; the output is truncated here.
BIND 9 enforces a concept known as TTL harmonization, which forces all records
in an RRset (that is, all records of the same type that pertain to a single node) to
have the same TTL. The value that's actually used is that of the first resource record
for the node/type pair.
The class specifies the network type. Three values are recognized:
· IN for the Internet
· HS for Hesiod
· CH for ChaosNet
The default value for the class is IN. It is often specified explicitly in zone data files
even though it is the default. Hesiod, developed at MIT, is a database service built on
top of BIND. ChaosNet is an obsolete network protocol formerly used by Symbolics
Lisp machines.
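With the name, ttl, and class fields now described, a simplified record parser shows how the optional fields can be told apart. This Python sketch is an assumption-laden illustration, not BIND's parser: the function parse_rr and its parameters are hypothetical, it accepts only numeric TTLs in seconds, and it ignores comments, parentheses, and the $ORIGIN and $TTL commands.

```python
CLASSES = {"IN", "HS", "CH"}


def parse_rr(line, last_name=None, default_ttl=None):
    """Split one record of the form [name] [ttl] [class] type data into fields."""
    fields = line.split()
    # A record that begins with whitespace reuses the previous record's name.
    name = last_name if line[:1].isspace() else fields.pop(0)
    ttl = default_ttl
    if fields and fields[0].isdigit():
        ttl = int(fields.pop(0))          # explicit TTL in seconds
    rclass = "IN"                         # IN is the default class
    if fields and fields[0].upper() in CLASSES:
        rclass = fields.pop(0).upper()
    rtype = fields.pop(0).upper()
    return {"name": name, "ttl": ttl, "class": rclass,
            "type": rtype, "data": " ".join(fields)}


if __name__ == "__main__":
    first = parse_rr("www  IN  A  192.168.0.1", default_ttl=3600)
    second = parse_rr("     IN  A  192.168.0.2",
                      last_name=first["name"], default_ttl=3600)
    print(first)
    print(second)
```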
Today, only two pieces of identification data are normally tucked away in the
ChaosNet class: the version number of the name server software and the name of the host
on which the server is running. These data nuggets can be extracted with dig as
shown on page 410. Administrators use the name server version number to identify
servers in need of upgrades, and they use the host identification to debug name
servers that are replicated through the use of anycast routing. Making this information
available through the CH class was originally a feature (some might say "hack") of
the BIND implementations, but it is now being standardized by the IETF as part of
DNS proper.[5]
Many different types of DNS records are defined, but fewer than 10 are in common
use; IPv6 adds a few more. We divide the resource records into four groups:
· Zone records - identify domains and their name servers
· Basic records - map names to addresses and route mail
· Security records - add authentication and signatures to zone files
· Optional records - provide extra information about hosts or domains
The contents of the data field depend on the record type. Table 15.6 lists the
common record types.
Some record types are obsolete, experimental, or not widely used. See the BIND
documentation for a complete list. Most records are maintained by hand (by editing
text files), but the security resource records require cryptographic processing and so
must be managed with software tools. These records are described in the DNSSEC
section beginning on page 456.
5. Unfortunately, there is some dispute about the name under which this data should be filed. Should it be
version.bind, hostname.bind, id-server, or ...
Table 15.6 DNS record types
Group     Type    Name                Function
Zone      SOA     Start Of Authority  Defines a DNS zone
          NS      Name Server         Identifies zone servers, delegates subdomains
Basic     A       IPv4 Address        Name-to-address translation
          AAAA a  IPv6 Address        Name-to-IPv6-address translation
          PTR     Pointer             Address-to-name translation
          MX      Mail Exchanger      Controls email routing
Security  DS      Delegation Signer   Hash of signed child zone's key-signing key
          DNSKEY  Public Key          Public key for a DNS name
          NSEC    Next Secure         Used with DNSSEC for negative answers
          RRSIG   Signature           Signed, authenticated resource record set
Optional  CNAME   Canonical Name      Nicknames or aliases for a host
          LOC     Location            Geographic location and extent
          SRV     Services            Gives locations of well-known services
          TXT     Text                Comments or untyped information
a. The AAAA and A6 IPv6 address records have been sparring partners in the IETF for the past few years.
AAAA eventually won and went from obsolete to standard. A6 is now labeled experimental.
The order of resource records is almost arbitrary. The SOA record for a zone formerly
had to be first, but that requirement has now been relaxed. The SOA is typically
followed by the NS records. The records for each host are usually kept together. It's
common practice to sort by the name field, although some sites sort by IP address
so that it's easier to identify unused addresses.
As we describe each type of resource record in detail in the next sections, we will
inspect some sample records from cs.colorado.edu's data files. The default domain
in this context is "cs.colorado.edu.", so a host specified as "anchor" really means
"anchor.cs.colorado.edu.".
The SOA record
An SOA record marks the beginning of a zone, a group of resource records located at
the same place within the DNS namespace. This node of the DNS tree is also called a
delegation point or zone cut. As we discuss in greater detail on page 396, the data for
a DNS domain usually includes at least two zones: one for translating hostnames to
IP addresses, and others that map in the reverse direction. The DNS tree has a
forward branch organized by name and a reverse branch organized by IP address.
Each zone has exactly one SOA record. The SOA record includes the name of the
zone, a technical contact, and various timeout values. An example:
; Start of authority record for cs.colorado.edu
@   IN  SOA  ns.cs.colorado.edu. hostmaster.cs.colorado.edu. (
        2004111300  ; Serial number
        7200        ; Refresh (2 hours)
        1800        ; Retry (30 minutes)
        604800      ; Expire (1 week)
        7200 )      ; Minimum (2 hours)
Here, the name field contains the symbol @, which is shorthand for the name of the
current zone. In this example, "cs.colorado.edu." could have been used instead. The
value of @ is the domain name specified in the zone statement in the name server
configuration file; it can be changed from within the zone file with the $ORIGIN
parser directive (see page 405).
This example has no ttl field. The class is IN for Internet, the type is SOA, and the
remaining items form the data field.
"ns.cs.colorado.edu." is the zone's master name server.
"hostmaster.cs.colorado.edu." is the email address of the technical contact in the
format "user.host." rather than the standard user@host. Just replace that first dot with
an @ and remove the final dot if you need to send mail to a domain's administrator.
Sites often use an alias such as admin or hostmaster in place of an actual login name.
The sysadmin responsible for hostmaster duties may change, and it's easier to change
one entry in the aliases file (see page 544) than to change all your zone files when
you need to update the contact person.
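The contact-address conversion described above (replace the first dot with an @ and drop the final dot) can be sketched in Python. The function name soa_contact_to_email is hypothetical, and the sketch assumes the user part of the address contains no escaped dots.

```python
def soa_contact_to_email(rname):
    """Convert an SOA contact such as 'hostmaster.cs.colorado.edu.'
    into a mailable address.

    Assumes the user part contains no dots of its own; that is a
    simplification made for this sketch.
    """
    name = rname[:-1] if rname.endswith(".") else rname  # remove the final dot
    user, dot, domain = name.partition(".")               # split at the first dot
    if not dot or not user or not domain:
        raise ValueError("not in user.host. format: %r" % rname)
    return user + "@" + domain
```

For the sample record, soa_contact_to_email("hostmaster.cs.colorado.edu.") yields hostmaster@cs.colorado.edu.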
The parentheses continue the SOA record over several lines. Their placement is not
arbitrary in BIND 4 or 8: we tried to shorten the first line by splitting it before the
contact address, but then BIND failed to recognize the SOA record. In some
implementations, parentheses are only recognized in SOA and TXT records. BIND 9 has a
better parser and parentheses can be used anywhere.
The first numeric parameter is the serial number of the zone's configuration data.
The serial number is used by slave servers to determine when to get fresh data. It can
be any 32-bit integer and should be incremented every time the data file for the zone
is changed. Many sites encode the file's modification date in the serial number. For
example, 2004111300 is the first change to the zone on November 13, 2004.
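As a rough illustration of the date-based convention, the Python sketch below builds a serial number of the form YYYYMMDDNN. The helper names date_serial and next_serial are hypothetical, and the fallback of simply adding one when the date-based value is not larger is our own assumption, not a rule from BIND.

```python
import datetime


def date_serial(day, revision=0):
    """Build a YYYYMMDDNN serial from a date and a two-digit revision."""
    if not 0 <= revision <= 99:
        raise ValueError("revision must fit in two digits")
    return int(day.strftime("%Y%m%d")) * 100 + revision


def next_serial(current, today):
    """Return a serial larger than `current`, date-based when possible."""
    candidate = date_serial(today)
    # If the zone was already edited today, fall back to a plain increment.
    return candidate if candidate > current else current + 1
```

date_serial(datetime.date(2004, 11, 13)) gives 2004111300, the value used in the sample SOA record; a second edit on the same day would produce 2004111301.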
Serial numbers need not be continuous, but they must increase monotonically. If by
accident you set a really large value on the master server and that value is transferred
to the slaves, then correcting the serial number on the master will not work. The
slaves request new data only if the master's serial number is larger than theirs.
There are three ways to fix this problem; only the first two work in BIND 9.
· One way to fix the problem is to exploit the properties of the sequence
  space in which the serial numbers live. This procedure involves adding a
  large value (2^31) to the bloated serial number, letting all the slave servers
  transfer the data, and then setting the serial number to just what you want.
  This weird arithmetic, with explicit examples, is covered in detail in the
  O'Reilly DNS book; RFC1982 describes the sequence space.
· A sneaky but more tedious way to fix the problem is to change the serial
  number on the master, kill the slave servers, remove the slaves' backup
  data files so they are forced to reload from the master, and restart the
  slaves. It does not work to just remove the files and reload; you must kill
  and restart the slave servers.
· BIND 4.9 and BIND 8 include a hack that lets you set the serial number to
  zero for one refresh interval and then restart the numbering. The zero
  always causes a reload, so don't forget to set it to a real value after each of
  the slaves has reloaded the zone with serial number 0.
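The sequence-space trick in the first fix is easier to follow with the arithmetic spelled out. The Python sketch below implements the comparison and addition rules of RFC1982 for 32-bit serial numbers. RFC1982 limits a single increment to 2^31 - 1, so the sketch uses that value for the intermediate step. The function names and the "bloated" value of 4000000000 are illustrative assumptions.

```python
SERIAL_BITS = 32
MODULUS = 2 ** SERIAL_BITS
HALF = 2 ** (SERIAL_BITS - 1)


def serial_gt(a, b):
    """True if serial a counts as newer than serial b (RFC1982 ordering).

    Serials exactly HALF apart are incomparable; this returns False then.
    """
    return a != b and (a - b) % MODULUS < HALF


def serial_add(serial, n):
    """Add n to a serial, wrapping around at 2^32 (RFC1982 addition)."""
    if not 0 <= n <= HALF - 1:
        raise ValueError("increment must be between 0 and 2^31 - 1")
    return (serial + n) % MODULUS


# Hypothetical accident: the master was set to 4000000000,
# but the serial we actually want is 2004111300.
bloated = 4000000000
wanted = 2004111300
step = serial_add(bloated, HALF - 1)   # intermediate serial, wraps around
```

Here serial_gt(wanted, bloated) is false, which is why simply correcting the number on the master fails. The intermediate value step (1852516351) is newer than the bloated serial, so the slaves transfer the zone; wanted is in turn newer than step, so a second change brings every server to the intended serial.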
It is a common mistake to change the data files but forget to update the serial
number. Your name server will punish you by failing to propagate your changes to the
slave servers.
The next four entries in the SOA record are timeout values, in seconds, that control
how long data can be cached at various points throughout the world-wide DNS
database. Times can also be expressed in units of minutes, hours, days, or weeks by
addition of a suffix of m, h, d, or w, respectively. For example, 1h30m means 1 hour and
30 minutes. Timeout values represent a tradeoff between efficiency (it's cheaper to
use an old value than to fetch a new one) and accuracy (new values should be more
accurate).
Here's another copy of that same example SOA record, just so you don't have to keep
turning back to the previous page:
; Start of authority record for cs.colorado.edu
@   IN  SOA  ns.cs.colorado.edu. hostmaster.cs.colorado.edu. (
        2004111300  ; Serial number
        7200        ; Refresh (2 hours)
        1800        ; Retry (30 minutes)
        604800      ; Expire (1 week)
        7200 )      ; Minimum (2 hours)
The first timeout is the refresh timeout, which specifies how often slave servers
should check with the master to see if the serial number of the zone's configuration
has changed. Whenever the zone changes, slaves must update their copy of the zone's
data. The slave compares the serial numbers; if the master's serial number is larger,
the slave requests a zone transfer to update the data. Common values for the refresh
timeout range from one to six hours (3,600 to 21,600 seconds).
Instead of just waiting passively for slave servers to time out, BIND servers now
notify their slaves every time a zone changes, unless the notify parameter is specifically
turned off in the configuration file. Slaves that understand the notification
immediately refresh themselves. It's possible for an update notification to be lost due to
network congestion, so the refresh timeout should always be set to a reasonable value.
If a slave server tries to check the master's serial number but the master does not
respond, the slave tries again after the retry timeout period has elapsed. Our
experience suggests that 20-60 minutes (1,200-3,600 seconds) is a good value.
If a master server is down for a long time, slaves will try to refresh their data many
times but always fail. Each slave should eventually decide that the master is never
coming back and that its data is surely out of date. The expire parameter determines
how long the slaves will continue to serve the domain's data authoritatively in the
absence of a master. The system should be able to survive if the master server is down
for a few days, so this parameter should have a longish value. We recommend a week
to a month.
The minimum parameter in the SOA record sets the time to live for negative answers
that are cached.[6] The default for positive answers (i.e., actual records) is specified at
the top of the zone file with the $TTL directive. Experience suggests values of several
hours to a few days for $TTL and a couple of hours to a day for the minimum. The
$TTL value must be larger than or equal to the minimum.
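The recommendations for the timeout values can be collected into a simple sanity check. The Python sketch below encodes the ranges suggested in this section; the function name check_soa_timers, the warning texts, and the reading of "a month" as 31 days are assumptions made for the example. All arguments are in seconds.

```python
HOUR, DAY, WEEK = 3600, 86400, 604800


def check_soa_timers(ttl, refresh, retry, expire, minimum):
    """Return a list of warnings for timer values outside the ranges
    suggested in the text. An empty list means no complaints."""
    warnings = []
    if not HOUR <= refresh <= 6 * HOUR:
        warnings.append("refresh outside 1-6 hours")
    if not 20 * 60 <= retry <= 60 * 60:
        warnings.append("retry outside 20-60 minutes")
    if not WEEK <= expire <= 31 * DAY:   # "a month" taken as 31 days
        warnings.append("expire outside 1 week to 1 month")
    if ttl < minimum:
        warnings.append("$TTL smaller than minimum")
    return warnings
```

With a $TTL of one day, the values from the sample SOA record (7200, 1800, 604800, 7200) produce no warnings.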
The $TTL, expire, and minimum parameters eventually force everyone that uses
DNS to discard old data values. The initial design of DNS relied on the fact that host
data was relatively stable and did not change often. However, DHCP and mobile hosts
have changed the rules. BIND is desperately trying to cope by providing the dynamic
update and incremental zone transfer mechanisms described starting on page 447.
For more information about TTLs and a concept called TTL harmonization, see
page 390.
6. Prior to BIND 8.2, the minimum parameter set the default time to live for resource records. It was
included with each record and used to expire the cached records on nonauthoritative servers.
NS records
NS (name server) records identify the servers that are authoritative for a zone (that
is, all the master and slave servers) and delegate subdomains to other organizations.
NS records usually follow the SOA record.
The format is
zone [ttl] IN NS hostname
For example:
cs.colorado.edu.  IN  NS  ns.cs.colorado.edu.
cs.colorado.edu.  IN  NS  anchor.cs.colorado.edu.
cs.colorado.edu.  IN  NS  ns.cs.utah.edu.
Since the zone name is the same as the name field of the SOA record that precedes
these NS records, it can be left blank. Thus, the lines
    IN  NS  ns.cs.colorado.edu.
    IN  NS  anchor.cs.colorado.edu.
    IN  NS  ns.cs.utah.edu.
immediately following the SOA record for cs.colorado.edu are equivalent.
To be visible to the outside world, an authoritative server of cs.colorado.edu should
be listed both in the zone file for cs.colorado.edu and in the file for the parent zone,
colorado.edu. Caching-only servers cannot be authoritative; do not list them. No
parameter in the NS records specifies whether a server is a master or a slave. That
information is specified in the name server configuration file.
BIND uses a zone's NS records to identify slave servers when it wants to send out
notifications of changes to the zone. Those same NS records inside the parent zone
(colorado.edu) define the cs subdomain and delegate authority for it to the
appropriate name servers. If the list of name servers in the parent zone is not kept up to date
with those in the zone itself, any new servers that are added become "stealth servers"
and are not used to answer queries from the outside world. This configuration
occurs sometimes through design and sometimes through forgetfulness. It is not
seriously wrong as long as the parent has at least one valid NS record for the child zone.
See page 407 for more information about delegation.
A quick look at our own delegations revealed a major server for colorado.edu that the
edu domain knew nothing about. Do as we say and not as we do: check your
delegations with dig to be sure they specify an appropriate set of servers (see page 473).
A records
A (address) records are the heart of the DNS database. They provide the mapping
from hostnames to IP addresses that was formerly specified in the /etc/hosts file. A
host usually has one A record for each of its network interfaces. The format is
hostname [ttl] IN A ipaddr
For example:
anchor  IN  A  128.138.243.100
A machine with multiple network interfaces can use a single hostname associated
with all interfaces or can have separate hostnames for each interface.
PTR records
PTR (pointer) records perform the reverse mapping from IP addresses to hostnames.
As with A records, a host must have one PTR record for each network interface.
Before we describe PTR records, however, we need to digress and talk about a special
top-level domain called in-addr.arpa.
Fully qualified hostnames can be viewed as a notation in which the "most significant
part" is on the right. For example, in the name anchor.cs.colorado.edu, anchor is in
cs, cs is in colorado, and colorado is in edu. IP addresses, on the other hand, have the
"most significant part" on the left. In the address 128.138.243.100, host 100 is on
subnet 243, which is part of network 128.138.
The in-addr.arpa domain was created to allow one set of software modules and one
naming tree to map from IP addresses to hostnames as well as from hostnames to IP
addresses. Domains under in-addr.arpa are named like IP addresses with their bytes
reversed. For example, the zone for our 243 subnet is 243.138.128.in-addr.arpa.
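The byte reversal is mechanical, as the following Python sketch shows. The function name reverse_name and its octets parameter are hypothetical; the standard ipaddress module is used only to validate the address.

```python
import ipaddress


def reverse_name(addr, octets=4):
    """Return the in-addr.arpa name for an IPv4 address.

    With octets=4 the result names a single host; with octets=3 it names
    the reverse zone of a network subnetted on a byte boundary.
    """
    if not 1 <= octets <= 4:
        raise ValueError("octets must be between 1 and 4")
    parts = str(ipaddress.IPv4Address(addr)).split(".")[:octets]
    return ".".join(reversed(parts)) + ".in-addr.arpa."
```

reverse_name("128.138.243.100") gives 100.243.138.128.in-addr.arpa., and reverse_name("128.138.243.100", octets=3) gives the zone name 243.138.128.in-addr.arpa. used in this section.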
The general format of a PTR record is
addr [ttl] IN PTR hostname
For example, the PTR record in the 243.138.128.in-addr.arpa zone that corresponds
to anchor's A record above is
100  IN  PTR  anchor.cs.colorado.edu.
The name 100 does not end in a dot and therefore is relative. But relative to what? Not
"cs.colorado.edu.". For this sample record to be accurate, the default domain has to
be "243.138.128.in-addr.arpa.".
You can set the domain by putting the PTR records for each subnet in their own file,
as in this example. The default domain associated with the file is set in the name
server configuration file. Another way to do reverse mappings is to include records
such as
100.243  IN  PTR  anchor.cs.colorado.edu.
with a default domain of 138.128.in-addr.arpa. Some sites put all reverse records in
the same file and use $ORIGIN directives to specify the subnet. Note that the
hostname anchor.cs.colorado.edu must end with a dot to prevent 138.128.in-addr.arpa
from being appended to its name.
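The rule for relative names can be stated in a few lines of Python. This is a sketch of the convention described above, not of any particular name server's parser; the function name qualify is hypothetical.

```python
def qualify(name, origin):
    """Expand a zone-file name against the default domain (origin).

    Names ending in a dot are already fully qualified; '@' stands for
    the origin itself; anything else has the origin appended.
    """
    if not origin.endswith("."):
        raise ValueError("origin must be fully qualified")
    if name == "@":
        return origin
    if name.endswith("."):
        return name
    if origin == ".":
        return name + "."
    return name + "." + origin
```

Both qualify("100", "243.138.128.in-addr.arpa.") and qualify("100.243", "138.128.in-addr.arpa.") expand to the same fully qualified name. Forgetting the trailing dot on the hostname shows the failure mode: qualify("anchor.cs.colorado.edu", "138.128.in-addr.arpa.") returns anchor.cs.colorado.edu.138.128.in-addr.arpa.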
Since cs.colorado.edu and 243.138.128.in-addr.arpa are different regions of the DNS
namespace, they constitute two separate zones. Each zone must have its own SOA
record and resource records. In addition to defining an in-addr.arpa zone for each
real network, you should also define one that takes care of the loopback network,
127.0.0.0.
This all works fine if the subnets are on byte boundaries. But how do you handle the
reverse mappings for a subnet such as 128.138.243.0/26? An elegant hack defined in
RFC2317 exploits CNAME resource records to accomplish this feat; see page 400.
The reverse mappings provided by PTR records are used by any program that
authenticates inbound network traffic. For example, sshd may allow remote logins
without a password if the machine of origin is listed, by name, in a user's ~/.shosts
file. When the destination host receives a connection request, it knows the source
machine only by IP address. It uses DNS to convert the IP address to a hostname,
which is then compared to the appropriate file. netstat, tcpd, sendmail, sshd, X
Windows, and ftpd all do reverse mappings to get hostnames from IP addresses.
It is important that A records match their corresponding PTR records. Mismatched
and missing PTR records cause authentication failures that can slow your system to
a crawl. This problem is annoying in itself; it can also facilitate denial of service
attacks against any application that requires the reverse mapping to match the A record.
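A consistency check between forward and reverse data can be sketched without touching the network. The Python example below compares two in-memory tables; the function name mismatched_ptrs and the dictionary layouts are assumptions for illustration, and the address shown for piper is made up. A real audit would read the zone files or query a name server.

```python
def mismatched_ptrs(a_records, ptr_records):
    """Report A records whose address has a missing or different PTR.

    a_records maps a hostname to a list of IP addresses; ptr_records
    maps an IP address to a hostname. Both layouts are hypothetical.
    """
    problems = []
    for host, addrs in a_records.items():
        for addr in addrs:
            target = ptr_records.get(addr)
            if target is None:
                problems.append((host, addr, "missing PTR"))
            elif target.rstrip(".").lower() != host.rstrip(".").lower():
                problems.append((host, addr, "PTR names " + target))
    return problems


forward = {
    "anchor.cs.colorado.edu.": ["128.138.243.100"],
    "piper.cs.colorado.edu.": ["128.138.243.101"],   # made-up address
}
reverse = {"128.138.243.100": "anchor.cs.colorado.edu."}
```

With these tables, mismatched_ptrs(forward, reverse) reports a single problem: piper's address has no PTR record.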
MX records
The mail system uses mail exchanger records to route mail more efficiently. An MX
record preempts the destination of a message, in most cases directing it to a mail hub
at the recipient's site rather than to the recipient's own workstation.
The format of an MX record is
name [ttl] IN MX preference host ...
Two examples are shown below, one for a host that receives its own mail unless it is
down, and one for a host that can't receive mail at all:
piper   IN  MX  10  piper
        IN  MX  20  mailhub
        IN  MX  50  boulder.colorado.edu.
xterm1  IN  MX  10  mailhub
        IN  MX  20  anchor
        IN  MX  50  boulder.colorado.edu.
Hosts with low preference values are tried first: 0 is the most desirable, and 65,535 is
as bad as it gets. In this example, mail addressed to bob@xterm1 would be sent to
mailhub if it were accessible, to anchor as a second choice, and if both mailhub and
anchor were down, to boulder. Note that boulder's name must be fully qualified since
it is not a member of the default zone (here, "cs.colorado.edu.").
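The selection rule (lowest preference first, falling back when a host is unreachable) can be sketched as follows. The function names and the is_up callback are hypothetical; a real mailer such as sendmail applies additional rules that are not modeled here.

```python
def delivery_order(mx_records):
    """Return MX hosts sorted from most to least preferred.

    mx_records is a list of (preference, host) pairs.
    """
    for preference, _host in mx_records:
        if not 0 <= preference <= 65535:
            raise ValueError("preference out of range: %r" % preference)
    return [host for _pref, host in sorted(mx_records, key=lambda r: r[0])]


def first_reachable(mx_records, is_up):
    """Return the most preferred host for which is_up(host) is true."""
    for host in delivery_order(mx_records):
        if is_up(host):
            return host
    return None


# The MX records for xterm1 from the example above.
xterm1 = [(50, "boulder.colorado.edu."), (10, "mailhub"), (20, "anchor")]
```

delivery_order(xterm1) lists mailhub, anchor, and then boulder.colorado.edu.; if mailhub is down, first_reachable picks anchor.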
The list of preferences and hosts can all be on the same line, but separate lines are
easier to read. Leave numeric "space" between preference values so you don't have to
renumber if you need to squeeze in a new destination.
MX records are useful in many situations:
· When you have a central mail hub
· When the destination host is down
· When the destination host isn't directly reachable from the Internet
· When the destination host doesn't speak SMTP
· When the local sysadmin knows where mail should be sent better than
  your correspondents do
In the first of these situations, mail is routed to the mail hub, the machine where most
users read mail. In the second case, mail is routed to a nearby host and forwarded
when the destination comes back up.
Hosts that are not directly accessible from the (public) Internet can still have MX
records. Such MX-only hosts might be machines behind a firewall, domain names
hosted by an ISP or hosting service, or machines that are not turned on all the time.
sendmail can't connect to the destination host, but it can get the mail closer by
connecting to one of the destination's MX hosts.
The final and most important reason to use MX records is that the local sysadmins
probably know the mail architecture much better than your correspondents. They
need to have the final say on how your site channels its mail stream.
Every host that the outside world knows about should have MX records. For minor
hosts, one or two alternates are enough. A major host should have several records.
For example, the following set of records might be appropriate for a site at which
each host sends and receives its own mail:
· One for the host itself, as first choice
· A departmental mail hub as second choice
· A central mail hub for the domain or parent domain as a backup
The domain itself should have an MX record to a mail hub machine so that mail to
user@domain will work. Of course, this configuration does require that user names
be unique across all machines in the domain. For example, to be able to send mail to
evi@cs.colorado.edu, we need a machine called cs, MX records in cs.colorado.edu,
or perhaps both.
cs  IN  MX  10  mailhub.cs.colorado.edu.
    IN  MX  20  anchor.cs.colorado.edu.
    IN  MX  50  boulder.colorado.edu.
A machine that accepts mail for another host must list that other host in its sendmail
configuration files; see page 574 for a discussion of sendmail's use_cw_file feature
and the file local-host-names.
Wild card MX records are also sometimes seen in the DNS database:
*  IN  MX  10  mailhub.cs.colorado.edu.
At first glance, this record seems like it would save lots of typing and add a default
MX record for all hosts. But wild card records don't quite work as you might expect.
They match anything in the name field of a resource record that is not already listed
as an explicit name in another resource record.
Thus, you cannot use a star to set a default value for all your hosts. But perversely, you
can use it to set a default value for names that are not your hosts. This setup causes
lots of mail to be sent to your hub only to be rejected because the hostname
matching the star really does not belong to your domain. Ergo, avoid wild card MX records.
CNAME records
CNAME records assign additional names to a host. These nicknames are commonly
used either to associate a function with a host or to shorten a long hostname. The
real name is sometimes called the canonical name (hence, "CNAME").
Some examples:
ftp  IN  CNAME  anchor
kb   IN  CNAME  kibblesnbits
The format of a CNAME record is
nickname [ttl] IN CNAME hostname
When the DNS software encounters a CNAME record, it stops its query for the
nickname and switches to the real name. If a host has a CNAME record, other records (A,
MX, NS, etc.) for that host must refer to its real name, not its nickname.[7] For example,
the following lines are OK:
colo-gw  IN  A      128.138.243.25
moogie   IN  CNAME  colo-gw
www      IN  CNAME  moogie
However, assigning an address or mail priority (with an A or MX record) to either
www or moogie in this example would be wrong.
CNAME records can nest eight deep in BIND. That is, a CNAME record can point to
another CNAME, and that CNAME can point to a third CNAME, and so on, up to
seven times; the eighth target must be the real hostname with an A record.
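Chasing a chain of nicknames with a depth limit looks roughly like the Python sketch below. The dictionaries stand in for zone data, the function name resolve_alias is hypothetical, and treating the limit as eight CNAME links is our reading of the rule above rather than a statement about BIND's internals.

```python
MAX_CNAME_LINKS = 8   # assumed interpretation of "eight deep"


def resolve_alias(name, cnames, addresses, max_links=MAX_CNAME_LINKS):
    """Follow CNAMEs from `name` to a real hostname and its address.

    cnames maps a nickname to its target; addresses maps a real
    hostname to an IP address. Raises LookupError on a chain that is
    too long (which also catches loops) or that ends without an A record.
    """
    current = name
    links = 0
    while current in cnames:
        if links == max_links:
            raise LookupError("CNAME chain longer than %d links" % max_links)
        current = cnames[current]
        links += 1
    if current not in addresses:
        raise LookupError("no A record for %r" % current)
    return current, addresses[current]


cnames = {"www": "moogie", "moogie": "colo-gw"}
addresses = {"colo-gw": "128.138.243.25"}
```

Using the sample records, resolve_alias("www", cnames, addresses) follows two links and returns colo-gw with its address 128.138.243.25.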
Usually you can avoid CNAMEs altogether by just using A records for the host's real
name and its nicknames.
7. This rule for CNAMEs was explicitly relaxed for DNSSEC, which adds digital signatures to each DNS
resource record set. The RRSIG record for the CNAME refers to the nickname.
The CNAME hack
See page 287 for more information about CIDR.
CNAMEs are also used to torture the existing semantics of DNS into supporting
reverse zones for networks that are not subnetted on a byte boundary. Before CIDR
addressing was commonplace, most subnet assignments were on byte boundaries or
within the same organization, and the reverse delegations were easy to manage. For
example, if the class B network 128.138 was subnetted into a set of class C-like
networks, each subnet would make a tidy package for the in-addr.arpa domain. The
reverse zone for the 243 subnet would be 243.138.128.in-addr.arpa.
But what happens if the 243 subnet is further divided into, say, four pieces as a /26
network? If all four pieces are assigned to the same organization, there is actually no
problem. The four subnets can still share a single file that contains all their PTR
records. However, if the 243 subnet is assigned to an ISP that wants to delegate each
/26 network to a different customer, a more complicated solution is necessary. The
ISP must either maintain the reverse records on behalf of each client, or it must find a
way to take the third octet of the IP address (243 in this case) and divide it into four
different pieces that can be delegated independently.
When an administrative boundary falls in the middle of a byte, you have to be sneaky.
You must also work closely with the domain above or below you. The trick is this: for
each possible host address in the natural in-addr.arpa zone, add a CNAME that
deflects the lookup to a zone controlled by the owner of the appropriate subnet. This
scheme makes for messy zone files on the parent, but it does let you delegate
authority to the actual users of each subnet.
Here is the scheme in gory detail. The parent organization (in our case, the ISP)
creates CNAME records for each possible IP address with an extra fake component
(dot-separated chunk) that represents the subnet. For example, in the /26 scenario just
described, the first quarter of the addresses would have a "0-63" component, the
second quarter would have a "64-127" component, and so on. Here's what it looks like:
$ORIGIN 243.138.128.in-addr.arpa.
1   IN  CNAME  1.0-63
2   IN  CNAME  2.0-63
...
63  IN  CNAME  63.0-63
64  IN  CNAME  64.64-127
65  IN  CNAME  65.64-127
...
To delegate the 0-63 piece of the reverse zone to the customer that has been assigned
that subnet, we'd add the following NS records:
0-63  IN  NS  ns1.customer1.com.
0-63  IN  NS  ns2.customer1.com.
...
customer1.com's site would have a zone file that contained the reverse mappings for
the 0-63.243.138.128.in-addr.arpa zone.
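Because the parent zone needs one CNAME per address, these records are usually generated rather than typed. The Python sketch below produces the owner/target pairs for a /24 reverse zone split into equal subnets, using the "low-high" labels shown above. The function name cname_hack_records is hypothetical, and the sketch emits entries for all 256 addresses, including the network and broadcast addresses that the excerpt above skips.

```python
def cname_hack_records(prefix_len):
    """Return (owner, target) CNAME pairs for a /24 reverse zone that is
    split into subnets of the given prefix length (25 through 32)."""
    if not 25 <= prefix_len <= 32:
        raise ValueError("prefix length must be between 25 and 32")
    size = 2 ** (32 - prefix_len)          # addresses per subnet
    records = []
    for host in range(256):
        low = (host // size) * size        # first address of host's subnet
        label = "%d-%d" % (low, low + size - 1)
        records.append((str(host), "%d.%s" % (host, label)))
    return records


def format_records(records):
    """Render the pairs as zone-file lines relative to the current origin."""
    return ["%s\tIN\tCNAME\t%s" % (owner, target) for owner, target in records]
```

For prefix_len=26, host 63 maps to 63.0-63 and host 64 maps to 64.64-127, matching the listing above. The NS records that delegate each "low-high" label still have to be added by hand or by a similar helper.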