
SPECIAL REPORT

JANUARY 2010

Networking Deep Dive

Copyright 2010 InfoWorld Media Group. All rights reserved.


The networking primer


Everything you need to know about building solid, reliable networks

By Paul Venezia

ALTHOUGH ALMOST EVERY PART of a modern datacenter can be
considered mission critical, the network is the absolute
foundation of all communications. That's why it must be
designed and built right the first time. After all, the best
servers and storage in the world can't do anything without
a solid network.
To that end, here are a variety of design points and
best practices to help tighten up the bottom end.

CORE CONSIDERATIONS
The term network applies to everything from LAN
to SAN to WAN. All these variations require a network
core, so let's start there.
The size of the organization will determine the size
and capacity of the core. In most infrastructures, the
datacenter core is constructed differently from the LAN
core. If we take a hypothetical network that has to serve
the needs of a few hundred or a thousand users in a
single building, with a datacenter in the middle, it's not
uncommon to find that there are big switches in the
middle and aggregation switches at the edges.
Ideally, the core is composed of two modular switching platforms that carry data from the edge over gigabit
fiber, located in the same room as the server and storage
infrastructure. Two gigabit fiber links to a closet of, say,
100 switch ports is sufficient for most business purposes.
In the event that it's not, you're likely better off bonding
multiple 1Gbit links rather than upgrading to 10G for
those closets. As 10G drops in price, this will change, but
for now it's far cheaper to bond several 1Gbit ports than
to add 10G capability to both the core and the edge.
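
To put rough numbers on the uplink sizing discussed above, here is a minimal sketch that computes the oversubscription ratio of a wiring closet. The function name and the assumption that every edge port runs at 1Gbit are illustrative and do not come from the article.

```python
def oversubscription_ratio(edge_ports, edge_gbps, uplinks, uplink_gbps):
    """Return total edge capacity divided by total uplink capacity."""
    edge_capacity = edge_ports * edge_gbps
    uplink_capacity = uplinks * uplink_gbps
    return edge_capacity / uplink_capacity

# A closet of 100 gigabit ports with two gigabit fiber uplinks: 50:1.
two_links = oversubscription_ratio(100, 1.0, 2, 1.0)
# Bonding four 1Gbit uplinks instead of upgrading to 10G: 25:1.
four_links = oversubscription_ratio(100, 1.0, 4, 1.0)
# A single 10G uplink, for comparison: 10:1.
ten_gig = oversubscription_ratio(100, 1.0, 1, 10.0)
```

The ratio only describes worst-case contention. Whether 50:1 is acceptable depends on how much of the closet is busy at the same time.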
In the likely event that VoIP will be deployed, it may
be beneficial to implement small modular switches at
the edge as well, allowing PoE (Power over Ethernet)
modules to be installed in the same switch as the non-PoE ports. Alternatively, deploying trunked PoE ports to
each user is also a possibility. This allows a single port to
be used for VoIP and desktop access tasks.
In the familiar hub-and-spoke model, the core connects to the edge aggregation switches with at least two
links, either connecting to the server infrastructure with
direct copper runs or through server aggregation switches
in each rack. This decision must be determined site by
site, due to the distance limitations of copper cabling.
Either way, it's cleaner to deploy server aggregation
switches in each rack and run only a few fiber links
back to the core than try to shoehorn everything into a
few huge switches. In addition, using server aggregation
switches will allow redundant connections to redundant
cores, which will eliminate the possibility of losing server
communications in the event of a core switch failure. If
you can afford it and your layout permits it, use server
aggregation switches.
Regardless of the physical layout method, the core
switches need to be redundant in every possible way:
redundant power, redundant interconnections, and
redundant routing protocols. Ideally, they should have
redundant control modules as well, but you can make
do without them if you can't afford them.
Core switches will be responsible for switching nearly
every packet in the infrastructure, so they need to be
balanced accordingly. It's a good idea to make ample
use of HSRP (Hot Standby Router Protocol) or VRRP
(Virtual Router Redundancy Protocol). These allow
two discrete switches to effectively share a single IP and
MAC address, which is used as the default route for a
VLAN. In the event that one core fails, those VLANs
will still be accessible.
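
The failover behavior described above can be modeled in a few lines. This is a simplified sketch of VRRP-style master election (highest priority wins, ties broken by the higher IP address) and of the IPv4 VRRP virtual MAC format; the switch names, addresses, and priorities are hypothetical, and real implementations also involve advertisement timers and preemption settings that are not modeled here.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class CoreSwitch:
    name: str
    ip: str            # the switch's own address on the VLAN (hypothetical)
    priority: int      # higher priority wins the election
    alive: bool = True

def vrrp_virtual_mac(vrid):
    """IPv4 VRRP virtual MAC address: 00-00-5E-00-01-{VRID}."""
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be between 1 and 255")
    return "00-00-5E-00-01-%02X" % vrid

def elect_master(switches):
    """Pick the live switch with the highest priority; break ties on the higher IP."""
    live = [s for s in switches if s.alive]
    if not live:
        return None
    return max(live, key=lambda s: (s.priority, int(ipaddress.ip_address(s.ip))))

core_a = CoreSwitch("core-a", "10.0.10.2", priority=110)
core_b = CoreSwitch("core-b", "10.0.10.3", priority=100)

master_before = elect_master([core_a, core_b]).name
core_a.alive = False                      # simulate a core switch failure
master_after = elect_master([core_a, core_b]).name
```

Because hosts on the VLAN only ever see the shared virtual IP and MAC, they keep the same default route when `core-b` takes over.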
Finally, proper use of STP (Spanning-Tree Protocol) is
essential to proper network operation. A full discussion
of gateway redundancy and spanning tree is beyond the
scope of this guide, but correct configuration of these two
elements will have a significant effect on the resiliency and
proper operation of any Layer-3 switched network.
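
One concrete reason spanning-tree configuration matters is root bridge election: the bridge with the lowest bridge ID (priority first, then MAC address) becomes the root. The sketch below illustrates only that comparison; the switch names, priorities, and MAC addresses are made up for the example.

```python
def bridge_id(priority, mac):
    """Return a comparable bridge ID: priority first, then the MAC as an integer."""
    return (priority, int(mac.replace(":", ""), 16))

def elect_root(bridges):
    """bridges maps a switch name to (priority, mac); the lowest bridge ID wins."""
    return min(bridges, key=lambda name: bridge_id(*bridges[name]))

bridges = {
    "core-a":   (4096,  "00:1a:2b:00:00:01"),
    "core-b":   (8192,  "00:1a:2b:00:00:02"),
    "closet-1": (32768, "00:0c:11:00:00:09"),  # default priority, lowest MAC
}
root = elect_root(bridges)

# If every switch is left at the default priority, the lowest MAC wins,
# which here is an edge switch rather than a core switch.
defaults = {name: (32768, mac) for name, (_, mac) in bridges.items()}
accidental_root = elect_root(defaults)
```

Setting explicit priorities on the two cores keeps the root, and therefore the shape of the forwarding tree, where the design intends it to be.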


MINDING THE STORAGE
Once the core has been built, you can take on storage
networking. Although other technologies are available,
when you link servers to storage arrays, your practical
choice will probably boil down to a familiar one: Fibre
Channel or iSCSI?
Fibre Channel is generally faster and delivers lower
latency than iSCSI, but it's not truly necessary for most
applications. Fibre Channel requires specific FC switches
and costly FC HBAs in each server (ideally two for
redundancy), while iSCSI can perform quite well with
standard gigabit copper ports. Unless you have transaction-oriented applications such as large databases with thousands of users, you can probably choose iSCSI without
affecting performance and save a bundle.
Fibre Channel networks are separate from the rest of
the network. They exist all on their own, linked only
to the main network via management links that do not
carry any transactional traffic. iSCSI networks can be
built using the same Ethernet switches that handle normal network traffic, although iSCSI networks should
be confined to their own VLAN at the least, and possibly built on a specific set of Ethernet switches that
separate this traffic for performance reasons.
Make sure to choose the switches used for an iSCSI
storage network carefully. Some vendors sell switches that
perform well with a normal network load but bog down
with iSCSI traffic due to the internal structure of the
switch itself. Generally, if a switch claims to be enhanced
for iSCSI, it will perform well with an iSCSI load.
Either way, your storage network should mirror the
main network and be as redundant as possible: redundant switches and redundant links from the servers
(whether FC HBAs, standard Ethernet ports, or iSCSI
accelerators). Servers do not appreciate having their storage suddenly disappear, so redundancy here is at least
as important as it is for the network at large.

GOING VIRTUAL
Speaking of storage networking, you're going to need
some form of it if you plan on running enterprise-level
virtualization. The ability for virtualization hosts to migrate
virtual servers across a virtualization farm absolutely
requires stable and fast central storage. This can be FC,
iSCSI, or even NFS in most cases, but the key is that all the
host servers can access a reliable central storage network.

Networking virtualization hosts isn't like networking a normal server, however. While a normal server
might have a front-end and a back-end link, a virtualization host might have six or more Ethernet interfaces.
One reason is performance: A virtualization host pushes
more traffic than a normal server due to the simple fact
that as many as dozens of virtual machines are running
on a single host. The other reason is redundancy: With
so many VMs on one physical machine, you don't want
one failed NIC to take a whole bunch of virtual servers
offline at once.
To combat this problem, virtualization hosts should
be constructed with at least two dedicated front-end
links, two back-end links, and ideally a single management link. If this infrastructure will service hosts that live
in semi-secure networks (such as a DMZ), then it may
be reasonable to add physical links for those networks
as well, unless you're comfortable passing semi-trusted
packets through the core as a VLAN. Physical separation
is still the safest bet and less prone to human error. If
you can physically separate that traffic by adding interfaces to the virtualization hosts, then do so.
Each pair of interfaces should be bonded using some
form of link aggregation, such as LACP (Link Aggregation Control Protocol, standardized as IEEE 802.3ad) or a static or vendor-specific equivalent. Either should suffice,
though your switch may support only one form or the
other. Bonding these links establishes load-balancing
as well as failover protection at the link level, and is an
absolute requirement, especially since you'd be hard-pressed to find a switch that doesn't support it.
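
Load-balancing across a bonded pair is typically done per flow rather than per packet: a hash of addressing fields selects one member link. The sketch below illustrates the idea with a CRC32 hash of source and destination IP; this particular hash and the addresses are assumptions for illustration only, since actual hashing policies vary by vendor and configuration.

```python
import zlib

def pick_link(src_ip, dst_ip, link_count):
    """Map a flow to one member link of the bond (illustrative hash only)."""
    key = f"{src_ip}->{dst_ip}".encode()
    return zlib.crc32(key) % link_count

LINKS = 2

# The same source/destination pair always lands on the same member link,
# so a single conversation never exceeds the speed of one link.
same_flow = {pick_link("10.0.20.11", "10.0.30.5", LINKS) for _ in range(1000)}

# Many different clients talking to one server can spread across the links.
many_flows = {pick_link(f"10.0.20.{host}", "10.0.30.5", LINKS)
              for host in range(1, 101)}
```

This is why bonding helps aggregate throughput and failover, while a single host-to-array conversation may see no speedup.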
In addition to bonding these links, the front-end
bundle should be trunked with 802.1Q. This allows
multiple VLANs to exist on a single logical interface
and makes deploying and managing virtualization farms
significantly simpler. You can then deploy virtual servers
on any VLAN or mix of VLANs on any host without
worrying about virtual interface configuration. You also
don't need to add physical interfaces to the hosts just
to connect to a different VLAN.
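
For readers who want to see what trunking adds on the wire, here is a minimal sketch of the 4-byte 802.1Q tag: a 16-bit TPID of 0x8100 followed by a 3-bit priority, a 1-bit drop-eligible flag, and a 12-bit VLAN ID. The helper names and the sample frame are hypothetical.

```python
import struct

TPID_8021Q = 0x8100

def dot1q_tag(vlan_id, priority=0, dei=0):
    """Build the 4-byte 802.1Q tag: TPID, then PCP (3 bits), DEI (1 bit), VID (12 bits)."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("usable VLAN IDs are 1-4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority must be 0-7")
    if dei not in (0, 1):
        raise ValueError("dei must be 0 or 1")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", TPID_8021Q, tci)

def tag_frame(frame, vlan_id, priority=0):
    """Insert the tag after the destination and source MACs (first 12 bytes)."""
    return frame[:12] + dot1q_tag(vlan_id, priority) + frame[12:]

# Placeholder frame: zeroed MAC addresses, IPv4 EtherType, dummy payload.
untagged = bytes(12) + b"\x08\x00" + b"payload"
tagged = tag_frame(untagged, vlan_id=30)
```

Because the VLAN travels inside each frame, one bonded pair of links can carry traffic for any VLAN that a virtual server needs.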
The virtualization host storage links don't necessarily need to be either bonded or trunked unless your
virtual servers will be communicating with a variety of
back-end storage arrays. In most cases, a single storage
array will be used, and bonding these interfaces will not
necessarily result in performance improvements on a
per-server basis.


However, if you require significant back-end server-to-server communication, such as front-end Web servers
and back-end database servers, it's advisable to dedicate
that traffic to a specific set of bonded links. They will
likely not need to be trunked, but bonding those links
will again provide load-balancing and redundancy on a
host-by-host basis.
While a dedicated management interface isn't truly
a requirement, it can certainly make managing virtualization hosts far simpler, especially when modifying
network parameters. Modifying links that also carry the
management traffic can easily result in a loss of communication to the virtualization host.
So if you're keeping count, you can see how you
might have seven or more interfaces in a busy virtualization host. Obviously, this increases the number of
switchports required for a virtualization implementation,
so plan accordingly. The increasing popularity of 10G
networking and the dropping cost of 10G interfaces
may enable you to drastically reduce the cabling requirements so that you can simply use a pair of trunked and
bonded 10G interfaces per host with a management
interface. If you can afford it, do it.
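
The interface count above is easy to tally when planning switchports. This is a small sketch under stated assumptions: the link categories follow the text, while the farm size of eight hosts is an arbitrary example.

```python
# Links per host in the 1Gbit design described in the text.
GIGABIT_DESIGN = {
    "front-end (bonded, trunked)": 2,
    "back-end storage": 2,
    "server-to-server (bonded)": 2,
    "management": 1,
}

# The consolidated alternative: a trunked, bonded 10G pair plus management.
TEN_GIG_DESIGN = {
    "10G (bonded, trunked)": 2,
    "management": 1,
}

def ports_per_host(design):
    """Total physical interfaces one host needs under a given design."""
    return sum(design.values())

def switchports_needed(host_count, design):
    """Total switchports for a farm of identical hosts."""
    return host_count * ports_per_host(design)

HOSTS = 8  # hypothetical farm size
gigabit_ports = switchports_needed(HOSTS, GIGABIT_DESIGN)
ten_gig_ports = switchports_needed(HOSTS, TEN_GIG_DESIGN)
```

For this example farm, the gigabit design needs 56 switchports and the 10G design needs 24, which is where the cabling savings come from.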

WIDE-AREA NETWORKING
When organizations have multiple locations, connecting
those locations with fast and reliable links can have a significant impact on the users at those sites. Unfortunately,
no tried-and-true method of WAN interconnectivity can
be applied to every organization. The approach you use
depends on the services available at the main datacenter
and the remote office site.
In an ideal situation, both sites are served by a single
carrier that can drop in fiber links at each location. This
will provide the highest bandwidth and lowest latency
of any solution, and will probably be cheaper to boot. If
this option is available to you, be sure to treat the link as
untrusted and use a VPN across the pipe to encrypt the
traffic, which, after all, will be flowing across someone
else's network.
Without the same carrier on both ends, you'll need
alternative connection methods. The most popular of
these is MPLS (Multiprotocol Label Switching). This is
somewhat related to the traditional frame-relay networking model, but generally offers higher bandwidth
at a lower cost.
An MPLS network is composed of various links
from remote sites into the provider's cloud. In addition,
the main site is linked to the same cloud, usually with
enough bandwidth to ensure that all edge sites can fully
saturate their link without overwhelming the link to the
datacenter. Unlike frame-relay, however, the main site is
not a hub site. This means that traffic between remote
sites does not pass through the main site; it simply flows
through the provider's cloud.
For example, you might have three remote sites with
a single T1 or fractional T3 connection each, and the
main site with either a fiber handoff or a fractional T3
as well. All these sites are connected via routing protocols running on the MPLS routers, sharing information
with the provider's routed network. This means that
even though the MPLS provider is passing the packets,
they all appear to be part of your internal network,
using your internal network numbering, exactly like
other WAN models. Nonetheless, it's still a good idea
to encrypt the traffic passing over an MPLS network to
ensure security.
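
The sizing rule for the main site (enough bandwidth that every edge site can saturate its link at once) reduces to simple arithmetic. The sketch below uses the standard T1 line rate of 1.544 Mbit/s; the fractional T3 rates of 10 and 20 Mbit/s are assumed values chosen for the example.

```python
T1_MBPS = 1.544  # standard T1 line rate

def main_site_headroom(remote_links_mbps, main_link_mbps):
    """Bandwidth left at the main site if every remote site saturates its link.

    A negative result means the main-site link is the bottleneck.
    """
    return main_link_mbps - sum(remote_links_mbps)

# Three remote sites on T1s, main site on an assumed 10 Mbit fractional T3.
t1_sites = [T1_MBPS, T1_MBPS, T1_MBPS]
headroom_t1 = main_site_headroom(t1_sites, 10.0)

# Three remote sites on assumed 10 Mbit fractional T3s, main site at 20 Mbit.
t3_sites = [10.0, 10.0, 10.0]
headroom_t3 = main_site_headroom(t3_sites, 20.0)
```

In the first case the main site has room to spare. In the second it would be oversubscribed if all three sites pulled from the datacenter at full rate.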
In many cases, MPLS providers will assume management control over this network, and take responsibility
for the configuration and maintenance of the routers
running the network. If you don't have the skills necessary to manage an MPLS network in-house, this is
a very low-cost way to implement a WAN. It's also a
double-edged sword, however, because modifications
to the network may take a while to be completed, and
you'll always have a part of your own network that you
cannot access or modify.
Another, less popular method of WAN connectivity
is to use a traditional VPN over Internet circuits. This is
by far the cheapest method, but is also the most problematic from a performance and reliability standpoint.
It's constructed by adding a cable, DSL, or T1 circuit to
each remote office and terminating VPN connections
from those sites to the main site. There is no guarantee
or SLA (service-level agreement) possible for this type
of connection, and the performance of the connection
is subject to the vagaries of Internet communication. It
may be a viable solution for smaller offices, but not for
larger remote sites.
But there's another use for VPN connections of this
sort: adding redundancy to remote site connections.
Should the main link fail for one reason or another, the
VPN connection becomes your temporary failover solution. Moreover, this scenario requires an Internet circuit
at each remote office, which can do double-duty as an
Internet gateway for that site. This can reduce the traffic
across the WAN and lower bandwidth requirements for
each office, decreasing the overall cost of the network.
There is a caveat, however: if you're running some form
of Internet use monitoring, you will need to add those
sites to the monitoring scheme in order to see that traffic.

SECURING THE NETWORK


Network security is the first line of defense for any
organization. The low-hanging fruit here is the ubiquitous firewall, which is an obvious necessity, but security extends
throughout the network to DMZ networks, IPSes (intrusion prevention systems), and so on.
In most cases, the network edge is represented by
one or more firewalls behind one or more Internet circuits. Firewall implementation and configuration is quite
mature these days and doesn't warrant too much attention, but a few architectural points are worth making.
For starters, smaller companies can get away with
terminating client VPN connections directly on their
production firewall, while larger companies should
dedicate a VPN concentrator to this task. Generally, a
multipurpose firewall will have a subset of the management, monitoring, and control capabilities of a dedicated
concentrator, and the concentrator may be simpler to manage in the long
run. Yes, a properly sized firewall can terminate plenty
of VPN connections, but a dedicated VPN appliance is
the better bet for clients, while the production firewall
can be used to terminate any site-to-site VPNs.
Client VPNs take on two main forms: a client-based
IPSec VPN or an SSL VPN. The former is generally
more robust, but can run into problems when clients
are at sites that restrict certain types of traffic, which
prevents proper operation of an IPSec client. SSL VPNs,
on the other hand, communicate over TCP port 443, which
is also used for standard HTTPS Web traffic. An
SSL VPN leverages the fact that most sites will pass TCP
port 443 through unfettered, and thus can be used in
places that may not permit an IPSec VPN. Users don't
need a client pre-installed on their computer to access
an SSL VPN. Instead, they simply use a Web browser to
initiate the VPN, which may require a small on-the-fly
client download.
In practice, most VPN appliances offer both forms of
client VPN connection, and it's a good idea to enable
and configure both in order to permit your remote clients every possible method of connection no matter
where they happen to be.
It's also a good idea to implement several VPN group
types and organize VPN users into these groups. That
way you can impose restrictions on internal resource
access on a group basis, which is a far better method
than putting every user in the same VPN group and
leaving the doors wide open. Unlike the LAN and WAN,
you have no control over where a remote user might be,
and spending a little time ensuring that they have access
to what they need and no more can pay off in spades,
should a laptop be stolen or left unattended.
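
Group-based restriction amounts to a default-deny lookup: a user may reach a destination only if the user's group lists a network containing it. The sketch below models that check; the group names, users, and subnets are hypothetical, and a real VPN appliance would enforce this in its own policy engine rather than in application code.

```python
import ipaddress

# Hypothetical groups and the internal networks each one may reach.
VPN_GROUPS = {
    "sales":          ["10.0.40.0/24"],
    "engineering":    ["10.0.40.0/24", "10.0.50.0/24"],
    "network-admins": ["10.0.0.0/16"],
}

# Hypothetical user-to-group assignments.
USER_GROUP = {"alice": "sales", "bob": "engineering"}

def is_allowed(user, destination):
    """Default deny: allow only destinations inside the user's group networks."""
    group = USER_GROUP.get(user)
    if group is None:
        return False
    dest = ipaddress.ip_address(destination)
    return any(dest in ipaddress.ip_network(net) for net in VPN_GROUPS[group])
```

For example, `is_allowed("alice", "10.0.50.20")` is false while `is_allowed("bob", "10.0.50.20")` is true, so a stolen sales laptop exposes less of the network than an engineering one.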
As far as internal network security goes, it's a good
idea for a network of any reasonable size to deploy
an IDS (intrusion detection system) sensor in critical
locations to flag known attack patterns. Specifically, a
sensor monitoring the traffic just behind the firewall
and another watching the server VLAN would be a
good start.

NETWORK MONITORING
All the best hardware and network designs in the world
can't help you figure out what your network is doing on
a day-to-day basis. Network monitoring tools can, and
no network should be without them. There are myriad
options in this space, both open source and commercial,
and there are key elements that need to be constantly
monitored to ensure proper operation. For instance,
you should be measuring the latency and bandwidth
utilization across all WAN circuits, and retaining that
data for trending purposes to highlight how bandwidth
needs grow over time. Also, keep a close eye on each
VLAN default route, all aggregate links, and CPU usage
in the core switches and aggregation routers, if any. It's
also a good idea to catch STP and HSRP state changes
if applicable.
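
Bandwidth utilization is usually derived from two readings of an interface octet counter taken a known interval apart. The following sketch shows that calculation, including a single counter wrap; the function name and sample readings are illustrative, and collecting the counters (for example over SNMP) is left out.

```python
def utilization_percent(octets_then, octets_now, interval_s, link_bps,
                        counter_bits=32):
    """Percent utilization from two octet-counter readings.

    The modulo handles at most one counter wrap between readings.
    """
    delta_octets = (octets_now - octets_then) % (2 ** counter_bits)
    bits_per_second = delta_octets * 8 / interval_s
    return bits_per_second / link_bps * 100

T1_BPS = 1_544_000

# 28,950,000 octets in five minutes on a T1.
steady = utilization_percent(1_000_000, 29_950_000, 300, T1_BPS)

# Same traffic, but the 32-bit counter wrapped between the two polls.
wrapped = utilization_percent(2**32 - 1000, 28_949_000, 300, T1_BPS)
```

Both readings work out to 50 percent of a T1. Storing each sample with its timestamp gives the trend data needed to see bandwidth needs growing over time.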
It's also a good idea to use the syslogging facilities
present in each switch. Configure a syslog server and
point all your switches at it and you'll have a record of
every pertinent event on a switch-by-switch basis, which
is invaluable when trying to determine what might have
gone wrong during periods of instability.
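
When sifting through collected syslog data, the leading priority value is the quickest filter: it encodes facility times 8 plus severity. A minimal parsing sketch follows; the sample message is invented for illustration, and real devices vary in what follows the priority field.

```python
import re

SEVERITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]

PRI_PATTERN = re.compile(r"^<(\d{1,3})>(.*)$", re.DOTALL)

def parse_syslog(line):
    """Split a raw syslog line into facility, severity, and message text."""
    match = PRI_PATTERN.match(line)
    if not match:
        return None
    pri = int(match.group(1))
    if pri > 191:                      # facilities 0-23, severities 0-7
        return None
    facility, severity = divmod(pri, 8)
    return {
        "facility": facility,
        "severity": SEVERITIES[severity],
        "message": match.group(2).strip(),
    }

# Invented example: facility 23 (local7), severity 5 (notice).
event = parse_syslog("<189>core-a: port 12 link down")
```

Filtering on severity makes it practical to alert on warnings and above while still archiving everything for later review.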
And finally, don't forget to back up all that hard work:
implement some form of configuration management,
whether commercial or open source. Backing up router,
switch, and firewall configurations on a scheduled basis
and being able to produce diffs of them can save massive amounts of time down the road. It's also possible
to notify admins when a configuration changes, which
can head off potential problems at the pass, so to speak.
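
Producing diffs of saved configurations needs nothing more than a standard diff routine. Here is a minimal sketch using Python's `difflib`; the configuration text is a made-up placeholder, and fetching configurations from devices and sending notifications are outside its scope.

```python
import difflib

def config_diff(old, new, name="switch"):
    """Return unified-diff lines between two saved configurations."""
    return list(difflib.unified_diff(
        old.splitlines(), new.splitlines(),
        fromfile=f"{name} (previous)", tofile=f"{name} (current)",
        lineterm=""))

# Placeholder configuration snapshots from two scheduled backups.
previous = "hostname core-a\nvlan 10\nvlan 20\n"
current = "hostname core-a\nvlan 10\nvlan 30\n"

changes = config_diff(previous, current, name="core-a")
needs_notification = bool(changes)   # an empty diff means nothing changed
```

A scheduled job could run this comparison after each backup and alert the admins only when `needs_notification` is true.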
The network is the circulatory system of any
organization, and it needs to be designed, built, and
managed correctly to ensure that the rest of the critical
systems can function properly. The tips and ideas presented here can go a long way toward making all the
trains run on time now and in the future.
Paul Venezia is an InfoWorld senior contributing editor and a
veteran networking consultant.
