
Network Troubleshooting Tools

Kent Reuber ITS Networking April 6, 2007

 What problems do you need to solve?
 Tool descriptions
 Q&A time
 Tool descriptions are in the “Software” section of the LNA Guide: http://lnaguide/software.html

What are the problems?
 Are hosts online? (ping)
 How do you get to hosts? (traceroute)
 What are hosts running? (nmap)
 Where/when have hosts been seen? (ipm)
 “The network is slow” (Netspeed, Iperf)
 DHCP and DNS (SUNet reports)
 Wireless problems (various)
 Packet sniffing (Wireshark) and batch NetDB changes (NetDB CLI)

Ping and traceroute


Are you there?

 Ping sends ICMP echo requests to a host and asks for a reply. The reply time is also reported.
 Some hosts are configured not to reply as a matter of security policy, so a missing reply does not necessarily mean that they’re down.
 Stanford de-prioritizes pings at some of its borders, so long ping times or dropped pings do not necessarily indicate a poor connection.
 Stanford maintains a special host, “ping-me”:
   Exempt from the ping filter.
   Have outside users ping “ping-me” if they claim that connections to Stanford are

Ping for Advanced Users
 Can increase packet size to see duplex errors. (Unix: ping -s)
   Default small (<60 byte) ping packets don’t generate enough traffic to show duplex problems.
   Try using pings of 1000+ bytes.

 Use nmap or similar utility for “ping sweeps” of entire networks:
 “nmap -sP <network range>”
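To make the packet-size point concrete, here is a minimal Python sketch of what an ICMP echo request looks like on the wire and how the payload size changes it. The function names are illustrative, and the 56-byte payload is only an assumed "small default"; the sketch builds the packet but does not send it (sending raw ICMP normally requires administrator privileges).

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMP type number for an echo request


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words (the standard Internet checksum)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_echo_request(ident: int, seq: int, payload_size: int) -> bytes:
    """Build an ICMP echo request carrying payload_size bytes of filler."""
    payload = bytes(i & 0xFF for i in range(payload_size))
    # First pass: checksum field set to 0 while the checksum is computed.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, ident, seq)
    checksum = internet_checksum(header + payload)
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum, ident, seq)
    return header + payload


small = build_echo_request(ident=1, seq=1, payload_size=56)    # assumed small default
large = build_echo_request(ident=1, seq=2, payload_size=1000)  # "1000+ byte" ping
print(len(small), len(large))  # 64 1008 (8-byte ICMP header plus payload)
```

The larger packet carries roughly 16 times as much data per probe, which is why it is more likely to expose a duplex mismatch than the default.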

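Traceroute: How do I get there?
 How traceroute works:
   The source sends a series of packets with increasing time-to-live (TTL) values. (TTL is the allowed number of router hops.) Unix/Mac: UDP; Windows “tracert”: ICMP.
   Each router decrements the TTL and, if it reaches 0, discards the packet and responds with an ICMP “time exceeded” message. The destination itself answers with an ICMP “port unreachable” message (UDP probes) or an echo reply (ICMP probes).
   As with ping, the round-trip time to each responder is reported.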
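The TTL mechanism can be illustrated with a small simulation. This is a sketch only: it models a path as a list of hop names (all hypothetical) and does not send any packets. It assumes UDP-style probes, where routers answer with ICMP "time exceeded" and the destination answers with "port unreachable".

```python
def send_probe(path, ttl):
    """Return (responder, message) for one probe sent with the given TTL.

    path is a list of hop names; the last entry is the destination host.
    """
    for index, hop in enumerate(path):
        if index == len(path) - 1:
            # The probe reached the destination host itself.
            return hop, "port unreachable"
        ttl -= 1  # each router decrements the TTL before forwarding
        if ttl == 0:
            return hop, "time exceeded"
    raise ValueError("path must not be empty")


def traceroute(path, max_ttl=30):
    """Send probes with TTL 1, 2, 3, ... until the destination answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, message = send_probe(path, ttl)
        hops.append((ttl, responder, message))
        if message == "port unreachable":
            break
    return hops


for ttl, responder, message in traceroute(["gateway", "core", "border", "server"]):
    print(ttl, responder, message)
```

Each line of output corresponds to one hop: the probe with TTL 1 dies at the first router, TTL 2 at the second, and so on until the destination replies.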


Traceroute notes
 Routers need not reply to traceroutes. Lack of a reply does not mean that the router is down.
 Return traffic doesn’t necessarily use the same path.
   This can cause problems with firewalls and packet shapers that assume they see the whole conversation.
   When troubleshooting connection problems, you may want to have the destination send traceroutes to you as well.



Scanning nets

 In addition to ping scans, you can scan for open ports on hosts.
 This can be useful for seeing who is running a service (intentionally or otherwise!)
 My recipe for scanning for open TCP ports:
   “nmap -P0 -sT net -p ports -oG - | grep open”
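If you want more than "grep open", the grepable (-oG) output can be post-processed. The sketch below parses that format and keeps only open ports per host. The sample text is hand-written for illustration (documentation addresses, not a real scan), and the function name is hypothetical. Note that newer nmap releases spell -P0 as -Pn and -sP as -sn.

```python
def open_ports(grepable_output):
    """Map each host to its list of (port, protocol) pairs in state 'open'."""
    result = {}
    for line in grepable_output.splitlines():
        if not line.startswith("Host:") or "Ports:" not in line:
            continue  # skip comments and "Status:" lines
        fields = line.split("\t")
        host = fields[0].split()[1]
        ports_field = next(f for f in fields if f.startswith("Ports:"))
        found = []
        for entry in ports_field[len("Ports:"):].split(","):
            # Each entry looks like: port/state/protocol/owner/service/...
            parts = entry.strip().split("/")
            if len(parts) >= 3 and parts[1] == "open":
                found.append((int(parts[0]), parts[2]))
        if found:
            result[host] = found
    return result


# Hand-written sample in grepable format (illustrative only).
SAMPLE = (
    "# Nmap scan (sample)\n"
    "Host: 192.0.2.10 (www.example.org)\tStatus: Up\n"
    "Host: 192.0.2.10 (www.example.org)\tPorts: 22/open/tcp//ssh///, "
    "80/open/tcp//http///, 443/closed/tcp//https///\n"
    "Host: 192.0.2.11 ()\tPorts: 25/filtered/tcp//smtp///\n"
)

print(open_ports(SAMPLE))
```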

Getting nmap
 Download from
 Unix and MacOS X usually require compiling from source.
 Windows binary available.


IPM: IP <-> MAC addresses
 Stanford-specific utility
 How it works:
   Devices broadcast ARP packets when they need to communicate locally.
   Routers see these ARP packets and cache the information.
   The information is periodically harvested and kept in a database.
   Using IPM, you can track when and where an IP/MAC address was first and last seen.

IPM: What’s it good for?

 You can find MAC addresses which aren’t in NetDB.
 Find out where a particular device has been seen.
 See if multiple devices are using a single IP address.

More on IPM
 Where is it:
   AFS: /usr/pubsw/sbin/ipm
   Note: this directory is not in your default PATH.

 Using IPM:
   Wildcards: “_” (single character), “%” (multiple characters)
   Run “ipm -h” to see the list of options.
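The two wildcards can be illustrated with a short sketch that translates a pattern into a regular expression. This assumes the wildcards behave like SQL LIKE patterns ("_" matches exactly one character, "%" matches any run of characters) and that matching ignores case; neither detail is confirmed here for ipm itself, and the function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate '_' and '%' wildcards into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")    # exactly one character
        elif ch == "%":
            parts.append(".*")   # zero or more characters
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)  # case-insensitivity is an assumption


def matches(pattern, value):
    """True if the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None


print(matches("192.0.2._", "192.0.2.5"))           # single-digit last octet
print(matches("192.0.2._", "192.0.2.15"))          # two digits: no match
print(matches("00:0d:93:%", "00:0D:93:12:34:56"))  # any address with this prefix
```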

MAC vendor codes
 MAC addresses are 48 bits (6 bytes), written xx:xx:xx:xx:xx:xx, where each “x” is a hexadecimal digit (0-9, a-f).
 The first 3 bytes are the Organizationally Unique Identifier (OUI), which tells you who made the network card.
 You can look this up. My favorite site:
 Can tell you when NetDB records are outdated. For example, a NetDB record for a Macintosh with
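Extracting the OUI is simple enough to script. The sketch below normalizes a MAC address written in any common notation and returns its first three bytes. The vendor table is a placeholder with made-up entries; a real lookup would use the published IEEE registry.

```python
HEX_DIGITS = "0123456789abcdef"

# Placeholder table: these vendor names are hypothetical, for illustration only.
EXAMPLE_VENDORS = {
    "00:11:22": "Example Vendor A",
    "aa:bb:cc": "Example Vendor B",
}


def oui(mac):
    """Return the first 3 bytes of a MAC address as 'xx:xx:xx' (lower case)."""
    digits = "".join(c for c in mac.lower() if c in HEX_DIGITS)
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return ":".join(digits[i:i + 2] for i in (0, 2, 4))


def vendor(mac, table=EXAMPLE_VENDORS):
    """Look up the OUI in a vendor table; 'unknown' if it is not listed."""
    return table.get(oui(mac), "unknown")


print(oui("00:0D:93:12:34:56"))   # colon notation
print(oui("000d.9312.3456"))      # dotted notation gives the same OUI
print(vendor("00-11-22-33-44-55"))
```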

Netspeed and Iperf

Netspeed & Iperf: Speed testing
 Often hear “the network is slow”.
   Is it the client, the network, or a server?
   Where’s the bottleneck?

 Useful tools:
   Netspeed (Web-based speed test to the campus backbone).
   Iperf (command-line tool for point-to-point tests).

 Web-based speed testing to the Stanford backbone: or /
 Useful for finding duplex errors (misconfigured hubs or switches) in the

 Command-line testing tool.
   Can also run speed tests against and
   Can be run in server mode for testing speed between arbitrary points (e.g., within your network)
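The idea behind a point-to-point test (one end listens, the other sends a known amount of data, and you divide bits by seconds) can be sketched in a few lines of Python. This is not Iperf and does not use its protocol; it is a simplified illustration that runs both ends on the loopback interface, so the number it prints reflects the local machine, not a network path.

```python
import socket
import threading
import time


def run_sink(server_sock, result):
    """Accept one connection and count the bytes received until it closes."""
    conn, _ = server_sock.accept()
    total = 0
    with conn:
        while True:
            chunk = conn.recv(65536)
            if not chunk:
                break
            total += len(chunk)
    result["bytes"] = total


def measure_throughput(total_bytes=2_000_000, host="127.0.0.1"):
    """Send total_bytes over TCP to a local sink; return (bytes received, Mbps)."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind((host, 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    result = {}
    sink = threading.Thread(target=run_sink, args=(server_sock, result))
    sink.start()

    payload = b"\x00" * 65536
    sent = 0
    start = time.perf_counter()
    with socket.create_connection((host, port)) as client:
        while sent < total_bytes:
            chunk = payload[: total_bytes - sent]
            client.sendall(chunk)
            sent += len(chunk)
    sink.join()  # wait until the sink has seen the end of the stream
    elapsed = time.perf_counter() - start
    server_sock.close()

    mbps = result["bytes"] * 8 / elapsed / 1e6
    return result["bytes"], mbps


received, mbps = measure_throughput()
print(f"received {received} bytes at about {mbps:.0f} Mbps")
```

To test between two machines you would run the sink on one host and the sender on the other, which is what Iperf's server mode does for you.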

How fast can you go?
 DSL: 1 Mbps (asymmetric)
 802.11b wireless: 1-5 Mbps
 802.11g wireless: 1-12 Mbps
 Fast Ethernet: 80+ Mbps
 Gigabit: ??
 Note: consider these figures upper bounds. For gigabit especially, you may not be able to transfer real data this fast.
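A quick way to put these rates in perspective is to convert them to transfer times. The sketch below does the arithmetic for a 100 MB file, taking 1 MB as 10^6 bytes and treating each rate as a best case; the file size and the function name are arbitrary choices for illustration.

```python
def transfer_seconds(size_megabytes, rate_mbps):
    """Best-case time to move size_megabytes at rate_mbps (megabits per second)."""
    bits = size_megabytes * 8_000_000  # 1 MB taken as 10^6 bytes = 8 * 10^6 bits
    return bits / (rate_mbps * 1_000_000)


# Rates taken from the list above (upper ends of the quoted ranges).
RATES_MBPS = {
    "DSL": 1,
    "802.11b wireless": 5,
    "802.11g wireless": 12,
    "Fast Ethernet": 80,
}

for link, rate in RATES_MBPS.items():
    print(f"{link}: {transfer_seconds(100, rate):.0f} s for 100 MB")
```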


Troubleshooting DHCP
 Many things can go wrong. Problems are rarely caused by DHCP server unavailability.
 Things to check:
   What IP is the host getting?
   NetDB record for the host.
   DHCP server logs, roaming

Understanding DHCP
 Stanford has two DHCP servers: dusk and dawn.
 Info from NetDB is uploaded approximately every 15 minutes. Give NetDB time to upload data.
 At Stanford, MAC address information is required for successful DHCP.
 Initial DHCP is a four-step process (discover, offer, request, acknowledge) using broadcasts; renewals are different.

 DHCP addresses are valid for a limited period (wired and wireless).
   Normal DHCP: 2 days
   Roaming DHCP: 42 minutes

 Hosts will re-confirm their leases halfway through the lease period.
 Clients use unicast directly to the DHCP server (clients have an address and they know who their server is).
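The lease timing works out as follows. This sketch takes the two lease lengths quoted above and computes when a client would re-confirm (halfway through) and when the lease would expire if never renewed; the start time and the function names are arbitrary.

```python
from datetime import datetime, timedelta

# Lease lengths as quoted above.
LEASES = {
    "normal": timedelta(days=2),
    "roaming": timedelta(minutes=42),
}


def renewal_time(lease_start, lease_length):
    """Clients re-confirm the lease halfway through the lease period."""
    return lease_start + lease_length / 2


def expiry_time(lease_start, lease_length):
    """The lease lapses at the end of the period if it is never renewed."""
    return lease_start + lease_length


start = datetime(2007, 4, 6, 9, 0)
for kind, length in LEASES.items():
    print(kind, "renew at", renewal_time(start, length),
          "expires at", expiry_time(start, length))
```

A roaming client that got its address at 9:00 re-confirms at 9:21 and loses the lease at 9:42, which is why roaming addresses turn over quickly.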

DHCP roaming
 If the NetDB record has a “home” IP address appropriate for the network where the device is located, DHCP servers will send it.
   Can have “home” IP addresses and still be able to roam to other networks.
   Can have multiple “home” addresses bound to each MAC address.

 If no appropriate address is entered, DHCP will look for available roaming addresses on the local network.
   The number of roaming addresses is specified by the LNA and defined in the NetDB network record.
   Usually there are only a handful of roaming addresses, so a network can easily run out of them.

What address did you get?
 The address received may tell you what the problem is.
 Self-assigned (169.254.*.*):
   NetDB record not set up properly.
   No roaming address available.
   Routing or DHCP server problem (less likely).

 10.x.x.x:
   Used by the network self-registration system (SNSR).
   Could also be used by a rogue.

 192.168.*.*:
 Probably a rogue DHCP server.
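The address checks above are easy to script as a first-pass triage. The sketch below maps an address to the likely causes listed on this slide; the function name and the wording of the returned strings are illustrative, and an address outside these ranges says nothing either way.

```python
import ipaddress

SELF_ASSIGNED = ipaddress.ip_network("169.254.0.0/16")
NET_10 = ipaddress.ip_network("10.0.0.0/8")
NET_192_168 = ipaddress.ip_network("192.168.0.0/16")


def diagnose(address):
    """Return the likely explanation for the address a host received."""
    ip = ipaddress.ip_address(address)
    if ip in SELF_ASSIGNED:
        return "self-assigned: check NetDB record, roaming pool, then routing/DHCP server"
    if ip in NET_10:
        return "self-registration system (SNSR) or a rogue DHCP server"
    if ip in NET_192_168:
        return "probably a rogue DHCP server"
    return "no obvious problem from the address alone"


for addr in ("169.254.17.3", "10.30.1.5", "192.168.1.100", "192.0.2.20"):
    print(addr, "->", diagnose(addr))
```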

Finding rogues
 Try pinging the gateway that’s being distributed.
 Use the “arp” command to get the MAC address of the gateway, or use a sniffer if you have one.
 Look at switch MAC tables and find the offending hosts. Shut off the port or go have a “chat”.
 New Net-to-Switch configs block rogue DHCP servers!

Available DHCP reports
 DHCP logs for a given host.
   Type in a MAC address and see the conversation.
   Takes practice to read.

 Roaming address utilization
 How many roaming addresses were used in a day.

 DHCP reports from dusk and dawn
   Hourly logs show the number of DHCP messages for hosts.
   “No free leases” may indicate that you’re out of roaming addresses.

 All reports are linked from LNA Guide software section: http://lnaguide/software.html


DNS at Stanford
 Host information is entered in NetDB
   Uploads to DHCP servers about every 15 minutes.
   Uploads to DNS servers about every hour.
     Starts at 5 minutes after the hour.
     Takes about 20 minutes. Should be done by 30 minutes past the hour.
     Specific info on timing is kept in the NetDB help files.
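The DNS schedule above answers the common question "I just changed a record, when will DNS show it?". The sketch below estimates that time, assuming a change saved by 5 minutes past the hour is included in that hour's upload and anything later waits for the next one. That cutoff behavior is an assumption for illustration; the NetDB help files have the authoritative timing.

```python
from datetime import datetime, timedelta


def dns_visible_by(change_time):
    """Estimate when a NetDB change should be visible in DNS.

    Uploads start at :05 and should be finished by :30 (per the schedule above).
    """
    run_start = change_time.replace(minute=5, second=0, microsecond=0)
    if change_time > run_start:
        run_start += timedelta(hours=1)  # missed this hour's upload
    return run_start.replace(minute=30)


for when in (datetime(2007, 4, 6, 14, 2), datetime(2007, 4, 6, 14, 10)):
    print(when.strftime("%H:%M"), "->", dns_visible_by(when).strftime("%H:%M"))
```

Under that assumption, the worst case is a change made just after the upload starts, which takes nearly an hour and a half to appear.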

DNS inspection tools
 Standard: “host”, “nslookup”, “dig”.
 Stanford whois can show you most NetDB information:
   “whois -h <query>”
   Use “%” and “_” as wildcards, as with ipm.
   Great for people who need “read-only” access, since you don’t need a NetDB account.
   For host names, you need to end the query in a “.” or specify “” so that whois knows you want information on a host.


Wireless problems
 Wireless is slow or unavailable.
 Reports can be vague. “Wireless is slow on the 2nd floor.”
 Isolating the problem can speed resolution.
   Exactly where is the problem occurring?
   What access point is the user connecting to?
   Do others have problems in the

Wireless tools
 Access point association:
   Mac: Internet Connect utility
   PC: ??

 Access point discovery for seeing available APs and channels: NetStumbler, iStumbler
 Iperf and Netspeed are useful for checking speed problems.
 Often, an AP reboot will solve the problem.
   AP jack (tso) information is in NetDB.
   Can unplug and replug if necessary.

Packet sniffer

EtherPeek and Wireshark
 Stanford has a site license for EtherPeek, but it’s still expensive.
 Wireshark (formerly Ethereal) is free. (Motto: “Sniff free or die!”)
   X Window System application for Unix/Mac.
   Binary for Windows.
   Some books are available!

Advice on Sniffing
 You will rarely need a sniffer, but it is invaluable when you do.
   Learn to use it before you need it!

 You will need to set up special “span” ports on your switches to see all traffic.
   Not needed if you’re only interested in broadcasts and multicasts.
   Most useful for seeing traffic entering and leaving your net.

NetDB Command Line

NetDB CLI overview
 Designed for power users.
 Provides a subset of NetDB functionality (mostly nodes) for batch changes. New features are periodically added.
 Use with caution. Try one or two hosts before doing big batches.
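The "try one or two hosts first" advice can be built into whatever script drives a batch. The sketch below is generic and does not call the NetDB CLI (its syntax is not shown here): apply_change and confirm are hypothetical callables you would supply, for example a function that runs your change for one host and a function that asks you to check the trial results.

```python
def staged_apply(hosts, apply_change, confirm, trial_size=2):
    """Apply a change to a few hosts, then to the rest only if confirmed.

    Returns the list of hosts that were actually changed.
    """
    trial, rest = hosts[:trial_size], hosts[trial_size:]
    for host in trial:
        apply_change(host)
    if rest and not confirm(trial):
        return list(trial)  # stop after the trial batch
    for host in rest:
        apply_change(host)
    return list(hosts)


# Illustration with stand-in callables (host names are made up).
changed = []
done = staged_apply(
    ["host-a", "host-b", "host-c", "host-d"],
    apply_change=changed.append,
    confirm=lambda trial: False,  # pretend the trial results looked wrong
)
print(done)
```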

How to run NetDB CLI
 Located in AFS space:
   /usr/pubsw/sbin/netdb (note: this directory is probably not in your PATH)
   Use the -h option to get command syntax

 Stuff you can do (to a single machine or list of machines):
 Change administrators, locations.