
804

1066_05F9_c2 © 1999, Cisco Systems, Inc. 1

Establishing Best
Practices for
Network Management
Session 804


Copyright © 1998, Cisco Systems, Inc. All rights reserved. Printed in USA.
1066_05F9_c2.scr 1
Agenda

• Introduction to Best Practices


• Preparing the Network for Management
• Managing Change
• Fault Management
• Summary


Introduction to
Best Practices

Network Downtime is Costly

• The Internet and e-commerce have significantly increased the availability stakes…
24-hour banking
E-trade
Global economy

[Figure: Infonetics, "Cost of WAN Downtime ’98". Bar chart of average dollars per year ($000,000) comparing downtime costs*, split into revenue loss and productivity loss, with the enterprise network management budget. Labeled values: $4.2M, $3.6M, $3.6M.]

*Due to hard downtime and service degradations



Best Practices Defined

• Applying what works well for others to improve overall network availability
Reduce the time required for planned outages (scheduled change), including changes with no associated outage
Reduce network downtime during unplanned outages (unscheduled change)

Lots of Practices—Some Truths

• Even the best NM products can be useless with “bad” practices
• Tools help you to do your job; they are NOT the job
• Communication and security are the “bread and butter” of best practices

Do What Works for You!

Preparing the Network

Congratulations!

You’ve Just Been Promoted to Manage the Entire Network for the Western Region...


What They’re Really Thinking…

“What am I getting into… how am I going to do this? Where do I begin?”
“I sure hope he lasts longer than the last guy...”
“What a loser! Does he have any idea what he’s in for?”
“How come we don’t have legs?”

Preparing the Network
for Management

Best Practices
1. Selecting the “right” tools
2. Preparing the devices
3. Preparing the tools
4. Building a baseline
5. Maintaining “management”


Selecting the Right Tools

• How do I select the “right” set of management applications?
Understand the technologies and buzzwords
Understand your network and end-user requirements
Implement company standards
Many choices: evaluate and choose what’s right for your environment
Platforms and Vendor Specific
Management
• NMS
SNMP-based, status map, and trap receiver
HP Openview, Tivoli Netview, CA UniCenter, SNMPc, etc.
MicroMuse, Seagate, Concord, Enterprise Pro, and MRTG
• Vendor Specific
Geared towards managing a specific vendor’s devices only
Optivity, Transcend, CiscoWorks2000


Integrating Enterprise
Management

[Figure: three-layer integration diagram. Top layer: Helpdesk, Trouble-ticket, Event MOM. Middle layer: Application, DBMS, Server, Network, Desktop, and User management. Bottom layer: the managed services and devices.]

Understand Your Organization

• Roles and responsibilities
• Escalation policy
• Help desk vs. operations
• Planners vs. administrators

Preparing the Devices

• Security for Management


• Notification
• Baseline

Securing the Devices

• Identify scope of control
Who needs access to what?
• Secure and log access
Physical access (badge readers)
Telnet and console (AAA accounting, Syslog)
SNMP communities (ACL, SNMP traps)

Sample Security Configuration

! Tacacs+
aaa new-model
aaa authentication login test tacacs+ line
aaa authentication enable default tacacs+ enable
! SNMP Community ACL
access-list 8 permit 161.44.34.157
! Syslog
logging 161.44.34.157
logging source-interface Loopback0
! SNMP gets and sets
snmp-server community public RO
snmp-server community bitbuck RW 8
snmp-server contact Paul L. Della Maggiora
snmp-server chassis-id 071293
snmp-server system-shutdown
! SNMP traps
snmp-server trap-source Loopback0
snmp-server trap-authentication
snmp-server host 161.44.34.157 public frame-relay
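
A configuration like the sample can be checked automatically before it is deployed. The sketch below is one possible audit, not a Cisco tool: it flags `snmp-server community` lines that have no access list or that use a well-known default string. The function name `audit_snmp_communities` and the rule set are assumptions made for illustration, and the regular expression only understands the simple `community <string> RO|RW [acl]` form used in the sample.

```python
import re

# Matches only the simple form used in the sample configuration:
#   snmp-server community <string> RO|RW [<acl-number>]
COMMUNITY_LINE = re.compile(r"^snmp-server community (\S+) (RO|RW)(?: (\d+))?\s*$")


def audit_snmp_communities(config_text):
    """Return a list of findings for the SNMP community lines in a config."""
    findings = []
    for line in config_text.splitlines():
        match = COMMUNITY_LINE.match(line.strip())
        if not match:
            continue
        name, mode, acl = match.groups()
        if acl is None:
            findings.append(f"community '{name}' ({mode}) has no access list")
        if name in ("public", "private"):
            findings.append(f"community '{name}' is a well-known default string")
    return findings


SAMPLE = """\
access-list 8 permit 161.44.34.157
snmp-server community public RO
snmp-server community bitbuck RW 8
"""

for finding in audit_snmp_communities(SAMPLE):
    print(finding)
```

Run against the sample lines, the read-write community passes because it is bound to access list 8, while the read-only `public` community is reported twice.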

Security Access Changes

• Password change policy


Quarterly
Every time an employee leaves
• Solution
Use RADIUS or TACACS+
Script the change
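
The policy above can be turned into a simple check that a scheduled script runs before pushing new credentials. This is a minimal sketch under stated assumptions: "quarterly" is approximated as 90 days, and the function name `rotation_due` and its parameters are hypothetical.

```python
from datetime import date

# Assumption: "quarterly" is approximated as 90 days.
ROTATION_DAYS = 90


def rotation_due(last_changed, today, employee_left=False):
    """Return True when the password change policy calls for a rotation."""
    if employee_left:
        # An employee departure forces a change regardless of password age.
        return True
    return (today - last_changed).days >= ROTATION_DAYS


print(rotation_due(date(1999, 1, 1), date(1999, 2, 1)))
print(rotation_due(date(1999, 1, 1), date(1999, 4, 1)))
print(rotation_due(date(1999, 1, 1), date(1999, 1, 2), employee_left=True))
```

With RADIUS or TACACS+ in place, the rotation itself happens once on the server rather than on every device, which is why the slide pairs the two solutions.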

Notification

• SNMP Traps
Critical for NMS notification
• Syslog
Cisco-specific notification

Sample Notification Configuration

! Tacacs+
aaa new-model
aaa authentication login test tacacs+ line
aaa authentication enable default tacacs+ enable
! SNMP Community ACL
access-list 8 permit 161.44.34.157
! Syslog
logging 161.44.34.157
logging source-interface Loopback0
! SNMP gets and sets
snmp-server community public RO
snmp-server community bitbuck RW 8
snmp-server contact Paul L. Della Maggiora
snmp-server chassis-id 071293
snmp-server system-shutdown
! SNMP traps
snmp-server trap-source Loopback0
snmp-server trap-authentication
snmp-server host 161.44.34.157 public frame-relay


Building a Baseline

• Document the network


Maps
Spreadsheets/databases
• Track inventory
Identify equipment and who owns it
• Back up configurations
Building a Baseline

• Collect performance data
Snapshot of the network
Provides historical data for comparison
Useful for capacity planning and trending
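
One way to use historical data for comparison is to summarize past samples and test new readings against that summary. The sketch below assumes utilization samples in percent and a three-standard-deviation tolerance. Both are illustrative choices, and the names `build_baseline` and `deviates` are hypothetical.

```python
from statistics import mean, stdev


def build_baseline(samples):
    """Summarize historical samples as a mean and sample standard deviation."""
    return {"mean": mean(samples), "stdev": stdev(samples)}


def deviates(baseline, value, tolerance=3.0):
    """Return True when a reading is more than `tolerance` deviations from the mean."""
    return abs(value - baseline["mean"]) > tolerance * baseline["stdev"]


# Hypothetical link utilization history, in percent.
history = [40.0, 42.0, 38.0, 41.0, 39.0]
baseline = build_baseline(history)

print(baseline["mean"])
print(deviates(baseline, 43.0))
print(deviates(baseline, 75.0))
```

The same summary, recomputed per week or per month, gives the trend line needed for capacity planning.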


Discovering the Network

• Auto-discovery can make documentation easy… but the daemons must be tamed
Filters
Seedfiles
Discovery intervals
Exchange inventory among multiple autodiscovery tools

Layer 2 Autodiscovery

1. Query seed device via SNMP


2. Query CDP neighbor table (ciscoCdpMIBObjects)
3. Interrogate neighbors
Caveat—CDP only sees Cisco devices

c55k-26 (enable) sho cdp neigh
Capability Codes: R - Router, T - Trans Bridge, B - Source Route Bridge
                  S - Switch, H - Host, I - IGMP, r - Repeater

Port     Device-ID               Port-ID           Platform           Capability
-------- ----------------------- ----------------- ------------------ ----------
4/1      002261261               4/1               WS-C5000           T B S
4/1      002274433               4/1               WS-C5000           T B S
4/1      069004796               4/1               WS-C5500           T B S
4/1      Router_81.130           Ethernet0         cisco 4500         R
4/1      WBU_GATEWAY             Ethernet0         cisco 4500         R
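
The three discovery steps amount to a breadth-first walk that starts at the seed device. The sketch below shows that walk with an optional filter. It does not speak SNMP: the `get_neighbors` callback stands in for a query of the CDP neighbor table, and the neighbor relationships in `NEIGHBORS` are hypothetical apart from the device names taken from the output above.

```python
from collections import deque


def discover(seed, get_neighbors, allowed=lambda device: True):
    """Walk outward from the seed device, returning devices in discovery order."""
    seen = {seed}
    order = [seed]
    queue = deque([seed])
    while queue:
        device = queue.popleft()
        for neighbor in get_neighbors(device):
            # Skip devices already found and devices excluded by the filter.
            if neighbor in seen or not allowed(neighbor):
                continue
            seen.add(neighbor)
            order.append(neighbor)
            queue.append(neighbor)
    return order


# Hypothetical neighbor tables; a real tool would query each device via SNMP.
NEIGHBORS = {
    "c55k-26": ["002261261", "Router_81.130", "WBU_GATEWAY"],
    "002261261": ["c55k-26"],
    "Router_81.130": ["c55k-26", "WBU_GATEWAY"],
    "WBU_GATEWAY": [],
}


def table_lookup(device):
    """Stand-in for an SNMP query of one device's CDP neighbor table."""
    return NEIGHBORS.get(device, [])


print(discover("c55k-26", table_lookup))
print(discover("c55k-26", table_lookup, allowed=lambda d: not d.startswith("WBU")))
```

The `seen` set is what keeps the daemon tame: each device is interrogated once even when several neighbors report it.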


Layer 3 Autodiscovery

1. Start with default router


2. Query MIB II ifTable, ipAddrTable, ipRouteTable
3. Interrogate neighbors
Special cases, e.g. IP unnumbered, HSRP
4500-4>sho ip rout
Codes: C - connected, S - static, I - IGRP, R - RIP, M - mobile, B - BGP
       D - EIGRP, EX - EIGRP external, O - OSPF, IA - OSPF inter area
       E1 - OSPF external type 1, E2 - OSPF external type 2, E - EGP
       i - IS-IS, L1 - IS-IS level-1, L2 - IS-IS level-2, * - candidate default
       U - per-user static route

Gateway of last resort is not set

     100.0.0.0/8 is subnetted, 1 subnets
O       100.100.100.0 [110/70] via 172.16.11.1, 13:35:34, Serial0
     153.10.0.0/16 is subnetted, 1 subnets
C       153.10.1.0 is directly connected, Serial1
     172.16.0.0/16 is subnetted, 1 subnets
C       172.16.11.0 is directly connected, Serial0
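
A discovery tool would normally read the routing table over SNMP, but the idea of finding the neighbors to interrogate can be illustrated on the CLI output itself. The sketch below is an assumption-laden example: the function name `next_hops` is hypothetical, and the regular expression only handles route lines of the form shown above.

```python
import re

# Matches lines such as:
#   O  100.100.100.0 [110/70] via 172.16.11.1, 13:35:34, Serial0
ROUTE_VIA = re.compile(
    r"^\S+\s+(\d+\.\d+\.\d+\.\d+)(?:/\d+)?\s+\[\d+/\d+\]\s+via\s+(\d+\.\d+\.\d+\.\d+)"
)


def next_hops(route_output):
    """Return the sorted, de-duplicated next-hop addresses found in the output."""
    hops = set()
    for line in route_output.splitlines():
        match = ROUTE_VIA.match(line.strip())
        if match:
            hops.add(match.group(2))
    return sorted(hops)


SAMPLE = """\
     100.0.0.0/8 is subnetted, 1 subnets
O       100.100.100.0 [110/70] via 172.16.11.1, 13:35:34, Serial0
     153.10.0.0/16 is subnetted, 1 subnets
C       153.10.1.0 is directly connected, Serial1
C       172.16.11.0 is directly connected, Serial0
"""

print(next_hops(SAMPLE))
```

Directly connected routes carry no next hop, so only the OSPF route contributes a neighbor to visit next.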

Inventory

• Typical NMS is not enough
IP address, community strings, and interfaces
• Third-party management suites and vendor-specific tools provide richer content
• MIBs are generally vendor specific, although the Entity MIB will change this


Inventory

• Items of interest
System information
Chassis information
Chassis cards
Interfaces
Storage and memory
Serial numbers
• All information available
via IETF and Cisco MIBs
Configurations

• Collection repository
Useful for staging new configs
Version control helps with space and documentation
• How to automate
Scheduled backup
Watch Syslog
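
A scheduled backup saves space when it stores a new version only if the configuration actually changed. The sketch below keeps versions in memory and compares content hashes. The `ConfigArchive` class and its methods are hypothetical names, and a real repository would persist versions to disk or to a version control system.

```python
import hashlib


class ConfigArchive:
    """In-memory config repository that skips unchanged backups."""

    def __init__(self):
        self._versions = {}  # device name -> list of (digest, config text)

    def store(self, device, config_text):
        """Store a config; return True if it became a new version."""
        digest = hashlib.sha256(config_text.encode("utf-8")).hexdigest()
        versions = self._versions.setdefault(device, [])
        if versions and versions[-1][0] == digest:
            return False  # identical to the latest version, nothing to keep
        versions.append((digest, config_text))
        return True

    def version_count(self, device):
        """Return how many distinct versions are held for a device."""
        return len(self._versions.get(device, []))


archive = ConfigArchive()
print(archive.store("r1", "hostname r1\n"))
print(archive.store("r1", "hostname r1\n"))
print(archive.store("r1", "hostname r1\ninterface Serial0\n shutdown\n"))
print(archive.version_count("r1"))
```

The same `store` call can be triggered by a Syslog message that reports a configuration change, so the archive is updated between scheduled runs.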

Maintaining Management

• Adding new devices


• Keeping the management applications up-to-date
• New management products and standards

An Ongoing Process!
Change Management


Post Mortem Blues

• Unplanned outages may be the result of many factors.
How do you explain and account for what occurred?
Fact based vs. hearsay
Who made the change, what was changed, and when?
Your job may be at stake

I Didn’t Do It
Some Facts

• 80% of all outages are due to human error*
When an airline’s reservation system went down, thousands of travel agents had to book flights manually. Estimated loss of reservations amounted to $36,000 a minute

*Based on Carnegie-Mellon Usability Study

Common Causes of Change

• Business growth or downsizing


• New applications or services
• Implementing new technology
• Deploying product fixes or upgrades

Change Management Defined

• Configuration, software and hardware changes
• Change tasks include:
Anticipating and planning for change, controlling the introduction of change, and installing and implementing changes to software and hardware


Best Practices for Change

Best Practices
1. Implementing a change control process
2. Planning for change
3. Implementing change
4. Monitoring change

Change Control Process

Change request
• End user request
• New app, server
• New network service

Change review board
• Identify risk
• Schedule change
• Generate work order

Change or work order
• Tracking #
• Detailed change requests

Implementation
• Net admin
• Engineer/tech.

Validation
• Change verification
• Audit

Close Work Order or Resubmit If Problems
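
The process above can be enforced in a tracking tool as a small state machine, so that no step is skipped. This is a sketch of one reading of the slide: the state names and the `advance` function are hypothetical, and resubmission is modeled as a return to the change request stage.

```python
# Allowed transitions, following the order of the stages on the slide.
TRANSITIONS = {
    "requested": {"review_board"},
    "review_board": {"work_order"},
    "work_order": {"implementation"},
    "implementation": {"validation"},
    # After validation the work order is closed, or resubmitted if problems remain.
    "validation": {"closed", "requested"},
    "closed": set(),
}


def advance(state, next_state):
    """Move a change to the next stage, rejecting steps that skip the process."""
    if next_state not in TRANSITIONS.get(state, set()):
        raise ValueError(f"cannot move from {state!r} to {next_state!r}")
    return next_state


state = "requested"
for step in ["review_board", "work_order", "implementation", "validation", "closed"]:
    state = advance(state, step)
print(state)
```

Attempting `advance("requested", "implementation")` raises `ValueError`, which is the point: a change cannot reach the network without passing the review board.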


Examples

Planning

• Hardware
Pre-configure, test prior to upgrade
• Software
Research release, defect support, new feature set, and device compatibility
• Configuration
Test prior to deployment
• Have a back-out plan

Implementing

• Make different types of changes one at a time
• Maker/checker model
• Understand contingency plan in event of failure
• Validate the change was successful
Monitoring

• Identifying change: who, what, when
• Audit trail
• Fault notification


Change Management Tools

Planning
SWIM: defect and image analysis
CWSI: Layer 2/Layer 3 topo
Netsys: impact of change

Deployment
SWIM: download software images
CWConfig: deploy config changes
CiscoView: switch config changes

Monitor
CAS: change audit and reporting service; logs software, config and hardware changes
CWSI: topo and user tracking

Change Scenario
1. User telnets into device and makes a config change (shutdown int)
2. Device updated, Syslog generated
3. C/Agent identifies device change, notifies archive
4. Archive gets config via transport, validates change w/DIFF
5. IF VALID, Archive gets Config and logs details to ENCASE

[Figure: flow between the Network, the Syslog and Poll inputs, the Change Agent, the Transport, the Archive server, and the Change Audit Log.]
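
The validation step in this scenario compares the archived configuration with the one just retrieved, and logs only real differences. The sketch below shows that comparison with Python's `difflib`. It is not the product's implementation: the function names, the user name, the timestamp, and the interface address are all hypothetical, and the "who" field is assumed to come from Syslog or AAA accounting rather than from the diff.

```python
import difflib


def config_changes(archived_text, running_text):
    """Return only the added ('+ ') and removed ('- ') lines between two configs."""
    diff = difflib.ndiff(archived_text.splitlines(), running_text.splitlines())
    return [line for line in diff if line.startswith(("+ ", "- "))]


def audit_entry(device, user, timestamp, archived_text, running_text):
    """Build a who/what/when record, or return None when nothing changed."""
    changes = config_changes(archived_text, running_text)
    if not changes:
        return None  # not a valid change, so nothing is logged
    return {"device": device, "who": user, "when": timestamp, "what": changes}


# Hypothetical archived config and the running config after "shutdown".
ARCHIVED = "interface Serial0\n ip address 172.16.11.2 255.255.255.0\n"
RUNNING = ARCHIVED + " shutdown\n"

entry = audit_entry("4500-4", "jdoe", "1999-06-01T10:15:00", ARCHIVED, RUNNING)
print(entry)
```

Returning `None` for identical configurations keeps the audit log free of entries for Syslog messages that did not result in an actual change.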

Fault Management

Scenario

• Virginia building-003 network goes down
• Your boss has bad breath
• Multiple people making changes
• Resolution takes nine hours


Scenario

• Result:
Network was down an additional four hours due to conflicting changes
No one seems to know how the problem occurred or how it was resolved

Best Practices for
Fault Management

Best Practices
1. Preventive Measures
2. Coordination
3. Reacting to Faults
4. Escalation Policy
5. Become Proactive


Preventive Measures

• Maintain accurate documentation
Key to quick resolution
Includes maps, closets, connections, wiring, and servers
May require process/policy change.
Only good if up to date, easy to maintain, and useful
Dump it if you can’t maintain it!
Preventive Measures

• Remove single points of failure
Alternate paths for mission-critical applications
Redundant equipment for critical junctures
Ensure appropriate bandwidth to avoid contention and overutilization
Permits network rerouting

Coordination

• Communication is KEY...
Understand roles and responsibilities
Place phones in closets; use cell phones, pagers
Publish policies and procedures

Say What You Do, Do What You Say
Coordination

• Establish base of operations


All efforts must go through one person
Prevents “who dropped the baby” and
“slam management”
Conduct practice “scramble”
• Train staff on devices and technology

Determination of Faults

• Notification via:
NMS status change
Trap and event logs
Help desk
Phone call from tech (“whoops...”)

ALARM

Determination of Faults

• Remove the “noise” factor


1. Filter
2. Prioritize
3. Appropriately notify
4. Correlate
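
The four steps can be sketched as a small triage function over incoming events. This is an illustration only: the severity names, the page/ticket notification rule, and the choice to correlate by device are assumptions, and the function name `triage` is hypothetical.

```python
# Hypothetical severity scale; lower rank means more urgent.
SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2, "info": 3}


def triage(events, ignore=("info",)):
    """Filter, correlate by device, prioritize, and pick a notification method."""
    # 1. Filter: drop severities nobody needs to act on.
    kept = [e for e in events if e["severity"] not in ignore]

    # 4. Correlate: group events by device so one fault yields one incident.
    by_device = {}
    for event in kept:
        by_device.setdefault(event["device"], []).append(event)

    incidents = []
    for device, group in by_device.items():
        worst = min(group, key=lambda e: SEVERITY_RANK[e["severity"]])["severity"]
        incidents.append({
            "device": device,
            "severity": worst,
            "count": len(group),
            # 3. Appropriately notify: page for critical faults, ticket otherwise.
            "notify": "page" if worst == "critical" else "ticket",
        })

    # 2. Prioritize: most severe incidents first.
    incidents.sort(key=lambda i: SEVERITY_RANK[i["severity"]])
    return incidents


EVENTS = [
    {"device": "r1", "severity": "minor"},
    {"device": "sw1", "severity": "critical"},
    {"device": "r1", "severity": "info"},
    {"device": "sw1", "severity": "major"},
]

for incident in triage(EVENTS):
    print(incident)
```

Four raw events become two incidents, with the switch fault first, which is the reduction in "noise" the slide asks for.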


Reacting to Faults

• Determine fault domain


Which equipment, services,
and users are affected?
• Determine level of response
What is the severity of the fault?
Can we kill the backbone?
Identify dispatch timeframe and number of people
Reacting to Faults (Severe)

• Determine escalation timeline
Criteria and time limits to escalate to next level
Opening a case with the TAC
Identifying the point of drastic action

Is It Time to Hit the Big Red Switch?
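
An escalation timeline is easiest to follow during an outage when it is written down as data. The sketch below maps elapsed outage time to the current escalation step. The time limits and step names are hypothetical examples, not recommended values, and each organization would substitute its own.

```python
# Hypothetical timeline: (minutes since fault detected, who or what is engaged).
ESCALATION_STEPS = [
    (0, "on-call engineer"),
    (30, "senior engineer"),
    (60, "open TAC case"),
    (120, "management decision on drastic action"),
]


def escalation_level(minutes_elapsed):
    """Return the escalation step that applies after the given outage time."""
    current = ESCALATION_STEPS[0][1]
    for threshold, step in ESCALATION_STEPS:
        if minutes_elapsed >= threshold:
            current = step
    return current


for minutes in (0, 45, 60, 500):
    print(minutes, escalation_level(minutes))
```

Agreeing on the table in advance removes the debate about when to escalate from the middle of the outage.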

Reacting to Faults (Severe)

• Coordinate, communicate, and document
• Debrief
Determine source of fault
Evaluate recovery efforts
Document resolution for continuous improvement process
In order to learn, avoid a CYA environment
Moving from Reactive to Proactive

• Automate fault notification, escalation and resolution via “triggers”
• React to data before it goes bad
• Learn device and network behavior

“That doesn’t look right…”


Active vs. Passive Polling

• Polling with thresholds vs. event-based polling
RMON events and alarms
• Conservation of network traffic vs. device CPU and memory
• Might be a combination of both
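
Threshold-based alarms usually pair a rising threshold with a lower falling threshold, so that a value hovering near the limit does not generate a stream of events. The sketch below is a simplified model of that hysteresis in the style of RMON alarms, not an RMON implementation. The function name `threshold_events` and the sample values are hypothetical, and RMON's startup-alarm option is omitted.

```python
def threshold_events(samples, rising, falling):
    """Return (index, kind) events using rising/falling threshold hysteresis."""
    if falling >= rising:
        raise ValueError("falling threshold must be below rising threshold")
    events = []
    above = False  # True after a rising event, until the falling threshold is reached
    for index, value in enumerate(samples):
        if not above and value >= rising:
            events.append((index, "rising"))
            above = True
        elif above and value <= falling:
            events.append((index, "falling"))
            above = False
    return events


# Hypothetical utilization samples in percent.
SAMPLES = [10, 85, 78, 90, 40, 82]
print(threshold_events(SAMPLES, rising=80, falling=50))
```

The dip to 78 and the climb back to 90 produce no events because the value never reached the falling threshold. When the device evaluates the thresholds itself, the management station only hears about the three crossings, which is the traffic-versus-device-resources tradeoff named above.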
Fault Management Tools

Planning
CiscoView: real-time monitoring
RME: Availability, Syslog and CCO tools
CWSI: user tracking, traffic director and topo

Deployment
SWIM: defect analysis
CCO/TAC: case tracking tools
Stack Decoder: crash analysis

Monitor
Availability: monitor key resources
Syslog: reporting, automated recovery
24-Hour Reports: monitor reloads, Syslog, and change
Traffic Director: RMON config and report


Best Practices Can Improve Network Availability

• Prepare the network for management


Security, notification and maintenance
• Implement a change control process
Plan, deploy and monitor
• Reduce unplanned outage minutes
through fault management
Prepare, coordinate and be proactive
For More Information

• General network management portal
http://netman.cit.buffalo.edu/index.html

• Another good network management portal
http://compnetworking.miningco.com/msubmanage.htm?terms=network+management&cob=home&TMog=5006366091143m&Mint=56534342191358&FFV=1

• “The Simple Times”
http://www.simple-times.org/pub/simple-times/issues/

• SNMP FAQ
http://www.cis.ohio-state.edu/hypertext/faq/usenet/snmp-faq/part1/faq.html

For More Information

• Sample Cisco device security configs
http://www.cisco.com/warp/public/700/tech_configs.html#SECURITY

• Cisco device SNMP configuration tips
http://www.cisco.com/warp/public/490/index.shtml

• White paper on threshold management
http://www.ccci.com/product/papers/pete/papers/thresh.htm

• Public domain performance monitoring tool (MRTG)
http://ee-staff.ethz.ch/~oetiker/webtools/mrtg/mrtg.html

Please Complete Your
Evaluation Form
Session 804
