Professional Documents
Culture Documents
06 Fcaps PDF
06 Fcaps PDF
Network Management
Spring
p g 2013
Bahador Bakhshi
CE & IT Department, Amirkabir University of Technology
This presentation is based on the slides listed in references.
Outline
Fault management
Configuration management
Accounting management
Performance management
S
Security
it managementt
Conclusion
Outline
Fault management
Configuration management
Accounting management
Performance management
S
Security
it managementt
Conclusion
conditions in network
Root Cause
Is the occurrence of a specific type of fault
Symptom
Fault messages generated due to occurrence of root cause
An indication of fault for management system
Fault Management
g
Fault management
Monitoring the network to ensure that
y
g is running
g smoothly
y
everything
Symptoms collection
Ultimate objective
Ensure that users do not experience disruption
If do keep it minimum
5
Fault Management
g
Functionalities
Network monitoring
Basic alarm management
Fault diagnosis
Root cause analysis
Troubleshooting
Trouble ticketing
g
Proactive fault management
6
Fault Management
g
Steps
Fault Management:
g
Monitoring
g & Detection
Examples
Equipment alarms: A
A line card went out
out
Environmental alarms: Temperature too high
Service level alarms: Excessive
Excessive noise on a line
line
Alarms (contd)
(
)
Alarms are associated with specific information
E.g. X.733: Alarm reporting function
10
Alarm Severities
There are different standards for severities
ITU-T/ X.733 6 levels: critical, major, minor,
g, indeterminate,, cleared
warning,
IETF syslog 8 levels: emergency, alert,
critical error,
critical,
error warning,
warning notice
notice, informational,
informational
debug
11
Fault Management:
g
Alarm Management
g
Basic functions
Collect alarm information from the network
Visualize alarm information
Subscription
Deduplication
Correlation
Augmentation
12
Manager is waiting
Trap server is listening on specified port
14
15
Alarm Processing
g
Alarm collection and visualization are basic
required functionalities
However, iin llarge networks,
H
t
k eventt iinformation
f
ti
overflows
So many alarms + operator Missed alarms
F t
Fortunately
t l
Not all alarms are the same (alarm filtering)
Different severity
Usually, alarm are correlated (alarm preprocessing)
16
Alarm Filtering
g vs. Preprocessing
g
17
p
for him
reallyy important
Deduplication
p
E.g.,
Oscillating alarms
Link down alarm from two adjacent routers
Very
V
simple
i l case off correlated
l t d alarms
l
18
19
Implementation flavors
Original alarms do not get modified but additional alarm
20
Alarm Augmentation
g
Alarms do not always have sufficient information
Alarm augmentation: collect additional information
about the alarm context, e.g.
Current state
Current configuration
Self-test / diagnostics
Fault Management:
g
Analysis
y
& Diagnosis
g
22
Rule-Based Systems
y
Typically, heuristics based
Codify human expertise
If you get a time-out error, see if you can ping the other side
If that doesnt work, run IP config to see if your IP is configured
Rule-Based Systems
y
((contd))
25
Rule-Based Systems
y
((contd))
Knowledge base
Rule-based in the form of ifthen or condition
action,,
Inference engine
Compares the current state with the rule-base
Finds the closest match to output
26
the model
others in similar,
similar but not necessarily identical
identical,
situations
28
29
30
Fault Management:
g
Trouble Ticketing
g
Purpose: Track proper resolution of problems
Collect all information about a problem
Ensure proper steps are taken
31
Fault Management:
g
Trouble Ticketing
g
Boundary between perspectives can be blurred
Some alarm management systems generate tickets
automatically
Some
S
analogous problems apply
trouble tickets
Interface Customer Help Desk, CRM in the front
Alarm Management & OSS in the back
32
Examples
Analyze current alarms for precursors of bigger problems
Analyze network traffic patterns for impeding problems
Fault Management
g
Life Cycle
y
1) Detection of faults
Reporting of alarms by failure detection mechanism
Fault Management
g
Life Cycle
y
((contd))
2) Service restoration
E.g., Built-in redundancy (host-swap) or reinitialize
4) Prioritize
Not all faults
f
are off the same priority
Determine which faults to take immediate action
Fault Management
g
Life Cycle
y
((contd))
5) Troubleshooting
Repair, Restore, Replace
6) Reevaluate
7) Fault Reporting
Why?
y Speed
p
up
p future fault management
g
What? Cause & Resolution
36
Fault Management
g
Issues
Fault detection: By operator vs. By Customer
If customer
t
detected
d t t d Service
S i h
has b
been violated
i l t d
Technologies
g
in Fault Management
g
Automatic fail over
Vendor
V d specific
ifi iin system
t
mechanism
h i
Alarm notification
SNMP trap or property protocols
Alarm/Event processing
Correlation
C
l ti and
d roott cause analysis
l i b
by expertt systems
t
((artificial
tifi i l
intelligence approaches)
Customer care
Helpdesk systems (24x7 availability)
Trouble ticket system (submission and monitoring)
38
Fault Management
g
Summary
y
39
Outline
Fault management
Configuration management
Accounting management
Performance management
S
Security
it managementt
Conclusion
40
Configuration
g
Management
g
What is configuration?
1) Description of physical/logical components
off a system;
t
e.g.,
Network logical & physical topology
Physical configuration of routers
2) The
Th process off updating
d ti parameters
t
off
system, e.g., configuring OSPF on routers
3) The result of configuration process, e.g.,
sett off managementt parameters
t
& their
th i values
l
41
Configuration
g
Management
g
((contd))
Functions related to dealing with how
network, services, devices are configured
Physical configuration, e.g.
Logical configuration,
configuration e
e.g.
g
Challenges
Number of devices/software
Diversity
Di
it off devices/software
d i
/ ft
42
Logical
g
Configuration
g
Management
g
The process of obtaining functional data from each
network
t
k device,
d i
storing
t i and
d documenting
d
ti that
th t data,
d t
and subsequently utilizing that data to manage the
operations
ti
off allll network
t
k devices
d i
Includes the initial configuration of a device to bring it up,
as wellll as ongoing
i configuration
fi
ti changes
h
When to configure
g
System (network & equipment) setup
New equipment
q p
((hardware))
Software upgrades
Service provisioning
p
g
43
Configuration
g
Management
g
Functions
(Auto)Discovery & Auditing
Configuration setting
Provisioning
Synchronization
Image management
Backup and restore
44
CM: ((Auto)Discovery
)
y & Auditing
g
FAPS management areas need current network
configuration
We should be able to query the network to find out
what actually has been configured
It is called auditing (in most cases, it is also called
discovery)
Moreover,
Moreover we need Auto-discovery
Auto discovery
Find out the entities in network
Configuration
g
Management
g
Inventory
y
Deals with the actual assets in a network
Equipment
Type of device, manufacturer, CPU, memory, disk space
Equipment hierarchies: line cards, which slot, etc.
Bookkeeping information: when purchased, inventory
number, support information,
Software
Software image OS, revision, licenses,
Where & when deployed
p y
Bookkeeping information: when purchased, inventory
number, support information,
46
CMDB ((Configuration
g
Management
g
Database))
CMDB
Contains
C t i iinformation
f
ti about
b t the
th configuration
fi
ti off devices
d i
in
i the
th network
t
k
Relatively static but heterogonous information
Applications
A li ti
examples
l
Network configuration cache to be used in FAPS
Configuration validation
What-if analysis
Configuration
C fi
ti cloning,
l i
b k
backup,
and
d restore
t
47
CM: Configuration
g
Setting
g
(almost) All network devices should be configured
properly
l ffor the
th specific
ifi network
t
k
The core of network management
Called: Provisioning
48
Configuration
g
Setting
g Techniques
Reusing configuration settings
E.g.,
E g configuration of OSPF for all routers in the same area
Script-Based
p
configuration
g
Approach 1
Configuration
C fi
ti workflow
kfl
: A sequence off operations
ti
to
t achieve
hi
a goall
Maintaining a single complex script for whole configuration is difficult
CM: Configuration
g
Setting:
g Provisioning
g
Provisioning: The steps required to set up network
and
d system
t
resources tto provide,
id modify,
dif or revoke
k
a network service
Bandwidth, Port assignments, Address assignments (IP
Scope:
Individual
d dua syste
systemss (equipment
( equ p e t p
provisioning)
o so g )
50
CM: Configuration
g
Synchronization
y
Management systems (CMDB) keep management databases
Cache in the database to avoid repeatedly hitting the network
Configuration
Config ration changes often not reliably
reliabl indicated
Synchronization strategy depends on who is the master
The network or the management database
Fundamental decision in managing a network
51
CM: Configuration
g
Synchronization
y
52
53
Management
g
DB as Golden Store
Common in some service provider environments
Very
V
controlled
t ll d environments
i
t
54
re-provisioning
Patch Management
g
Patch Identification
Determination of available upgrades to existing devices
Patch Assessment
Determining the importance and criticality that any new
patch
t h be
b applied
li d
Patch Testing
Checking whether the installation of the new patches
Patch
P h IInstallation
ll i
Installation of the patches and updating the software of
Configuration
g
Management
g
Issues
Make sure the inventories be updated
Out-of-date
O t f d t iinventories
t i (DB
(DBs)) are useless
l
Autodiscovery mechanism should be used
Revision
R i i control
t l and
db
backup
k off th
the iinventories
t i
Time history of network is needed
The configuration management system may fails
Security
S
it
Configuration process should be secure
Insecure configuration attack
57
Configuration
g
Management
g
Technologies
g
SNMP
SNMP Public Community:
Gather information about the current network environment,
Read Only
Read-Only
SNMP Private Community:
Gather information about the current network environment AND
make changes, Read-Write
Netconf
New protocol by IETF (XML based)
Configuration
g
Management
g
Summary
y
59
Outline
Fault management
Configuration management
Accounting management
Performance management
S
Security
it managementt
Conclusion
60
Accounting
g Management
g
Account of the use of network resources
Metering: Measure what has been consumed by
Accounting
g Management
g
Functions ((TMN))
Usage Measurement
How much resources are used?
P i i
Pricing
Pricing strategy & Rating usage
62
Accounting
g and Billing
g
63
Accounting
g Data
Which data should be measured for accounting?
Depends on service type and pricing strategy
A few examples:
Call Detail Records (CDRs)
Apply to voice service
Generated as part of call setup (and teardown) procedures
Call statistics upon end of call, or periodically
Accounting
g Data (contd)
(
)
Volume based information
Interface statistics
E.g. duration of TCP connection: TCP syn / syn-ack, fin / finackk exchange
h
Billing
g
Data Collection
Measuring the usage data at the device level
Performed by accounting
Data Mediation
Converting proprietary records into a well known or
standard format
Billing
g ((contd))
Calculating call (service) duration
In some application, real-time duration is needed
Charging
Tariffs and parameters to be applied
Invoicing
Translating charging information into monetary units and
67
Billing
g vs. Accounting
g
68
Billing
g Models
Postpaid vs. Prepaid
Postpaid: Off-line charging
Charging
g g criteria
Volume based vs. Time based charging
Best effort vs.
vs QoS based (DiffServ) charging
Flat fee vs. Application specific
69
Outline
Fault management
Configuration management
Accounting management
Performance management
S
Security
it managementt
Conclusion
70
Performance Management:
g
Design
g Phase
Each system is designed for a target level of
performance
The generall approaches
Th
h tto guarantee
t Q
QoS
S
under high load conditions (e.g., congestion)
Over provisioning
Classification
Traffic based
based, User based
based,
Prioritize classes to each other
71
Performance Management:
g
Operation Phase
Why PM in operation time?
Oversimplified assumptions in design phase
E.g.,
E g Poisson arrival rate
rate, M/M/1
M/M/1,
Not satisfied by the real workload
Capacity planning
72
Performance Management
g
Performance Management involves
Management of consistency and quality of
73
Performance Metrics
How to measure (define) performance?
Performance
P f
metrics
t i differ
diff b
by llayer and
d service
i
Throughput
74
Performance Management
g
Functions
Document the network management business objectives
Create detailed and measurable service level objectives
Define performance SLAs and Metrics
E
E.g.,
average/peak
/
k volume
l
off traffic,
t ffi average/maximum
/
i
delay,
d l
availability,
il bilit
75
Performance Management
g
Aspects
Proactive
Reporting
R
ti & M
Monitoring
it i ((performance
f
metric
t i hi
history
t
graphs)
h )
The value of performance metrics are gathered periodically
The data analyzed and reported
Capacity planning
Reactive
QoS assurance
Define threshold
Automatically take action when a threshold is eclipsed
76
Performance Management
g
Issues
1) Effect of performance management on network
performance
Large volume of performance monitoring data increase
network traffic
Periodic polling
Polling rate?!
Database design
Performance troubleshooting
77
Performance Management
g
Issues ((1))
Data collection & Database design approaches
Performance
P f
monitored
it d data
d t is
i time-series
ti
i
Round-robin DB
Time based partitioning of databases
Aggregation method
e.g., average
DBs
DB b
based
d on titime scales
l
78
Performance Management
g
Issues ((2))
Performance Troubleshooting
Detecting
D t ti Performance
P f
Problems
P bl
Threshold; e.g.,
80% of maximum acceptable utilization/delay
Mean + 3 * Standard deviation
Statistical abnormality
The time-series data generated by performance metric has statistical
Performance Management
g
Issues ((2))
Performance Troubleshooting
Correcting
C
ti performance
f
problems
bl
Misconfiguration
Incorrect configuration cause slow down device
System changes
Inconsistent configuration for software update
Hardware compatibility issues
Workload growth
The congested resource should be upgraded (capacity planning)
Workload surge
Workload increases very rapidly in a very short amount of time
Spare resource and traffic shaping can help
80
Performance Management
g
Tools
Monitoring network traffic
Monitoring
M it i network
t
k protocols
t
l
Monitoring
g network equipments
q p
81
Performance Management
g
Summary
y
82
Outline
Fault management
Configuration management
Accounting management
Performance management
S
Security
it managementt
Conclusion
83
Security
y & Management
g
Security of Management
Management of security
84
Security
y of Management
g
Security of management deals with ensuring
that management operations themselves are
secured
Major domains to secure
Security of NOC
Security
y of NOC
Firewall
To protect NOC from external attacks
IDS
To detect intrusions
OS update/patch
To fix vulnerabilities
Antivirus/Anti Spam
To p
prevent viruses,, Trojans,
j
, malwares,,
Single-Sign-On
To manage password
Physical security
To secure physical access to NOC
86
Security
y of Management
g
Network
Out-of-band management
Physically separated management network
Dedicated VPN for network management
Security
y Management
g
Security management is concept that deals with protection of
data in a network system against unauthorized access
access,
disclosure, modification, or destruction and protection of the
network system itself (including NOC & management network)
against unauthorized use, modification, or denial of service
Includes
Security policies
Implementation
p
of security
y mechanisms
Monitoring, Action & Reporting security event
We don
dontt discuss about security techniques, e.g., public and
private key encryption, confidentiality, integrity, Firewall, IDS,
IPS, Honeypot,
89
Security
y Management
g
Functions ((TMN))
Security administration
Planning and administering security policy and
g g security
y related information
managing
Prevention
Security mechanism to prevent intruders
Detection
Detect intrusion
Security
y Policies
Overall security guide line and decision in network
Security policies must be comprehensive
Consider all domains in the network
Prevention
Needs to be covered by security policies
In service provider networks
Attack NOC (to access control on whole network)
Attack Network (to disturb the service, to access
customer data)
92
AAA (Authentication)
(
)
Authentication is the act of establishing or
confirming someone as authentic, that is, that
claims made by
y or about the thing
g are true
Authentication is accomplished via the
94
AAA (Authorization)
(
)
Authorization is a process to protect resources to
b used
be
db
by consumers th
thatt h
have b
been granted
t d
authority to use them
aka, access control
AAA (Accounting)
(
g)
Accounting refers to the tracking of the
consumption of network resources by users
Typical
T
i l iinformation
f
ti that
th t is
i gathered
th d in
i
accounting may be:
The identity of the user
The nature of the service delivered
When the service began, and when it ended
In security domain
What does the client do
96
AAA Protocols
RADIUS
Remote Authentication Dial In User Service
Authenticated dial-up
dial up and VPN customers
TACACS
Terminal Access Controller Access Control
y
System
Different protocols and authentication methods
Diameter
97
98
Outline
Fault management
Configuration management
Accounting management
Performance management
S
Security
it managementt
Conclusion
99
Summary
y
NOC
Configuration management service
provisioning
p
g
Fault & Performance management service
assurance
Accounting management Billing
SOC
Security of management
Management of security (FCAP for security)
100
References
Reading Assignment: Chapters 6, 7, 8, and 9 of Dinesh Chandra Verma, Principles of
Computer Systems and Network Management, Springer, 2009
Reading Assignment: Chapter 5 of Alexander Clemm, Network Management
Fundamentals , Cisco Press, 2007
Mani Subramanian,
Subramanian Network
Network Management: Principles and Practice
Practice, Ch.
Ch 13
R. Dssouli, Advanced Network Management, Concordia Institute for Information Systems
Engineering, http://users.encs.concordia.ca/~dssouli/INSE 7120.html
Nhut Nguyen, Telecommunications Network Management, University of Texas at Dallas,
www.utdallas.edu/~nhutnn/cs6368/
J. Won-Ki Hong, Network Management System, PosTech University,
dpnm.postech.ac.kr/cs607/
Raymond A. Hansen, Enterprise Network Management, Purdue University,
netcourses.tech.purdue.edu/cit443
Woraphon Lilakiatsakun, Network Management, Mahanakorn University of Technology,
http://www.msit2005.mut.ac.th/msit_media/1_2553/ITEC4611/Lecture/
101