
Control Plane

2010 Juniper Networks, Inc. All rights reserved. | www.juniper.net

Brains of a Junos Device

 Control plane resides on the Routing Engine
    Routing Engines serve as the brains of a device running the Junos operating system
        Runs the Junos OS
        Based on an x86 or PowerPC architecture
        Contains memory and flash
        Manages protocols
        Manages system processes
        Manages user processes
        Maintains routing table
        Maintains forwarding table
        Maintains bridging table
        Connects to the data plane
        Controls some chassis components

Control Plane Process Separation

 Separation within separation
    Not only are the data plane and control plane separate, but system processes are often separate
    Provides protection from one process bringing down the entire control plane

Control Plane Interaction

 Kernel provides the underlying infrastructure for all the Junos processes

[Figure: control plane running the Junos OS. Labeled elements: routing tables, forwarding table, bridging table, kernel (operating system), routing protocol process, Layer 2 protocol process, interface process, chassis process, user, Intel PCI platform, and data plane.]

System Processes

 Processes in the user space interact with the kernel and are often called daemons
 The Junos OS runs a variety of daemons:
    user@mx> show system processes extensive | count
    Count: 125 lines

    user@mx> show system processes extensive | match rpd
     1228 root        1   4    0 41364K 10416K kqread   5:28  0.00% rpd

    Callouts: 1228 = process ID (PID); 41364K = total size; 10416K = resident memory; 0.00% = CPU usage; rpd = process name
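The callouts on the process line can be illustrated with a small parser. This is only a sketch: it assumes the eleven-column, top-style layout (PID, USERNAME, THR, PRI, NICE, SIZE, RES, STATE, TIME, WCPU, COMMAND), and `ProcessEntry` and `parse_process_line` are hypothetical names, not part of the Junos OS. Some platforms print additional columns, so the column list would need adjusting there.

```python
from dataclasses import dataclass

# Column order assumed from a top-style header:
# PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
FIELDS = ("pid", "username", "thr", "pri", "nice", "size",
          "res", "state", "time", "wcpu", "command")


@dataclass
class ProcessEntry:  # hypothetical container, not a Junos structure
    pid: int
    username: str
    size: str      # total size, for example "41364K"
    res: str       # resident memory, for example "10416K"
    state: str
    wcpu: float    # CPU usage as a percentage
    command: str


def parse_process_line(line: str) -> ProcessEntry:
    """Split one process row into named fields."""
    parts = line.split(None, len(FIELDS) - 1)
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(parts)}")
    row = dict(zip(FIELDS, parts))
    return ProcessEntry(
        pid=int(row["pid"]),
        username=row["username"],
        size=row["size"],
        res=row["res"],
        state=row["state"],
        wcpu=float(row["wcpu"].rstrip("%")),
        command=row["command"],
    )


entry = parse_process_line(
    "1228 root 1 4 0 41364K 10416K kqread 5:28 0.00% rpd"
)
print(entry.pid, entry.size, entry.res, entry.wcpu, entry.command)
```

Keeping SIZE and RES as strings avoids guessing at unit suffixes; a caller that samples the same process over time could convert them to compare growth.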

Disabling and Restarting System Processes

 Use the operational mode restart command to restart a system process:
    user@mx> restart routing ?
    Possible completions:
      <[Enter]>            Execute this command
      gracefully           Gracefully restart the process
      immediately          Immediately restart (SIGKILL) the process
      . . .

 Disabling a system process:
    [edit system processes]
    lab@mxD-2# show
    ethernet-link-fault-management disable;

    Nov 8 21:35:08 mxD-2 init: ethernet-link-fault-management (PID 1268) terminate signal sent
    Nov 8 21:35:08 mxD-2 init: ethernet-link-fault-management (PID 1268) exited with status=0 Normal Exit

Note: Disabling system processes can cause instabilities and should only be performed as a troubleshooting step under the guidance of JTAC.

System Process Summary

 Use the summary command modifier for a complete look at the device's overall process state:
    user@mx> show system processes summary
    last pid: 21038; 21:49:06 load averages: 0.23, 0.18, 0.07 up 55+23:28:58
    121 processes: 2 running, 91 sleeping, 28 waiting
    Mem: 306M Active, 34M Inact, 62M Wired, 139M Cache, 112M Buf, 1448M Free
    Swap: 2915M Total, 2915M Free

      PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
       11 root        1 171   52     0K    16K RUN    1328.7 98.05% idle

Hint: Use show task memory detail over time to identify memory leaks.

Other System Commands

 System connections represent active IP sockets for which the Junos device is acting as a server
    user@mx> show system connections | match "10.210.15.30"
    tcp4       0      0  10.210.15.8.22      10.210.15.30.61944      ESTABLISHED

 System alarms are raised for conditions such as licensing violations
    user@mx> show system alarms
    No alarms currently active

 Use show system statistics to view control plane counters separated by protocol

User Processes

 Log out of the Junos device gracefully to prevent hung user sessions:
    user@mx> show system users
     9:46PM  up 56 days, 23:25, 2 users, load averages: 0.41, 0.18, 0.07
    USER     TTY      FROM              LOGIN@  IDLE WHAT
    lab      u0                         21Sep10    - -cli (cli)
    lab      p0       10.210.15.30      9:45PM     - -cli (cli)

    A TTY of u0 indicates a console session
    A TTY of p0 indicates a remote Telnet/SSH session

 Clearing user sessions:
    user@mx> request system logout user lab terminal u0
    logout-user: done

Routing Protocol Daemon

 Core functions:
    Controls routing protocols running on the router
    Starts all configured protocols
    Handles all routing messages
    Maintains routing tables
    Implements routing policy

 Maintains its own scheduler
    Prioritizes and switches between routing tasks

[Figure: rpd, running on the Junos kernel, with its scheduler servicing OSPF, BGP, and other protocols and maintaining the routing tables.]

rpd Scheduler Slips

 Scheduler slips indicate that rpd was not able to service its internal processes in a timely manner
    An indicator that rpd is too busy
    Slips are reported in the syslog:
        Nov 15 12:32:31 router rpd[309]: RPD_SCHED_SLIP: 10 sec scheduler

 Task accounting can help identify the cause of a slip

[Figure: the rpd scheduler servicing OSPF, BGP, and other protocols.]
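Because slips are reported in the syslog, a short script can pull them out of a saved log for review. The following is a minimal sketch that relies only on the message prefix shown on the slide (`rpd[PID]: RPD_SCHED_SLIP: N sec`); the function name `find_scheduler_slips` is hypothetical, and the second sample line is a made-up unrelated message included only to show filtering.

```python
import re

# Matches only the portion of the message visible on the slide.
SLIP_RE = re.compile(
    r"rpd\[(?P<pid>\d+)\]: RPD_SCHED_SLIP: (?P<seconds>\d+) sec"
)


def find_scheduler_slips(lines):
    """Return (rpd PID, slip length in seconds) for each slip message."""
    slips = []
    for line in lines:
        match = SLIP_RE.search(line)
        if match:
            slips.append((int(match.group("pid")),
                          int(match.group("seconds"))))
    return slips


sample = [
    "Nov 15 12:32:31 router rpd[309]: RPD_SCHED_SLIP: 10 sec scheduler",
    # Hypothetical unrelated line, not taken from a real device:
    "Nov 15 12:33:02 router example[412]: unrelated message",
]
slips = find_scheduler_slips(sample)
print(slips)
```

Collecting slip lengths this way makes it easier to see whether slips are isolated or recurring before turning to task accounting.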

Core Files

 Core dump files
    Generated when a system process crashes (or when a dump is forced)
    Files should be uploaded to JTAC and associated with a JTAC case number
    Often associated with problem reports

 Check for core dump files
    System syslog
    request support information
    show system core-dumps
        /var/tmp typically hosts process cores
        /var/crash typically hosts kernel and PFE cores

Troubleshooting Methodology (1 of 3)

 New or existing implementation?
    Knowing which is important for isolating the issue

 Do no harm! Actions, ordered from least severe to most severe:
    Clearing a route or database entry (least severe action)
        Single route must refresh
    Bouncing a protocol session or neighborship
        All learned routes must refresh
    Bouncing a protocol
        All adjacencies or peerings must re-establish
    Restarting routing (rpd)
        All routing must restart
    Rebooting the device (most severe action)
        All system processes must restart
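The "do no harm" ordering can be written down as an ordered enumeration, so that a runbook or script always picks the least disruptive option among the actions being considered. This is an illustrative sketch only; the names `Action`, `IMPACT`, and `least_severe` are hypothetical and the numeric values simply encode the order given on the slide.

```python
from enum import IntEnum


class Action(IntEnum):
    """Troubleshooting actions, ordered from least to most severe."""
    CLEAR_ENTRY = 1       # clear a route or database entry
    BOUNCE_SESSION = 2    # bounce a protocol session or neighborship
    BOUNCE_PROTOCOL = 3   # bounce a protocol
    RESTART_ROUTING = 4   # restart routing (rpd)
    REBOOT_DEVICE = 5     # reboot the device


# Impact of each action, as listed on the slide.
IMPACT = {
    Action.CLEAR_ENTRY: "Single route must refresh",
    Action.BOUNCE_SESSION: "All learned routes must refresh",
    Action.BOUNCE_PROTOCOL: "All adjacencies or peerings must re-establish",
    Action.RESTART_ROUTING: "All routing must restart",
    Action.REBOOT_DEVICE: "All system processes must restart",
}


def least_severe(candidates):
    """Pick the least disruptive action from the candidates."""
    return min(candidates)


choice = least_severe([Action.RESTART_ROUTING, Action.BOUNCE_SESSION])
print(choice.name, "->", IMPACT[choice])
```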

Troubleshooting Methodology (2 of 3)

 Define success (and isolate)
    Route received from neighbor
        Check protocol adjacency
        Check protocol database
    Route appears in routing table
        Check preference
        Test import routing policy
    Route being advertised to neighbor
        Check protocol adjacency
        Test export policy
    Route is stable
        Check logs, interfaces, and protocol traces

Troubleshooting Methodology (3 of 3)

 Identify and implement a solution
    Repair hardware issue
    Adjust protocol configuration
        Adjacency configuration
        Metrics and preferences
        Policy
    Adjust implementation
        Prevent link overutilization
    Test in lab environment

General Path Troubleshooting (1 of 4)

 Host X is no longer able to reach host Y:

[Figure: topology with Host X, Routers A through F, and Host Y. Labeled subnets: 192.168.20/24 (Router B to Router C) and 192.168.50/24 (Router E and Router F); link endpoints are numbered .1 and .2.]

Assume local host-to-router connectivity is successful

General Path Troubleshooting (2 of 4)

 Ping testing
    Router A cannot ping router D

    user@mxA> ping 192.168.30.2
    PING 192.168.30.2 (192.168.30.2): 56 data bytes
    ^C
    --- 192.168.30.2 ping statistics ---
    6 packets transmitted, 0 packets received, 100% packet loss

    user@mxA> ping 192.168.40.2
    PING 192.168.40.2 (192.168.40.2): 56 data bytes
    ^C
    --- 192.168.40.2 ping statistics ---
    6 packets transmitted, 0 packets received, 100% packet loss

General Path Troubleshooting (3 of 4)

 Traceroute results:
    user@mxA> traceroute 192.168.30.2
    traceroute to 192.168.30.2 (192.168.30.2), 30 hops max, 40 byte packets
     1  192.168.10.2 (192.168.10.2)  0.464 ms  0.406 ms  0.334 ms
     2  192.168.20.2 (192.168.20.2)  0.364 ms  0.330 ms  0.356 ms
     3  * * *
     4  * * *
     5  * *^C
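The traceroute result narrows the fault to somewhere beyond the last hop that answered. As a rough sketch, that hop can be extracted from saved output as follows. The function name `last_responding_hop` is hypothetical, and the assignment of the six round-trip times to hops 1 and 2 in the sample is an assumption about how the slide output was laid out.

```python
import re

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")                 # "<hop number> <rest>"
ADDR_RE = re.compile(r"\((\d{1,3}(?:\.\d{1,3}){3})\)")    # "(a.b.c.d)"


def last_responding_hop(output: str):
    """Return (hop number, address) of the last hop that replied, or None."""
    last = None
    for line in output.splitlines():
        hop = HOP_RE.match(line)
        if not hop:
            continue  # header line or blank line
        addr = ADDR_RE.search(hop.group(2))
        if addr:      # hops showing only "*" have no address
            last = (int(hop.group(1)), addr.group(1))
    return last


sample = """\
traceroute to 192.168.30.2 (192.168.30.2), 30 hops max, 40 byte packets
 1  192.168.10.2 (192.168.10.2)  0.464 ms  0.406 ms  0.334 ms
 2  192.168.20.2 (192.168.20.2)  0.364 ms  0.330 ms  0.356 ms
 3  * * *
 4  * * *
 5  * *
"""
print(last_responding_hop(sample))
```

Here the last reply comes from 192.168.20.2, so further isolation would start at that router and the link beyond it.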

General Path Troubleshooting (4 of 4)

 Check your knowledge
    What further isolation steps would you perform?
    What are some of the possible causes?
    What are some possible solutions?
    How would you test possible solutions?

[Figure: topology of Routers A through F.]

Protocol Troubleshooting Chart

 Starting point: chassis, software, interface, and transmission line are OK
 Route present and active?
    Yes: Investigate forwarding faults
    No: IGP route?
        Yes: Adjacencies up?
            Yes: Suspect policy or IGP configuration
            No: Suspect IGP configuration
        No: BGP session established?
            No: Suspect configuration or IGP
            Yes: Route hidden?
                Yes: Suspect policy or IGP configuration
                No: Suspect remote peer policy

Working with Protocols (1 of 4)

 Helpful commands:
    Protocol show commands:
        user@mx> show ospf neighbor
        Address          Interface              State     ID               Pri  Dead
        172.18.5.1       ge-1/0/2.144           Full      192.168.37.1     128    31

    Disabling a protocol (disabling a single interface is less drastic):
        [edit protocols ospf area 0.0.0.0]
        user@mx# show
        interface ge-1/0/2.144 {
            disable;
        }
        interface all;

    Restart routing (most drastic):
        user@mx> restart routing
        Routing protocols process started, pid 28667

Working with Protocols (2 of 4)

 Helpful commands (contd.):
    View routes from a specific protocol:
        user@mx> show route protocol ospf

    View routes from protocol perspective (before routing table):
        user@mx> show ospf database

    Clear a protocol adjacency:
        user@mx> clear ospf neighbor

    Clear a specific entry in a protocol database:
        user@mx> clear ospf database lsa-id lsa-id purge
        Sets the LSA to MaxAge, resulting in re-advertisement from the originator

Working with Protocols (3 of 4)

 Monitoring protocol traffic:
    System logging:
        user@mx> show log messages | match ospf
        Nov 23 23:37:59 mxD-2 rpd[20773]: RPD_OSPF_NBRDOWN: OSPF neighbor 172.18.5.1 (realm ospf-v2 ge-1/0/2.144 area 0.0.0.0) state changed from Full to Down due to KillNbr (event reason: interface went down)

    Traceoptions:
        [edit protocols ospf]
        user@mx# set traceoptions flag ?
        Possible completions:
          all                    Trace everything
          database-description   Trace database description packets
          error                  Trace errored packets
          event                  Trace OSPF state machine events
          flooding               Trace LSA flooding
          ...

Working with Protocols (4 of 4)

 Monitoring protocol traffic (contd.):
    Monitoring interface traffic:
        user@mx> monitor traffic interface interface-name no-resolve
        verbose output suppressed, use <detail> or <extensive> for full protocol decode
        Address resolution is OFF.
        Listening on ge-1/0/2.144, capture size 100 bytes

        00:24:29.389895  In IP 172.18.5.1 > 224.0.0.5: OSPFv2, Hello, length 44
        00:24:33.111454 Out IP truncated-ip - 26 bytes missing! 172.18.5.2 > 224.0.0.5: OSPFv2, Hello, length 44
        00:24:38.916834  In IP 172.18.5.1 > 224.0.0.5: OSPFv2, Hello, length 44
        00:24:42.024384 Out IP truncated-ip - 26 bytes missing! 172.18.5.2 > 224.0.0.5: OSPFv2, Hello, length 44
        ^C
        4 packets received by filter
        0 packets dropped by kernel
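When reviewing a capture like the one above, a quick first check is whether Hello packets are seen in both directions. The sketch below counts OSPFv2 Hello lines by direction from saved monitor traffic output. The function name `hello_directions` is hypothetical, and the pattern assumes the line layout shown on the slide (timestamp, In or Out, then the decoded packet).

```python
import re
from collections import Counter

# timestamp, direction, then an IP line decoded as an OSPFv2 Hello
HELLO_RE = re.compile(r"^\S+\s+(In|Out)\s+IP\b.*OSPFv2, Hello")


def hello_directions(output: str) -> Counter:
    """Count OSPFv2 Hello packets per direction ("In" / "Out")."""
    counts = Counter()
    for line in output.splitlines():
        match = HELLO_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


capture = """\
00:24:29.389895  In IP 172.18.5.1 > 224.0.0.5: OSPFv2, Hello, length 44
00:24:33.111454 Out IP truncated-ip - 26 bytes missing! 172.18.5.2 > 224.0.0.5: OSPFv2, Hello, length 44
00:24:38.916834  In IP 172.18.5.1 > 224.0.0.5: OSPFv2, Hello, length 44
00:24:42.024384 Out IP truncated-ip - 26 bytes missing! 172.18.5.2 > 224.0.0.5: OSPFv2, Hello, length 44
4 packets received by filter
"""
counts = hello_directions(capture)
print(dict(counts))
```

Hellos in only one direction point toward a transmission or filtering problem, while Hellos in both directions suggest looking at the Hello contents and protocol configuration instead.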

Troubleshooting an OSPF Adjacency (1 of 5)

 New OSPF Area 0 adjacency not coming up:
    user@mx2> show ospf neighbor

    user@mx2>

[Figure: mx1 (lo0: 192.168.37.1, ge-1/0/2, .1) connects to mx2 (lo0: 192.168.38.1, ge-1/0/2, .2) over VLAN 144, subnet 172.18.5.0/30, in Area 0.]

Troubleshooting an OSPF Adjacency (2 of 5)

 What information can you derive without studying the configuration?
    Other protocol show commands:
        user@mx2> show ospf interface
        Interface           State   Area            DR ID           BDR ID          Nbrs
        ge-1/0/2.144        DR      0.0.0.0         192.168.38.1    0.0.0.0            0

        user@mx2> clear ospf statistics
        user@mx2> show ospf statistics
        Packet type Sent Hello DbD LSReq LSUpdate LSAck ...
        Total Received 3 0 0 0 0 0 0 0 0 0
        Last 5 seconds Sent 0 0 0 0 0 Received 1 0 0 0 0

Troubleshooting an OSPF Adjacency (3 of 5)

 Other outputs to check:
    System log:
        user@mx2> show system uptime | match Current
        Current time: 2010-11-22 00:47:15 UTC

        user@mx2> show log messages | match "Nov 22" | match ospf
        Nov 22 00:15:31 mxD-2 rpd[28667]: RPD_OSPF_NBRDOWN: OSPF neighbor 172.18.5.1 (realm ospf-v2 ge-0/0/0.0 area 0.0.0.0) state changed from Full to Down due to InActiveTimer (event reason: neighbor was inactive and declared dead)

    No related log messages in recent past

Troubleshooting an OSPF Adjacency (4 of 5)

 Other outputs to check (contd.):
    Configure traceoptions:
        [edit protocols ospf]
        user@mx2# show traceoptions
        file ospftrace;
        flag hello;

        Note: Bounce OSPF first to restart adjacency formation process

    Alas! mx2 is sending Hello messages:
        user@mx2# run show log ospftrace | find hello
        Nov 24 00:59:45.898113 OSPF sent Hello 172.18.5.2 -> 224.0.0.5 (ge-1/0/2.144 IFL 76 area 0.0.0.0)
        Nov 24 00:59:45.898595   Version 2, length 44, ID 192.168.38.1, area 0.0.0.0
        Nov 24 00:59:45.898619   mask 255.255.255.252, hello_ivl 10, opts 0x2, prio 128
        Nov 24 00:59:45.898639   dead_ivl 40, DR 0.0.0.0, BDR 0.0.0.0
        ...

Troubleshooting an OSPF Adjacency (5 of 5)

 Expand the trace:
    [edit protocols ospf]
    user@mx2# show traceoptions
    file ospftrace;
    flag hello;
    flag error detail;

    Note: Bounce OSPF first to restart adjacency formation process

 The culprit:
    user@mx2# run show log ospftrace
    ...
    Nov 24 01:31:44.779373 OSPF packet ignored: authentication failure (bad cksum).
    Nov 24 01:31:44.779554 OSPF packet ignored: authentication failure from 172.18.5.1
    ...

Note: Monitoring the interface traffic would have been helpful with a plain-text authentication mismatch, but an MD5 secret mismatch would not have been detected.
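The adjacency failure in this case study comes down to two neighbors disagreeing on a parameter that must match. A minimal sketch of that comparison follows. `HelloParams` and `adjacency_blockers` are hypothetical names, the field list is a subset of what OSPF neighbors compare (the mask check applies to broadcast networks), and the authentication type and key values are invented for illustration; the area, mask, and timer values are the ones shown in the mx2 trace.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HelloParams:
    """Subset of parameters two OSPF neighbors must agree on."""
    area: str
    mask: str        # compared on broadcast networks
    hello_ivl: int
    dead_ivl: int
    auth_type: str   # for example "none", "simple", or "md5"
    auth_key: str


def adjacency_blockers(local: HelloParams, remote: HelloParams):
    """Return the names of the parameters that differ."""
    return [f.name for f in fields(HelloParams)
            if getattr(local, f.name) != getattr(remote, f.name)]


# Area, mask, and timers from the mx2 trace; auth values are made up.
mx2 = HelloParams("0.0.0.0", "255.255.255.252", 10, 40, "md5", "secret-b")
mx1 = HelloParams("0.0.0.0", "255.255.255.252", 10, 40, "md5", "secret-a")

blockers = adjacency_blockers(mx2, mx1)
print(blockers)
```

In practice the remote side's key is not visible on the wire with MD5, which is why the error trace flag, rather than a packet capture, exposed the mismatch here.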

Bridging in the Control Plane (1 of 3)

 MX Series bridging processes:
    user@mx> restart l2?
    Possible completions:
      l2-learning          Layer 2 address flooding and learning process
      l2cpd-service        Layer 2 Control Protocol process

 EX Series bridging processes:
    user@ex> restart ethernet-switching
    Ethernet Switching Process signalled but still running, waiting 8 seconds more
    Ethernet Switching Process started, pid 17987

    user@ex> restart lldpd-service
    Link Layer Discovery Protocol started, pid 17989

Bridging in the Control Plane (2 of 3)

 MX Series and EX Series bridge tables:
    user@mx> show bridge mac-table
    MAC flags (S -static MAC, D -dynamic MAC,
               SE -Statistics enabled, NM -Non configured MAC)

    Routing instance : default-switch
     Bridging domain : lab, VLAN : NA
       MAC address          MAC flags   Logical interface
       50:c5:8d:87:8c:86    D           ge-1/0/2.0

    user@ex> show ethernet-switching table
    Ethernet-switching table: 6 entries, 4 learned
      VLAN    MAC address          Type     Age  Interfaces
      v100    *                    Flood      -  All-members
      v100    00:0c:29:73:13:fe    Learn      0  ge-0/0/14.0

Bridging in the Control Plane (3 of 3)

 Clearing bridge entries:
    MX Series
        user@mx> clear bridge mac-table ?
        Possible completions:
          <[Enter]>            Execute this command
          <address>            MAC address
          bridge-domain        Name of bridging domain, or 'all'
          instance             Display information for a specified instance
          interface            Clear media access control table for specified interface
          isid                 Clear MAC address learned on a specified ISID
          logical-system       Name of logical system, or 'all'
          vlan-id              Clear MAC address learned on a specified VLAN (0..4095)

    EX Series
        user@ex> clear ethernet-switching table

ARP Overview

 ARP associates IP addresses with Layer 2 addresses in an ARP table
    Once a routing issue is isolated to a broadcast segment, monitor the ARP process for a local problem

[Figure: Host X and Host Y connected across segments whose interfaces are numbered .1 and .2.]

Host X ARP Table: 192.168.30.2 = 02:00:54:55:4E:01

Working with the ARP Table (1 of 2)

 Display the ARP table:
    user@mx> show arp
    MAC Address       Address         Name            Interface    Flags
    80:71:1f:c3:0d:ff 10.210.15.5     10.210.15.5     fxp0.0       none
    00:1b:21:28:4f:54 10.210.15.24    10.210.15.24    fxp0.0       none
    Total entries: 2

 Manually clear the ARP entry:
    user@mx> clear arp hostname 10.210.15.5
    10.210.15.5 deleted
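When checking many entries, it can help to load the ARP table into a lookup keyed by IP address. The following is a small sketch under the assumption that each data row starts with a MAC address followed by the address, name, and interface columns, as in the output on the slide; `parse_arp_table` is a hypothetical helper name.

```python
def parse_arp_table(output: str) -> dict:
    """Map IP address -> {"mac": ..., "interface": ...} from show arp text."""
    table = {}
    for line in output.splitlines():
        columns = line.split()
        # Data rows begin with a MAC address (six octets, five colons);
        # the header and the "Total entries" line do not.
        if len(columns) >= 4 and columns[0].count(":") == 5:
            mac, address, _name, interface = columns[:4]
            table[address] = {"mac": mac.lower(), "interface": interface}
    return table


sample = """\
MAC Address       Address         Name            Interface    Flags
80:71:1f:c3:0d:ff 10.210.15.5     10.210.15.5     fxp0.0       none
00:1b:21:28:4f:54 10.210.15.24    10.210.15.24    fxp0.0       none
Total entries: 2
"""
arp = parse_arp_table(sample)
print(arp["10.210.15.5"])
```

A lookup like this makes it straightforward to compare the MAC address the router has learned against the address the host actually uses.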

Working with the ARP Table (2 of 2)

 Monitoring the ARP process:
    user@mx> monitor traffic interface ge-1/0/0.141 no-resolve
    verbose output suppressed, use <detail> or <extensive> for full protocol decode
    Address resolution is OFF.
    Listening on ge-1/0/0.141, capture size 96 bytes

    06:03:58.441952 Out arp who-has 172.18.1.1 tell 172.18.1.2
    06:03:58.442425  In arp reply 172.18.1.1 is-at 50:c5:8d:87:8c:84
    ^C
    2 packets received by filter
    0 packets dropped by kernel