NE40E V800R010C00 Feature Description - QoS
V800R010C00
Issue 02
Date 2018-06-20
Huawei and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd.
All other trademarks and trade names mentioned in this document are the property of their respective
holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the
customer. All or part of the products, services and features described in this document may not be within the
purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information,
and recommendations in this document are provided "AS IS" without warranties, guarantees or
representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the
preparation of this document to ensure accuracy of the contents, but all statements, information, and
recommendations in this document do not constitute a warranty of any kind, express or implied.
Website: http://www.huawei.com
Email: support@huawei.com
Contents
3 DiffServ Overview
3.1 DiffServ Model
3.2 DSCP and PHB
3.3 Components in the DiffServ Model
9 MPLS QoS
9.1 MPLS QoS Overview
9.2 MPLS DiffServ
9.3 MPLS HQoS
9.3.1 Implementation Principle
9.3.2 Application
11 L2TP QoS
11.1 Introduction to L2TP QoS
11.2 Principles
11.2.1 Principles
Purpose
This document describes the QoS feature in terms of its overview, principles, and
applications.
Related Version
The following table lists the product versions related to this document.
Product | Version
U2000 | V200R017C60
eSight | V300R009C00
Intended Audience
This document is intended for:
- Network planning engineers
- Commissioning engineers
- Data configuration engineers
- System maintenance engineers
Security Declaration
- Encryption algorithm declaration
  The encryption algorithms DES/3DES/RSA (RSA-1024 or lower)/MD5 (in digital signature scenarios and password encryption)/SHA1 (in digital signature scenarios) have low security, which may bring security risks. If protocols allow, using more secure encryption algorithms, such as AES/RSA (RSA-2048 or higher)/SHA2/HMAC-SHA2, is recommended.
- Password configuration declaration
  – Do not set both the start and end characters of a password to "%^%#". This causes the password to be displayed directly in the configuration file.
  – To further improve device security, periodically change the password.
- Personal data declaration
  Your purchased products, services, or features may use some of users' personal data during service operation or fault locating. You must define user privacy policies in compliance with local laws and take proper measures to fully protect personal data.
- Feature declaration
  – The NetStream feature may be used to analyze the communication information of terminal customers for network traffic statistics and management purposes. Before enabling the NetStream feature, ensure that it is used within the boundaries permitted by applicable laws and regulations. Effective measures must be taken to ensure that information is securely protected.
  – The mirroring feature may be used to analyze the communication information of terminal customers for maintenance purposes. Before enabling the mirroring function, ensure that it is used within the boundaries permitted by applicable laws and regulations. Effective measures must be taken to ensure that information is securely protected.
  – The packet header obtaining feature may be used to collect or store some communication information about specific customers for transmission fault and error detection purposes. Huawei cannot offer services to collect or store this information unilaterally. Before enabling the function, ensure that it is used within the boundaries permitted by applicable laws and regulations. Effective measures must be taken to ensure that information is securely protected.
- Reliability design declaration
  Network planning and site design must comply with reliability design principles and provide device- and solution-level protection. Device-level protection includes dual-network and inter-board dual-link planning principles to avoid single points of failure. Solution-level protection refers to fast convergence mechanisms, such as FRR and VRRP.
Special Declaration
- This document serves only as a guide. The content is written based on device information gathered under lab conditions. The content provided by this document is intended to be taken as general guidance, and does not cover all scenarios. The content provided by this document may be different from the information on user device interfaces due to factors such as version upgrades and differences in device models, board restrictions, and configuration files. The actual user device information takes precedence over the content provided by this document. The preceding differences are beyond the scope of this document.
- The maximum values provided in this document are obtained in specific lab environments (for example, only a certain type of board or protocol is configured on a tested device). The actually obtained maximum values may be different from the maximum values provided in this document due to factors such as differences in hardware configurations and carried services.
- Interface numbers used in this document are examples. Use the existing interface numbers on devices for configuration.
- The pictures of hardware in this document are for reference only.
Symbol Conventions
The symbols that may be found in this document are defined as follows.
Change History
Updates between document issues are cumulative. Therefore, the latest document issue
contains all updates made in previous issues.
- Changes in Issue 03 (2018-04-10)
  This issue is the third official release. The software version of this issue is V800R010C00SPC200.
- Changes in Issue 02 (2018-02-28)
  This issue is the second official release. The software version of this issue is V800R010C00SPC200.
- Changes in Issue 01 (2017-11-30)
  This issue is the first official release. The software version of this issue is V800R010C00SPC100.
2 What Is QoS
(Figure: diversified Internet services, such as IP calls, e-commerce, multimedia games, and online movies)
Diversified services enrich users' lives but also increase the risk of traffic congestion on the
Internet. In the case of traffic congestion, services can encounter long delays or even packet
loss. As a result, services deteriorate or even become unavailable. Therefore, a solution to
resolve traffic congestion on the IP network is urgently needed.
The most direct way to resolve traffic congestion is to increase network bandwidth. However, increasing bandwidth everywhere is impractical in terms of operation and maintenance costs.
Quality of service (QoS) technology, which uses policies to manage traffic congestion at a low cost, was therefore introduced. QoS aims to provide end-to-end service guarantees for differentiated services and has played a vital role on the Internet. Without QoS, service quality cannot be guaranteed.
QoS is measured using the following metrics:
- Bandwidth/throughput
- Delay
- Delay variation (jitter)
- Packet loss rate
Bandwidth/Throughput
Bandwidth, also called throughput, refers to the maximum number of bits that can be transmitted between two ends within a specified period (1 second), or the average rate at which specific data flows are transmitted between two network nodes. Bandwidth is expressed in bit/s.
NOTE
Two concepts, upstream rate and downstream rate, are closely related to bandwidth. The upstream rate
refers to the rate at which users can send or upload information to the network, and the downstream rate
refers to the rate at which the network sends data to users. For example, the rate at which users upload
files to the network is determined by the upstream rate, and the rate at which users download files is
determined by the downstream rate.
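For example, the following calculation, using assumed figures, shows how the downstream rate bounds download time:

File size: 500 MB = 500 x 8 Mbit = 4000 Mbit
Downstream rate: 20 Mbit/s
Minimum download time: 4000 Mbit / 20 Mbit/s = 200 seconds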
Delay
A delay refers to the period of time during which a packet is transmitted from a source to its
destination.
Use voice transmission as an example. A delay refers to the period during which words are
spoken and then heard. If a long delay occurs, voices become unclear or interrupted.
Most users are insensitive to a delay of less than 100 ms. If a delay ranging from 100 ms to
300 ms occurs, the speaker can sense slight pauses in the responder's reply, which can seem
annoying to both. If a delay greater than 300 ms occurs, both the speaker and responder
obviously sense the delay and have to wait for responses. If the speaker cannot wait but
repeats what has been said, voices overlap, and the quality of the conversation deteriorates
severely.
Jitter
If packets of the same flow traverse the network with different delays, the variation among those delays is called jitter.
(Figure: packets of one voice flow experiencing different one-way delays, for example D1 = 50 ms, D2 = 50 ms, D3 = 10 ms, D4 = 40 ms, D5 = 90 ms, and D6 = 90 ms, so that the spoken words arrive unevenly at the listener)
Jitter also affects protocol packet transmission. Some protocol packets are transmitted at fixed intervals. If jitter is high, such protocols flap between the Up and Down states, adversely affecting service quality.
Jitter exists on all networks. Service quality is not affected as long as jitter stays within a specific tolerance. Buffering can smooth out excess jitter, but at the cost of longer delays.
Table 2-1 Service performance requirements (Table I.1 and Table I.2 in ITU-T G.1010)

Medium | Application | Symmetry | Typical Rate | One-way Delay | Jitter | Information Loss
Voice | Voice phone | Two-way | 4-64 kbit/s | <150 ms preferred, <400 ms limit | <1 ms | <3% packet loss ratio (PLR)
Conversational/Real-time voice | Conversational voice | Two-way | 4-25 kbit/s | <150 ms (recommended), <400 ms (upper limit) | <1 ms | <3% FER
Conversational/Real-time video | Video phone | Two-way | 32-384 kbit/s | <150 ms (recommended), <400 ms (upper limit), <100 ms (voice and image synchronization) | ≤10 ms | <1% FER (generally, depending on terminal performance), <5% FER (upper limit)
Interactive voice | Voice messaging | Mainly one-way | 4-13 kbit/s | <1 s (playback), <2 s (recording) | <1 ms | <3% FER
3 DiffServ Overview
(Figure: DiffServ network model. User networks connect to DS domains through boundary nodes. Boundary nodes perform service classification and aggregation, interior nodes perform PHB-based forwarding, and different PHBs in different DS domains are coordinated based on the SLA/TCA.)
- DiffServ (DS) node: a network node that implements the DiffServ function.
- DS boundary node: connects to another DS domain or a non-DS-aware domain. The DS boundary node classifies and manages incoming traffic.
- DS interior node: connects to DS boundary nodes and other interior nodes in one DS domain. DS interior nodes implement simple traffic classification based on DSCP values, and manage traffic.
- DS domain: a contiguous set of DS nodes that adopt the same service policy and per-hop behavior (PHB). One DS domain covers one or more networks under the same administration. For example, a DS domain can be an ISP's networks or an organization's intranet. For an introduction to PHB, see the next section.
- DS region: consists of one or more adjacent DS domains. Different DS domains in one DS region may use different PHBs to provide differentiated services. The service level agreement (SLA) and traffic conditioning agreement (TCA) are used to allow for differences between PHBs in different DS domains. The SLA or TCA specifies how to maintain consistent processing of a data flow from one DS domain to another.
- SLA: the services that the ISP promises to provide for individual users, enterprise users, or adjacent ISPs that need intercommunication. The SLA covers multiple dimensions, including the accounting protocol. The service level specification (SLS) provides a technical description of the SLA. The SLS focuses on the traffic control specification (TCS) and provides detailed performance parameters, such as the committed information rate (CIR), peak information rate (PIR), committed burst size (CBS), and peak burst size (PBS).
(Figure: location of the DS field. In an IPv4 header, the 8-bit ToS field, which follows the Version and Header Length fields, originally carried a 3-bit Precedence field plus the D, T, R, and C bits. In an IPv6 header, the 8-bit Traffic Class field and the 20-bit Flow Label follow the Version field.)
In an IPv4 packet, the six left-most bits (0 to 5) in the DS field are defined as the DSCP value,
and the two right-most bits (6 and 7) are reserved bits. Bits 0 to 2 are the Class Selector Code
Point (CSCP) value, indicating a class of DSCP. Devices that support the DiffServ function
perform forwarding behaviors for packets based on the DSCP value.
In IPv6 packet headers, two fields are related to QoS: TC and Flow Label (FL). The TC field
contains eight bits and functions the same as the ToS field in IPv4 packets to identify the
service type. The FL field contains 20 bits and identifies packets in the same data flow. The
FL field, together with the source and destination addresses, uniquely identifies a data flow.
All packets in one data flow share the same FL field, and devices can rapidly process packets
in the same data flow.
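As a worked example, consider the standard EF code point:

DSCP for EF = 101110 (binary) = 46 (decimal)
CSCP (bits 0 to 2) = 101, that is, class 5
Bits 6 and 7 are reserved and are not part of the DSCP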
PHB
Per-hop Behavior (PHB) is a description of the externally observable forwarding treatment
applied at a differentiated services-compliant node to a behavior aggregate. A DS node
performs the same PHB for packets with the same DSCP value. The PHB defines some
forwarding behaviors but does not specify the implementation mode.
At present, the IETF defines four types of PHBs: Class Selector (CS), Expedited Forwarding
(EF), Assured Forwarding (AF), and best-effort (BE). BE PHB is the default.
PHB | Applications
CS6 and CS7 | CS6 and CS7 PHBs are used for protocol packets by default, such as OSPF and BGP packets. If these packets are not forwarded, protocol services are interrupted.
EF | EF PHB is used for voice services. Voice services require a short delay, low jitter, and a low packet loss rate, and are second only to protocol packets in terms of importance. NOTE: The bandwidth dedicated to EF PHB must be restricted so that other services can use the remaining bandwidth.
AF3 | AF3 PHB is used for BTV services of IPTV. Live programs are real-time services, requiring continuous bandwidth and a large throughput guarantee.
AF2 | AF2 PHB is used for VoD services of IPTV. VoD services require lower real-time performance than BTV services and allow delays or buffering.
AF1 | AF1 PHB is used for leased-line services, which are second to IPTV and voice services in terms of importance. Bank-based premium services, one type of leased-line service, can use the AF4 or even EF PHB.
BE | BE PHB applies to best-effort services on the Internet, such as email and Telnet services.
- Traffic classification and marking: Traffic marking refers to external re-marking, which is implemented on outgoing packets. Re-marking modifies the priority field of packets to relay QoS information to the next-hop device. Internal marking is used for internal processing and does not modify packets. Internal marking is implemented on incoming packets so that the device can process the packets based on the marks before forwarding them. The concept of internal marking is discussed later in this document.
- Policing and shaping: restricts the traffic rate to a specific value. When traffic exceeds the specified rate, traffic policing drops the excess traffic, whereas traffic shaping buffers it.
- Congestion management: places packets in queues for buffering when traffic congestion occurs and determines the forwarding order based on a specific scheduling algorithm.
- Congestion avoidance: monitors network resources. When network congestion intensifies, the device proactively drops packets to regulate traffic so that the network is not overloaded.
The four QoS components are performed in a specific order, as shown in the following figure.
The QoS components are performed at different locations on the network, as shown in the
following figure. In principle, traffic classification, traffic re-marking, and traffic policing are
implemented on the inbound user-side interface, and traffic shaping is implemented on the
outbound user-side interface (if packets of various levels are involved, queue scheduling and a
packet drop policy must be configured on the outbound user-side interface). Congestion
management and congestion avoidance are configured on the outbound network-side
interface.
(Figure: QoS deployment locations on a carrier network. Home devices such as PCs, VoIP phones, and STBs connect through the HG/ONT and the broadband access network (DSLAM, OLT, LSW) to the BRAS and PE/SR; enterprise CEs connect through leased lines; P/CR devices and the IGW connect the network to the Internet. Incoming traffic: traffic classification/marking and traffic policing. Outgoing traffic on the user side: congestion management, congestion avoidance, and traffic shaping. Outgoing traffic on the network side: congestion management and congestion avoidance.)
QoS provides the following three service models:
- Best-Effort
- Integrated service (IntServ)
- Differentiated service (DiffServ)
Best-Effort
Best-Effort is the default service model on the Internet and applies to various network
applications, such as FTP and email. It is the simplest service model. Without network
approval or notification, an application can send any number of packets at any time. The
network then makes its best attempt to send the packets but does not provide any guarantee
for performance.
The Best-Effort model applies to services that have low requirements for delay and reliability.
IntServ
Before sending a packet, IntServ uses signaling to apply for a specific level of service from
the network. The application first notifies the network of its traffic parameters and specific
service qualities, such as bandwidths and delays. After receiving a confirmation that sufficient
resources have been reserved, the application sends the packets. The network maintains a state
for each packet flow and executes QoS behaviors based on this state to fulfill the promise
made to the application. The packets must be controlled within the range described by the
traffic parameters.
IntServ uses the Resource Reservation Protocol (RSVP) as its signaling, which is similar to an Asynchronous Transfer Mode switched virtual circuit (ATM SVC), and adopts connection-oriented transmission. RSVP is a transport-layer protocol but does not transmit application-layer data. Like ICMP, RSVP functions as a network control protocol and transmits resource reservation messages between nodes.
When RSVP is used for end-to-end communication, the routers including the core routers on
the end-to-end network maintain a soft state for each data flow. A soft state is a temporary
state that refreshes periodically using RSVP messages. Routers check whether sufficient
resources can be reserved based on these RSVP messages. The path is available only when all
involved routers can provide sufficient resources.
IntServ uses RSVP to apply for resources over the entire network, requiring that all nodes on
the end-to-end network support RSVP. In addition, each node periodically exchanges state
information with its neighbor, consuming a large number of resources. More importantly, all
nodes on the network maintain a state for each data flow. On the backbone network, however,
there are millions of data flows. Therefore, the IntServ model applies to edge networks and
does not widely apply to the backbone network.
DiffServ
DiffServ classifies packets on the network into multiple classes for differentiated processing.
When traffic congestion occurs, classes with a higher priority are given preference. This
function allows packets to be differentiated and to have different packet loss rates, delays, and
jitters. Packets of the same class are aggregated and sent as a whole to ensure the same delay,
jitter, and packet loss rate.
In the DiffServ model, edge routers classify and aggregate traffic. Edge routers classify packets based on a combination of fields, such as the source and destination addresses of packets, the precedence in the ToS field, and the protocol type. Edge routers also re-mark packets with different priorities, which other routers can identify for resource allocation and traffic control. Therefore, DiffServ is a class-based QoS model.
(Figure: DiffServ model. Traffic from PCs, VoIP phones, and IPTV STBs is classified and marked on the boundary node of the DiffServ domain; interior nodes allocate resources and control traffic based on the marking, using PHBs.)
Compared with IntServ, DiffServ requires no signaling. In the DiffServ model, an application does not need to apply for network resources before transmitting packets. Instead, the application notifies the network nodes of its QoS requirements by setting QoS parameters in IP packet headers. The network does not maintain a state for each data flow, but provides differentiated services based on the QoS parameters carried in each packet.
DiffServ takes full advantage of network flexibility and extensibility and transforms
information in packets into per-hop behaviors, greatly reducing signaling operations.
Therefore, DiffServ not only adapts to Internet service provider (ISP) networks but also
accelerates IP QoS applications on live networks.
(Figure: board components and traffic directions. In the upstream direction, packets flow through the PIC, PFE, TM, and FIC; in the downstream direction, through the FIC, TM, PFE, and PIC or eTM.)
Abbreviations:
PIC: Physical Interface Controller
PFE: Packet Forward Engine
TM: Traffic Manager
FIC: Fabric Interface Controller
eTM: Extra Traffic Manager
Figure 5-2 Packet forwarding process when the PIC is not equipped with an eTM subcard
(Figure: on the upstream board, packets flow from the PIC to the upstream PFE (NP/ASIC), the upstream TM, and the upstream FIC, and then to the switched network. On the downstream board, packets flow from the downstream FIC to the downstream TM, the downstream PFE (NP/ASIC), and the PIC. Packets destined for the CPU are rate-limited by CP-CAR.)
Interface-based CAR does not apply to packets destined for the CPU, which prevents such packets from being dropped in the case of traffic congestion. For packets to be sent to the CPU, the upstream PFE independently implements CP-CAR.
f. The upstream PFE sends packets to the upstream TM.
g. The upstream TM processes flow queues (optional) based on the user-queue
configuration on the inbound interface or in the MF classification profile, and then
implements VOQ processing. After that, the upstream TM sends packets to the
upstream Flexible Interface Card (FIC).
h. The upstream FIC fragments packets and encapsulates them into micro cells before
sending them to the switched network.
NOTE
Similar to an ATM module, the switched network forwards packets based on a fixed cell
length. Therefore, packets are fragmented before being sent to the switched network.
- Packet forwarding process for downstream traffic
Micro cells are sent from the switched network to the downstream FIC.
a. The downstream FIC encapsulates the micro cells into packets again.
b. The downstream TM duplicates multicast packets.
c. The downstream TM processes flow queues based on the user-queue configuration
on the outbound interface (including the VLANIF interface) if needed, and
processes class queues (CQs) before sending them to the downstream PFE.
d. The downstream PFE searches the forwarding table for packet encapsulation information. For example, for an IPv4 packet, the PFE searches the forwarding table based on the next hop. For an MPLS packet, the PFE searches the MPLS forwarding table.
e. The downstream PFE implements MF classification based on the outbound
interface configuration and then BA traffic classification (only mapping from the
service class and drop precedence to the external priority).
f. The downstream PFE implements rate limiting for downstream traffic based on the CAR configuration on the outbound interface or in the MF traffic classification profile.
g. For packets to be sent to the CPU, the downstream PFE implements CP-CAR
before sending them to the CPU. For packets not to be sent to the CPU, the
downstream PFE sends them to the outbound interface processing module for an
additional Layer 2 header (Layer 2 header and MPLS header are added for an
MPLS packet). After that, these packets are sent to the PIC.
h. The PIC converts packets to optical/electrical signals and sends them to the physical
link.
Figure 5-3 shows how a packet is forwarded when the PIC is equipped with an eTM subcard. The operation for upstream traffic is the same as when the PIC is not equipped with an eTM subcard. For downstream traffic, the difference is that the downstream flow queues are processed on the eTM subcard when the PIC is equipped with one, and on the downstream TM when it is not. In addition, five-level scheduling (FQ -> SQ -> GQ -> VI -> port) is implemented for downstream flow queues when the PIC is equipped with an eTM subcard, whereas three-level scheduling plus two-level scheduling is implemented when it is not.
Figure 5-3 Packet forwarding process when the PIC is equipped with an eTM subcard
(Figure: same board layout as in Figure 5-2, with an eTM subcard between the downstream PFE and the PIC. The internal priority of a packet consists of its service class and color.)
- On the upstream TM:
a. The upstream TM processes flow queues based on the user-queue configuration. Packets are put into different flow queues based on the service class, and the WRED drop policy is implemented for flow queues based on the color if needed.
b. The upstream TM processes VOQs. VOQs are classified based on the destination
board. The information about the destination board is obtained based on the
outbound interface of packets. Then, packets are put into different VOQs based on
the service class.
c. After being scheduled in VOQs, packets are sent to the switched network and then
forwarded to the destination board on which the outbound interface is located.
d. Then, packets are sent to the downstream TM.
- On the downstream TM:
a. (This step is skipped when the downstream PIC is equipped with an eTM subcard)
The downstream TM processes flow queues based on the user-queue configuration
on the outbound interface. Packets are put into different flow queues based on the
service class, and WRED drop policy is implemented for flow queues based on the
color if needed.
b. (This step is skipped when the downstream PIC is equipped with an eTM subcard)
The downstream TM processes port queues (CQs). Packets are put into different
CQs based on the service class, and WRED drop policy is implemented for CQs
based on the color if WRED is configured.
c. Then, packets are sent to the downstream PFE.
- On the downstream PFE:
a. The downstream PFE implements MF traffic classification based on the outbound
interface configuration. MF traffic classification requires the downstream PFE to
obtain multiple field information for traffic classification. Behaviors, such as filter
and re-mark, are performed based on traffic classification results. If the behavior is
re-mark, the downstream PFE modifies the internal priority of packets (service class
and color).
b. The downstream PFE implements CAR for packets based on the outbound interface
configuration or MF traffic classification configuration. If both interface-based
CAR and MF traffic classification-based CAR are configured, MF traffic
classification-based CAR takes effect. In a CAR operation, a pass, drop, or pass+re-
mark behavior can be performed for incoming traffic. If the behavior is pass+re-
mark, the downstream PFE modifies the internal priority of packets (service class
and color).
c. The priorities of outgoing packets are set for newly added packet headers and are
modified for existing packet headers, based on the service class and color.
d. Then, packets are sent to the downstream PIC.
– When the PIC is not equipped with an eTM subcard, the PIC adds the link-layer CRC to the packets before sending them to the physical link.
– When the PIC is equipped with an eTM subcard, the PIC adds the link-layer CRC to the packets and performs a round of flow queue scheduling before sending the packets to the physical link. Downstream flow queues are processed based on the user-queue configuration on the outbound interface. Packets are put into different FQs based on the service class, and the WRED drop policy is implemented for FQs based on the color if WRED is configured. When the PIC is equipped with an eTM subcard, downstream packets are not scheduled on the downstream TM.
(Figure: packet format on the downstream TM. The packet consists of an L2 Header (14 bytes), Data (46-1500 bytes), and CRC (4 bytes), plus an NPtoTM field (4 bytes); one field exists only when the downstream PIC is equipped with an eTM.)
NOTE
- CAR calculates the bandwidth of packets based on the entire packet. For example, CAR counts the length of the frame header and CRC field, but not the preamble, inter-frame gap (IFG), or SFD of an Ethernet frame, in the bandwidth. The following illustrates a complete Ethernet frame (in bytes):
IFG (minimum 12) | Preamble (7) | SFD (1) | Destination MAC (6) | Source MAC (6) | Type (2) | Payload (46 to 1500) | CRC (4)
The bandwidth covers the CRC field but not the IFG field.
- The upstream PFE adds a Frame Header, which is removed by the downstream PFE. The Frame Header is used to transfer information between chips. The NPtoTM and TMtoNP fields are used to transfer information between the NP and TM.
- When the PIC is not equipped with an eTM subcard, the length of a packet scheduled on the downstream TM is different from that of the packet sent to the link. To perform traffic shaping accurately, you must run the network-header-length command to compensate the packet with a specific length.
On the downstream interface on the network side:
- When the downstream TM implements traffic shaping for packets, the TMtoNP and Frame Header field values of the packets are not counted. Therefore, compared with the packet sent to the link, the packet scheduled on the downstream TM does not contain the IFG, L2 Header (14 bytes), two MPLS labels, or CRC fields. A +26-byte compensation (including the L2 header, two MPLS labels, and CRC field, but not the IFG field) or a +46-byte compensation (also including the IFG field) can be performed for the packet.
- When the PIC is equipped with an eTM subcard, no packet length compensation is required.
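The +26-byte and +46-byte values follow from summing the missing fields:

L2 header (14) + two MPLS labels (2 x 4 = 8) + CRC (4) = 26 bytes
IFG (12) + preamble (7) + SFD (1) = 20 bytes
26 + 20 = 46 bytes (compensation including the IFG field)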
In Layer 2 Ethernet forwarding scenarios, a data frame can be VLAN-tagged, QinQ-tagged, or untagged. Use a VLAN-tagged frame as an example. In Layer 2 forwarding, both the Layer 2 Ethernet frame header and the VLAN tag of a packet are forwarded to the downstream TM, and only the CRC field is removed. When the downstream TM implements traffic shaping for packets, the TMtoNP and Frame Header field values are not counted. Therefore, compared with the packet sent to the link, the packet scheduled on the downstream TM does not contain the CRC field. A +4-byte compensation (not including the IFG field) or a +24-byte compensation (including the IFG field) can be performed for the packet.
For more details, see Incoming packet in sub-interface accessing L3VPN networking.
Figure 5-13 Outgoing L3VPN packet on the user side of the PE in QinQ interface accessing
L3VPN networking
Figure 5-14 Outgoing L3VPN packet on the user side of the PE in POS interface accessing
L3VPN networking
In VLAN mapping scenarios, both the Layer 2 Ethernet frame header and the VLAN tag of a packet are forwarded to the downstream TM, and only the CRC field is removed. The VLAN tag value is replaced with a new VLAN tag value.
When the downstream TM implements traffic shaping for packets, the TMtoNP and Frame Header field values are not counted. Therefore, compared with the packet sent to the link, the packet scheduled on the downstream TM does not contain the CRC field. A +4-byte compensation (not including the IFG field) or a +24-byte compensation (including the IFG field) can be performed for the packet.
For more details, see Incoming packet in sub-interface accessing L3VPN networking.
Figure 5-16 Outgoing packet in POS interface accessing VLL heterogeneous interworking
scenarios
NOTE
In VLL heterogeneous interworking scenarios, both the L2 header and the MPLS label of a packet are removed on the upstream TM.
- When the downstream TM implements traffic shaping for packets, the TMtoNP and Frame Header field values of the packets are not counted. Therefore, compared with the packet sent to the link, the packet scheduled on the downstream TM does not contain the PPP header. A +8-byte compensation can be performed for the packet.
- When the PIC is equipped with an eTM subcard, no packet length compensation is required.
Figure 5-17 Outgoing packet in POS interface accessing VLL homogeneous interworking
scenarios
NOTE
In VLL homogeneous interworking scenarios, both the L2 header and the MPLS label of a packet are removed on the upstream TM.
When the downstream TM implements traffic shaping for packets, the TMtoNP and Frame Header field values of the packets are not counted. Therefore, compared with the packet sent to the link, the packet scheduled on the downstream TM does not contain the PPP header. A +8-byte compensation can be performed for the packet.
After packets are classified at the DiffServ domain edge, internal nodes provide differentiated
services for classified packets. A downstream node can accept and continue the upstream
classification or classify packets based on its own criteria.
Traffic Behaviors
A traffic classifier is configured to provide differentiated services and must be associated with
a certain traffic control or resource allocation behavior, which is called a traffic behavior.
The following table describes the traffic behaviors that can be implemented individually or jointly for classified packets on an NE40E.

Traffic Behavior | Description
Traffic policing | Restricts the traffic rate to a specific value. When traffic exceeds the specified rate, the excess traffic is dropped.
Congestion management | Places packets in queues for buffering. When traffic congestion occurs, the device determines the forwarding order based on a specific scheduling algorithm and performs traffic shaping for outgoing traffic to meet users' network performance requirements.
Packet filtering | Functions as the basic traffic control method. The device determines whether to drop or forward packets based on traffic classification results.
URPF (Unicast Reverse Path Forwarding) | Prevents source address spoofing attacks. URPF obtains the source IP address and the inbound interface of a packet and checks them against the forwarding table. If the source IP address is not found, URPF considers it a pseudo address and drops the packet.
Flow mirroring | Allows a device to copy an original packet from a mirrored port and send the copy to the observing port.
Modifying the TTL value | Modifies the Time To Live (TTL) value in IP packet headers.
The EXP field is 3 bits long and indicates precedence. The value ranges from 0 to 7 with a
larger value reflecting a higher precedence.
The Precedence field in an IP header is also 3 bits long. Therefore, each precedence value in an IP header corresponds to exactly one precedence value in an MPLS header. However, the DSCP field in an IP header is 6 bits long, unlike the 3-bit EXP field, so multiple DSCP values map to one EXP value: the three left-most bits of the DSCP field (the CSCP value) correspond to the EXP value, regardless of what the three right-most bits are.
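For example, the AF4x code points share the same three left-most bits and therefore map to the same EXP value:

AF41 = 100010 (34), AF42 = 100100 (36), AF43 = 100110 (38)
Three left-most bits = 100 = 4, so all three map to EXP 4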
The PRI field is 3 bits long and indicates precedence. The value ranges from 0 to 7 with a
larger value reflecting a higher precedence.
Table 6-1 Mapping between the 802.1p/IP precedence value and applications

802.1p/IP Precedence | Typical Applications
5 | Voice streams
4 | Video conferencing
3 | Call signaling
6.3 BA Classification
Service-class
Service-class refers to the internal service class of packets. Eight service-class values are
available: class selector 7 (CS7), CS6, expedited forwarding (EF), assured forwarding 4
(AF4), AF3, AF2, AF1, and best effort (BE). Service-class determines the type of queues to
which packets belong.
The priority of queues with a specific service-class is calculated based on scheduling
algorithms.
- If the queues of all eight service-class values use priority queuing (PQ) scheduling, the queues are served in descending order of priority: CS7 > CS6 > EF > AF4 > AF3 > AF2 > AF1 > BE.
- If the BE queue uses PQ scheduling (rare on live networks) and the other seven queues use weighted fair queuing (WFQ) scheduling, the BE queue has the highest priority.
- If the queues of all eight service-class values use WFQ scheduling, priorities are irrelevant; bandwidth is allocated based on the WFQ weights.
NOTE
More details about queue scheduling are provided later in this document.
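The following minimal configuration sketch illustrates the scheduling combinations above (the template name fq-test and the rate parameters are hypothetical, and command formats may vary by version):

flow-queue fq-test
 queue ef pq                //EF traffic is always scheduled first.
 queue af1 wfq weight 10    //AF1 and BE share the remaining bandwidth
 queue be wfq weight 5      //in a 10:5 ratio.
#
interface GigabitEthernet1/0/0
 user-queue cir 10000 pir 20000 flow-queue fq-test outbound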
Color
Color, referring to the drop precedence of packets on a device, determines the order in which packets in one queue are dropped when traffic congestion occurs. As defined in the relevant IETF standards, the color of a packet can be green, yellow, or red.
Drop precedences are compared based on the configured parameters. For example, if a maximum of 100% of the buffer is allowed to store green packets whereas only 50% of the buffer is allowed to store red packets, red packets are dropped earlier; that is, the drop precedence of red packets is higher than that of green packets.
The IETF defines eight PHBs (CS7, CS6, EF, AF4, AF3, AF2, AF1, and BE) and further defines three drop precedences for each of the four AF PHBs. Therefore, the total number of PHB values is 16 (4 + 4 x 3 = 16). There are 64 DSCP values, allowing each PHB to correspond to a DSCP value. However, there are only eight 802.1p values, so some PHBs have no corresponding 802.1p value. Generally, the eight 802.1p values correspond to the eight scheduling precedences. IEEE 802.1ad defines the S-TAG and C-TAG formats; the S-TAG supports the Drop Eligible Indicator (DEI), whereas the C-TAG does not. IEEE 802.1ad provides a 3-bit Priority Code Point (PCP) field that applies to both the C-TAG and S-TAG to specify the scheduling and drop precedences. PCP allows an 802.1p value to indicate both the scheduling and drop precedences, and introduces the concepts of 8P0D, 7P1D, 6P2D, and 5P3D. The letter P indicates the scheduling precedence, and the letter D indicates the drop precedence. For example, 5P3D supports five scheduling precedences and three drop precedences.
The default and 5P3D domains exist by default and cannot be deleted, and only the default domain can be modified.
Table 6-2 Default mapping from the DSCP value to the service-class and color

DSCP | Service-class | Color
8 | AF1 | Green
9 | BE | Green
10 | AF1 | Green
11 | BE | Green
12 | AF1 | Yellow
13 | BE | Green
14 | AF1 | Red
15 | BE | Green
16 | AF2 | Green
17 | BE | Green
18 | AF2 | Green
19 | BE | Green
20 | AF2 | Yellow
21 | BE | Green
22 | AF2 | Red
23 | BE | Green
24 | AF3 | Green
25 | BE | Green
26 | AF3 | Green
27 | BE | Green
28 | AF3 | Yellow
29 | BE | Green
30 | AF3 | Red
31 | BE | Green
32 | AF4 | Green
33 | BE | Green
34 | AF4 | Green
35 | BE | Green
36 | AF4 | Yellow
37 | BE | Green
38 | AF4 | Red
39 | BE | Green
40 | EF | Green
41~45 | BE | Green
46 | EF | Green
47 | BE | Green
48 | CS6 | Green
49~55 | BE | Green
56 | CS7 | Green
57~63 | BE | Green
Table 6-3 Default mapping from the service-class and color to the DSCP value
Service-class | Color | DSCP
BE | Green | 0
AF1 | Green | 10
AF1 | Yellow | 12
AF1 | Red | 14
AF2 | Green | 18
AF2 | Yellow | 20
AF2 | Red | 22
AF3 | Green | 26
AF3 | Yellow | 28
AF3 | Red | 30
AF4 | Green | 34
AF4 | Yellow | 36
AF4 | Red | 38
EF | Green | 46
CS6 | Green | 48
CS7 | Green | 56
Table 6-4 Default mapping from the IP Precedence/MPLS EXP/802.1p to the service-class
and color
IP Precedence/MPLS EXP/802.1p | Service-class | Color
0 | BE | Green
1 | AF1 | Green
2 | AF2 | Green
3 | AF3 | Green
4 | AF4 | Green
5 | EF | Green
6 | CS6 | Green
7 | CS7 | Green
Table 6-5 Default mapping from the service-class and color to IP Precedence/MPLS EXP/
802.1p
Service-class | Color | IP Precedence/MPLS EXP/802.1p
BE | Green/Yellow/Red | 0
AF1 | Green/Yellow/Red | 1
AF2 | Green/Yellow/Red | 2
AF3 | Green/Yellow/Red | 3
AF4 | Green/Yellow/Red | 4
EF | Green/Yellow/Red | 5
CS6 | Green/Yellow/Red | 6
CS7 | Green/Yellow/Red | 7

Figure 6-6 PCP encoding and decoding defined in IEEE 802.1ad

Encoding (priority and drop_eligible to PCP):
Priority | 7 | 7DE | 6 | 6DE | 5 | 5DE | 4 | 4DE | 3 | 3DE | 2 | 2DE | 1 | 1DE | 0 | 0DE
8P0D (default) | 7 | 7 | 6 | 6 | 5 | 5 | 4 | 4 | 3 | 3 | 2 | 2 | 1 | 1 | 0 | 0
7P1D | 7 | 7 | 6 | 6 | 5 | 4 | 5 | 4 | 3 | 3 | 2 | 2 | 1 | 1 | 0 | 0
6P2D | 7 | 7 | 6 | 6 | 5 | 4 | 5 | 4 | 3 | 2 | 3 | 2 | 1 | 1 | 0 | 0
5P3D | 7 | 7 | 6 | 6 | 5 | 4 | 5 | 4 | 3 | 2 | 3 | 2 | 1 | 0 | 1 | 0

Decoding (PCP to priority and drop_eligible):
PCP | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
8P0D (default) | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
7P1D | 7 | 6 | 4 | 4DE | 3 | 2 | 1 | 0
As shown in Figure 6-6, a number from 0 to 7 indicates the 802.1p value. A value in the format of a number x followed by DE indicates that the 802.1p priority is x and the drop_eligible value is true. If the drop_eligible value is false, the drop precedence can be ignored. If the drop_eligible value is true, the drop precedence cannot be ignored.
The 5P3D domain on an NE40E uses an IEEE 802.1ad-compliant priority mapping table by default, as shown in Figure 6-6.
The default mappings between the 802.1p value, service-class, and color for the 5P3D domain on an NE40E are shown in Table 6-7 and Table 6-8.
Table 6-7 Mapping from the 802.1p value to the service-class and color

802.1p | Service-class | Color
0 | BE | Yellow
1 | BE | Green
2 | AF2 | Yellow
3 | AF2 | Green
4 | AF4 | Yellow
5 | AF4 | Green
6 | CS6 | Green
7 | CS7 | Green
NOTE
The mapping from the 802.1p value to the service-class may apply to an inbound interface that belongs to a non-5P3D domain, which is why Table 6-7 lists eight 802.1p values. The outbound interface belongs to a 5P3D domain, which is why Table 6-7 lists only five service-classes: BE, AF2, AF4, CS6, and CS7.
Table 6-8 Mapping from the service-class and color to the 802.1p value

Service-class | Color | 802.1p
BE | Green | 1
BE | Yellow | 0
BE | Red | 0
AF1 | Green | 1
AF1 | Yellow | 0
AF1 | Red | 0
AF2 | Green | 3
AF2 | Yellow | 2
AF2 | Red | 2
AF3 | Green | 3
AF3 | Yellow | 2
AF3 | Red | 2
AF4 | Green | 5
AF4 | Yellow | 4
AF4 | Red | 4
EF | Green | 5
EF | Yellow | 4
EF | Red | 4
CS6 | Green/Yellow/Red | 6
CS7 | Green/Yellow/Red | 7
NOTE
In Table 6-8, the mapping from the service-class and color to the 802.1p value may apply to an inbound interface that uses a 5P3D domain or uses the DSCP, EXP, or IP precedence as a basis for mapping, which is why eight service-classes are listed. The outbound interface may use a non-5P3D domain, which is why eight 802.1p values are listed.
Table 6-10 Mappings from service types to DSCP values (Figure 3 in RFC)

Service Type | DSCP Name | DSCP Value | Application Examples
Table 6-11 Traffic classification recommendations from the 3GPP (Table 6.1.7 in 3GPP
TS23.203)
QCI | Resource Type | Priority | Packet Delay | Packet Error Loss Rate | Typical Service
The 3GPP does not provide any recommendations on mappings between QCIs and DSCP
values. For Huawei's recommendations, see Table 6-12.
IP Clock | - | 0x2E (46) | 5 | EF
Table 6-13 Mappings between traffic types and DSCP values (Table 6 in GSMA IR34)
Table 6-14 Mappings between service applications and DSCP values (Table 7 in GSMA
IR34)
VoIP | EF | Conversational
Best Effort | 0 (default) | HTTP, IM, X11 | Default service types, requiring only best-effort service quality
The MEF 23.1 standard also provides recommended mappings between CoS labels and DSCP
values. For details, see Table 6-17 and Table 6-18.
VoIP | H
Videoconf data | M
IPTV data | M
IPTV control | M
Streaming media | L
Database client/server | L
Financial/Trading | H
CCTV | H
Telepresence | H
Circuit Emulation | H
Mobile BH (H) | H
Mobile BH (M) | M
Mobile BH (L) | L
Table 6-17 Color IDs when the CoS ID type is only EVC or OVC EP (Table 3 in MEF 23.1)
CoS Label | CoS ID (C-Tag PCP) | CoS ID (PHB/DSCP) | Color ID
M | 3, 2 | AF31(26), AF32(28), AF33(30) | 3, 2, 3
L | 1, 0 | AF11(10), AF12(12), AF13(14), DF(0) | 1, 0, 1
Table 6-19 IP network QoS type definitions and network performance counters (Table 1 in
ITU-T Y.1541)
Network Performance Parameter | Performance Target | Class 0 | Class 1 | Class 2 | Class 3 | Class 4 | Class 5 (Unspecified)
Figure 6-7 Multi-service classification based on a few QoS types (Figure 2 in ITU-T Y.1541)
MPLS DiffServ
On an MPLS network, EXP values are used to identify a maximum of eight service priorities.
If there are more than eight types of services, multiple types must be aggregated to one PHB.
Relevant standards reclassify services into four types and provide recommended DSCP and EXP values. For details, see Table 6-21.
Table 6-21 Treatment aggregate and MPLS EXP field usage (Figure 2 and Figure 3 in RFC)

Service Type | PHB | DSCP Binary (Decimal Notation)
Signaling | CS5 | 101000 (40)
Multimedia Conferencing | AF41 | 100010 (34)
Multimedia Conferencing | AF42 | 100100 (36)
Multimedia Conferencing | AF43 | 100110 (38)
Real-Time Interactive | CS4 | 100000 (32)
Broadcast Video | CS3 | 011000 (24)
Multimedia Streaming | AF33 | 011110 (30)
Low-Latency Data | AF21 | 010010 (18)
Low-Latency Data | AF22 | 010100 (20)
Low-Latency Data | AF23 | 010110 (22)
High-Throughput Data | AF11 | 001010 (10)
High-Throughput Data | AF12 | 001100 (12)
High-Throughput Data | AF13 | 001110 (14)
trust upstream: The board takes the BA action for inbound packets and the PHB action for outbound packets. Both the BA-symbol and the PHB-symbol are set to "Y".

diffserv-mode { pipe | short-pipe }: The board takes the BA action for inbound packets. The BA-symbol and the PHB-symbol remain unchanged.

diffserv-mode uniform: This command is the default configuration and does not affect the actions of the inbound and outbound boards. Both the BA-symbol and the PHB-symbol remain unchanged.

remark (inbound): The inbound board takes the BA action and re-marks the inbound packets, regardless of the BA-symbol and the PHB-symbol. The BA-symbol is set to "Y", and the PHB-symbol is not affected.

remark (outbound): The outbound board re-marks the outbound packets, regardless of the BA-symbol and the PHB-symbol.
For example, assume that the remark dscp 11 command is configured on the outbound interface and the service-class and color of a packet are <ef, green>. The DSCP of the packet is set to 11 directly, rather than to the value mapped from <ef, green> based on the downstream PHB mapping table. If the outbound packet has a VLAN tag, the 802.1p value of the VLAN tag is still set based on <ef, green> and the downstream PHB mapping table. If both the remark dscp 11 and remark 8021p commands are configured on the outbound interface, both the DSCP and the 802.1p value of the packet are modified directly according to the remark commands.

qos car { green | yellow | red } pass service-class color: The service-class and color of the packet are reset. Both the BA-symbol and the PHB-symbol remain unchanged.
- To set the BA-symbol to "N", configure the service-class class-value color color-value no-remark command, or do not configure any of the four commands stated above on the inbound interface.
- To set the PHB-symbol to "Y", configure the trust upstream or qos phb enable command on the outbound interface.
- To set the PHB-symbol to "N", configure the qos phb disable command, or do not configure the trust upstream command on the outbound interface.
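The following minimal sketch shows these settings together (interface numbers are examples, and command formats may vary by version):

interface GigabitEthernet1/0/0
 trust upstream default    //Inbound BA action: the BA-symbol is set to "Y".
#
interface GigabitEthernet2/0/0
 trust upstream default
 qos phb enable            //Outbound PHB action: the PHB-symbol is set to "Y".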
BA-symbol | PHB-symbol | PHB Action Taken on the Outbound Board
N | N | No
Y | N | No
Y | Y | Yes
trust upstream | trust 8021p | Packet Type | Trusted Field and Mapping
No | No | Any type | No field is trusted, and the packet is mapped to <BE, Green>.
No | Yes | Any type | No field is trusted, and the packet is mapped to <BE, Green>.
Yes | - | IPoPPP, IPoHDLC, IPoFR | DSCP
- | Yes | Non-VLAN and non-QinQ | No field is trusted, and the packet is mapped to <BE, Green>.
NOTE
"Other types" here indicates neither IP nor MPLS packets when the outer L2 header is removed.
Table 6-26 Rules for marking the 802.1p field of a newly added VLAN tag

PHB-symbol | Rule for marking the 802.1p field of the newly added VLAN tag
Y | The field is marked according to the <service-class, color> of the packet and the downstream priority mapping table.
N | The field is marked 0.
Table 6-27 Rules for marking the EXP field of a newly added MPLS header

PHB-symbol | Rule for marking the EXP field of the newly added MPLS header
Y | Both the inner and outer EXP are marked according to the <service-class, color> of the packet and the downstream priority mapping table.
N | The inner EXP of L3VPN or VLL is marked according to the <service-class, color> of the packet and the downstream priority mapping table.
The outer EXP is marked according to the service-class of the packet:
- if service-class = BE, then EXP = 0
- if service-class = AF1, then EXP = 1
- if service-class = AF2, then EXP = 2
- if service-class = AF3, then EXP = 3
- if service-class = AF4, then EXP = 4
- if service-class = EF, then EXP = 5
- if service-class = CS6, then EXP = 6
- if service-class = CS7, then EXP = 7
The inner EXP of VPLS is marked in the same way as the outer EXP.
6.4 MF Classification
MF classification can be based on the following items:
- 802.1p value in the inner VLAN tag
- Source MAC address
- Destination MAC address
- Protocol field encapsulated in Layer 2 headers
- Source IPv4 address (NOTE: The IP address pool is also supported.)
- Destination IPv4 address (NOTE: The IP address pool is also supported.)
- IPv4 fragments
- TCP/UDP source port number
- TCP/UDP destination port number
- Protocol number
- TCP synchronization flag
- Protocol number
- Source IPv6 address (NOTE: The IP address pool is also supported.)
- Destination IPv6 address (NOTE: The IP address pool is also supported.)
- TCP/UDP source port number
- TCP/UDP destination port number
- Source IPv4/IPv6 address
- Destination IPv4/IPv6 address
- IPv4 fragments
- TCP/UDP source port number
- TCP/UDP destination port number
- Protocol number
- TCP synchronization flag
- User-group
NOTE
In addition to the preceding items, an NE40E can perform MF classification based on VLAN IDs, but does not use the VLAN ID alone. Instead, the MF classification policy is bound to a VLAN ID (in the same way as it is bound to an interface). The MF classification modes shown in Table 1-1 support MF classification based on VLAN IDs.
In addition, an NE40E supports MF classification based on time periods for traffic control. This allows carriers to configure a policy for each time period so that network resources are optimized. For example, analysis of subscribers' usage habits shows that network traffic peaks from 20:00 to 22:00, during which large volumes of P2P and download services affect the normal use of other data services. Carriers can lower the bandwidths for P2P and download services during this time period to prevent network congestion.
Configuration example:
time-range test 20:00 to 22:00 daily
acl 2000
rule permit source 10.9.0.0 0.0.255.255 time-range test //Specify the time range during which the rule takes effect.
traffic classifier test
if-match acl 2000
interface xxx
traffic-policy test inbound
Figure 6-8 Relationships between an interface, traffic policy, traffic behavior, traffic
classifier, and ACL.
(Figure: an interface (port) references a traffic policy; the policy references classifier-behavior pairs; each classifier can reference ACLs, and each ACL contains rules)
(2) One or more classifier and behavior pairs can be configured in a traffic policy. One
classifier and behavior pair can be configured in different traffic policies.
(3) One or more if-match clauses can be configured for a traffic classifier, and each if-match
clause can specify an ACL. An ACL can be applied to different traffic classifiers and contains
one or more rules.
The if-match clauses in a traffic classifier are combined using the And or Or logic:
- And: Packets that match all the if-match clauses configured in a traffic classifier belong to this traffic classifier.
- Or: Packets that match any of the if-match clauses configured in a traffic classifier belong to this traffic classifier.
NOTE
If several ACL rules and if-match clauses are configured in a traffic classifier, the And logic requires that a packet match all the if-match clauses; matching any one rule in a referenced ACL satisfies that ACL's if-match clause.
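For example, the following sketch (classifier names are illustrative) shows both logics:

traffic classifier c-and operator and   //A packet must match ALL if-match clauses.
 if-match dscp ef
 if-match acl 2000
traffic classifier c-or operator or     //A packet matching ANY if-match clause matches the classifier.
 if-match dscp ef
 if-match acl 2000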
- If the traffic policy is in unshared mode, the two interfaces to which the traffic policy applies are restricted individually. On each interface, the bandwidths of TCP traffic, UDP traffic, and other traffic are restricted to 100 Mbit/s, 200 Mbit/s, and 300 Mbit/s, respectively.
- If the traffic policy is in shared mode, the two interfaces to which the traffic policy applies are restricted as a whole. The total bandwidths of TCP traffic, UDP traffic, and other traffic on the two interfaces are restricted to 100 Mbit/s, 200 Mbit/s, and 300 Mbit/s, respectively.
NOTICE
If a traffic policy works in shared mode, the interfaces to which the traffic policy is applied must be on the same network processor of the same board.
[Flowchart: the system gets the first traffic classifier and checks whether its if-match clauses are valid one by one; for an if-match clause that specifies an ACL, the rules are checked in order until a matching rule returns permit or deny. If no match is found, the next traffic classifier is checked.]
As shown in the figure, a packet is matched against traffic classifiers in the order in which
those classifiers are configured. If the packet matches a traffic classifier, no further match
operation is performed. If not, the packet is matched against the following traffic classifiers
one by one. If the packet matches no traffic classifier at all, the packet is forwarded with no
traffic policy executed.
If multiple if-match clauses are configured for a traffic classifier, the packet is matched
against them in the order in which they are configured. If an ACL or UCL is specified in an
if-match clause, the packet is matched against the multiple rules in the ACL or UCL. The
system first checks whether the ACL or UCL exists. (A non-existent ACL or UCL can be
applied to a traffic classifier.) If the packet matches a rule in the ACL or UCL, no further
match operation is performed.
A permit or deny action can be specified in an ACL for a traffic classifier to work with
specific traffic behaviors as follows:
l If the deny action is specified in an ACL, the packet that matches the ACL is denied,
regardless of what the traffic behavior defines.
l If the permit action is specified in an ACL, the traffic behavior applies to the packet that
matches the ACL.
NOTE
For traffic behavior mirroring or sampling, even if a packet matches a rule that defines a deny action, the
traffic behavior takes effect for the packet.
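The first-match logic and permit/deny handling described above can be modeled with a short Python sketch (illustrative names only, not NE40E code; each classifier here uses the And logic, and an ACL is an ordered list of rules):

def match_acl(acl_rules, packet):
    # Return "permit", "deny", or None; the first matching rule wins.
    for predicate, action in acl_rules:
        if predicate(packet):
            return action
    return None

def classify(classifiers, packet):
    # Classifiers are checked in configuration order; the first match wins.
    for name, clauses, behavior in classifiers:
        verdicts = []
        for clause in clauses:
            if isinstance(clause, list):          # clause specifies an ACL
                action = match_acl(clause, packet)
                if action == "deny":
                    return name, "discard"        # deny overrides the behavior
                verdicts.append(action == "permit")
            else:                                 # simple if-match predicate
                verdicts.append(clause(packet))
        if verdicts and all(verdicts):
            return name, behavior
    return None, None    # no classifier matched: forward with no policy executed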
One traffic policy (parent policy) can cascade over multiple traffic policies (child policies), and one traffic policy (child policy) can be cascaded by multiple traffic policies (parent policies). However, traffic policies cannot be cascaded circularly or nested further.
When a two-level traffic policy instance is formed and the action of the traffic behavior in the
parent policy is the same as that of the traffic behavior in the child policy, the action of the
traffic behavior in the child policy is implemented.
NOTE
The same action configuration refers to the same action type. Even if the parameters are different, the
actions of the same type are considered the same action configuration. In this case, the action of the
traffic behavior in the child policy is implemented.
When the traffic behaviors for the parent and child policies are both service-class, service-class in the
parent policy preferentially takes effect. However, if service-class in the parent policy carries no-
remark, service-class in the child policy preferentially takes effect.
Hierarchical CAR
When a two-level traffic policy instance is created and the actions of the traffic behaviors in
both policies are CAR, both CAR configurations take effect. In addition, the CAR of the child
policy is implemented before the CAR of the parent policy. This is hierarchical CAR.
For example, the overall rate of the 1.1.1/24 network segment is set to 5 Mbit/s, and the rates
of the IP addresses 1.1.1.1/32 and 1.1.1.2/32 on the 1.1.1/24 network segment need to be
separately restricted and are set to 1 Mbit/s and 3 Mbit/s, respectively.
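As a rough illustration of the evaluation order (hypothetical Python helper only; rates are in bit/s, and a real device meters rates with token buckets rather than instantaneous values):

HOST_CIR = {"1.1.1.1": 1_000_000, "1.1.1.2": 3_000_000}   # child CAR per host
SEGMENT_CIR = 5_000_000                                    # parent CAR for 1.1.1.0/24

def hierarchical_car(src_ip, host_rate_bps, segment_rate_bps):
    # The child (per-host) CAR is checked before the parent (per-segment) CAR.
    if host_rate_bps > HOST_CIR.get(src_ip, float("inf")):
        return "drop"
    if segment_rate_bps > SEGMENT_CIR:
        return "drop"
    return "forward"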
6.4.3 QPPB
QoS Policy Propagation on BGP (QPPB) is a special Multi-Field (MF) classification
application.
Background
The following example uses the network shown in Figure 6-10 to illustrate how QPPB is
introduced. In this networking, the AS 400 is a high priority network. All packets transmitted
across AS 400 must be re-marked with an IP precedence for preferential transmission. To
meet such requirements, edge nodes (Node-A, Node-B, and Node-C) in AS 100 must be
configured to re-mark the IP precedence of packets destined for or sent from AS 400. The
edge interface connecting to AS 400 on Node-C must be configured to re-mark packets.
Node-A or Node-B must be configured to perform traffic classification for packets destined for an IP address in AS 400. If a large number of IP addresses or address segments are configured in AS 400, Node-A and Node-B must perform an excessive number of traffic classification operations. In addition, if the network topology is prone to changes, a large number of configuration modifications are required.
Figure 6-10 QPPB networking (Node-A, Node-B, and Node-C are edge nodes in AS 100, which connects to AS 200, AS 300, and AS 400)
To simplify configuration on Node-A and Node-B, QPPB is introduced. QPPB allows packets
to be classified based on AS information or community attributes.
QPPB, as the name implies, applies QoS policies using Border Gateway Protocol (BGP). The primary advantage of QPPB is that BGP route attributes can be set for traffic classification by the route sender, while the route receiver only needs to configure an appropriate route receiving policy. The route receiver sets QoS parameters for packets matching the BGP route attributes and then implements corresponding traffic behaviors before data forwarding. When the network topology changes, the BGP route receiver does not need to modify local configurations as long as the route attributes of the advertised BGP routes do not change.
Implementation
As shown in Figure 6-11, Node-A and Node-C are IBGP peers in AS 100. Node-A is
configured to re-mark IP precedence for packets destined for or sent from AS 400. The QPPB
implementation is as follows:
Figure 6-11 QPPB implementation (1. Node-C configures BGP attributes for AS 400 routes; 2. the routes advertised to BGP peers carry the BGP attributes; 3. if received routes match the BGP attributes, Node-A sets behavior IDs in the FIB; 4. Node-A performs the traffic behavior associated with the behavior ID)
1. The BGP route sender (Node-C) sets specific attributes for BGP routes (such as the
AS_Path, community attributes, and extended community attributes).
2. Node-C advertises these BGP routes.
3. The BGP route receiver (Node-A) presets attribute entries. After receiving BGP routes
matching the attribute entries, the BGP route receiver sets a behavior ID identifying a
traffic behavior in the forwarding information base (FIB) table.
4. Before transmitting packets, Node-A obtains the behavior IDs of the routes from the FIB
for these packets and performs the corresponding traffic behaviors for these packets.
The preceding process demonstrates that QPPB does not transmit the QoS policy along with
the BGP route information. The route sender sets route attributes for routes to be advertised,
and the route receiver sets the QoS policy based on the route attributes of the destination
network segment.
Figure 6-12 Inter-AS packet classification using QPPB (Node-A, Node-B, and Node-C in AS 100 connect AS 200, AS 300, and AS 400; the route sender sets community attributes, and the route receiver classifies traffic using if-match community and applies CAR)
As shown in Figure 6-12, QPPB allows the edge devices in AS 100 to classify inter-AS
packets. For example, to configure rate limit on Node-C for packets transmitted between AS
200 and AS 400, perform the following operations:
l For packets from AS 200 to AS 400, apply source address-based QPPB on all Node-C's
interfaces that belong to AS 100.
l For packets from AS 400 to AS 200, apply destination address-based QPPB on the
Node-C's interface connecting to AS 400.
NOTICE
FIB-based packet forwarding applies to upstream traffic but not downstream traffic.
Therefore, QPPB is enabled on the upstream interface of traffic.
Figure 6-13 QPPB application in a BGP/MPLS VPN (CEs in VPN2 connect to PEs across the MPLS backbone)
As shown in Figure 6-13, PEs connect to multiple VPNs. A PE can set route attributes, such
as community, for a specified VPN instance before advertising any route. After receiving the
routing information, the remote peer imports the route and the associated QoS parameters to
the FIB table. This enables the traffic from CEs to be forwarded based on the corresponding
traffic behaviors. In this manner, different VPNs can be provided with different QoS
guarantees.
Figure 6-14 QPPB for user-to-ISP traffic accounting
As shown in Figure 6-14, QPPB is implemented as follows for user-to-ISP traffic accounting:
l BGP routes are advertised with community attributes.
l BGP routes are imported and the community attributes of the BGP routes are matched
against attribute entries. Behavior IDs are set in the FIB table for the routes matching the
attribute entries.
l A QPPB policy is configured. A corresponding traffic behavior (such as statistics
collection, CAR, and re-marking) is configured for the qos-local-id (Behavior ID).
l Destination address-based QPPB is enabled for incoming traffic.
l The QPPB policy is applied to incoming traffic on the user-side interface.
l During packet forwarding, the Behavior ID (qos-local-id) is obtained for packets based
on the destination IP address, and the corresponding traffic behavior is performed.
Figure 6-15 QPPB for ISP-to-user traffic accounting (user A connects through the local ISP at 1.1.1.11 with BGP community attribute 100:11; the inter-nation gateway is at 1.1.1.10; the ISP of nation X at 1.1.1.12 uses community attribute 100:12, and the ISP of nation Y at 1.1.1.13 uses 100:13; source address-based QPPB is enabled on the interfaces, and a QPPB policy is applied)
The destination-to-behavior-ID mapping is as follows:
Destination: Local ISP — Behavior ID: 10
Destination: Domestic inter-domain ISP — Behavior ID: 11
Destination: ISP of nation X — Behavior ID: 12
Destination: ISP of nation Y — Behavior ID: 13
As shown in Figure 6-15, QPPB is implemented as follows for ISP-to-user traffic accounting:
l BGP routes are advertised with community attributes.
l BGP routes are imported and the community attributes of the BGP routes are matched
against attribute entries. Behavior IDs are set in the FIB table for the routes matching the
attribute entries.
l A QPPB policy is configured. A corresponding traffic behavior (such as statistics
collecting, CAR, and re-marking) is configured for the qos-local-id (Behavior ID).
l Source address-based QPPB is enabled for incoming traffic.
l The QPPB policy is applied to outgoing traffic on the user-side interface.
l During packet forwarding, the Behavior ID (qos-local-id) is obtained for packets based
on the source IP address, and the corresponding traffic behavior is performed.
qppb local-policy policyA
qos-local-id 10 behavior b10
qos-local-id 11 behavior b11
qos-local-id 12 behavior b12
qos-local-id 13 behavior b13
7.1.1 Overview
Traffic policing controls the rate of incoming packets to ensure that network resources are properly allocated. If the traffic rate of a connection exceeds the specifications on an interface, traffic policing allows the interface to drop excess packets or re-mark the packet priority to maximize network resource usage and protect carriers' profits. An example of this process is restricting the rate of HTTP packets to 50% of the network bandwidth.
Traffic policing implements the QoS requirements defined in the service level agreement
(SLA). The SLA contains parameters, such as the Committed Information Rate (CIR), Peak
Information Rate (PIR), Committed Burst Size (CBS), and Peak Burst Size (PBS) to monitor
and control incoming traffic. The device performs Pass, Drop, or Markdown actions for the
traffic exceeding the specified limit. Markdown means that packets are marked with a lower
service class or a higher drop precedence so that these packets are preferentially dropped
when traffic congestion occurs. This measure ensures that the packets conforming to the SLA
can have the services specified in the SLA.
Traffic policing uses committed access rate (CAR) to control traffic. CAR uses token buckets
to meter the traffic rate. Then preset actions are implemented based on the metering result.
These actions include:
l Pass: forwards the packets conforming to the SLA.
l Discard: drops the packets exceeding the specified limit.
l Re-mark: re-marks the packets whose traffic rate is between the CIR and PIR with a
lower priority and allows these packets to be forwarded.
[Figure: tokens are added to the bucket at a set rate; tokens exceeding the token bucket capability (depth) overflow and are dropped.]
NOTE
A token bucket measures traffic but does not filter packets or perform any action, such as dropping
packets.
As shown in Figure 7-2, when a packet arrives, the device attempts to obtain enough tokens from the token bucket to transmit the packet. If the token bucket does not have enough tokens, the packet either waits for enough tokens or is discarded. This limits packets to being sent at a rate less than or equal to the rate at which tokens are generated.
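The following Python sketch (illustrative, not device code) models this behavior: tokens accumulate at a fixed rate up to the bucket depth, and a packet is sent only when enough tokens are available:

import time

class TokenBucket:
    """Tokens accumulate at `rate` bytes/s up to `depth` bytes; overflow is dropped."""
    def __init__(self, rate_bytes_per_s, depth_bytes):
        self.rate = rate_bytes_per_s
        self.depth = depth_bytes
        self.tokens = depth_bytes            # the bucket starts full
        self.last = time.monotonic()

    def try_send(self, pkt_len_bytes):
        now = time.monotonic()
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= pkt_len_bytes:
            self.tokens -= pkt_len_bytes
            return True                      # enough tokens: transmit the packet
        return False                         # not enough tokens: wait or discard

bucket = TokenBucket(rate_bytes_per_s=125_000, depth_bytes=2000)  # 1 Mbit/s, depth 2000 bytes
print(bucket.try_send(1500))   # True: the bucket starts full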
The token bucket mechanism widely applies to QoS technologies, such as the committed
access rate (CAR), traffic shaping, and Line Rate (LR).
NOTE
This section only describes how to meter and mark packets using token buckets.
[Figure: srTCM — tokens are added at the CIR to bucket C (depth CBS); tokens overflowing bucket C are added to bucket E (depth EBS).]
The srTCM uses two token buckets, C and E, which both share the common rate CIR. The
maximum size of bucket C is the CBS, and the maximum size of bucket E is the EBS.
When the EBS is 0, no tokens are added to bucket E, so only bucket C is used for srTCM. When only bucket C is used, packets are marked either green or red. When the EBS is not 0, both token buckets are used, and packets are marked green, yellow, or red.
l CIR: the rate at which tokens are put into a token bucket. The CIR is expressed in bit/s.
l CBS: the committed volume of traffic that an interface allows to pass through, also the
depth of a token bucket. The CBS is expressed in bytes. The CBS must be greater than or
equal to the size of the largest possible packet entering a device.
l PIR: the maximum rate at which an interface allows packets to pass and is expressed in
bit/s. The PIR must be greater than or equal to the CIR.
l PBS: the maximum volume of traffic that an interface allows to pass through in a traffic
burst.
[Figure: trTCM — tokens are added to bucket P at the PIR (depth PBS) and to bucket C at the CIR (depth CBS); overflow tokens are dropped.]
Tc and Tp refer to the numbers of tokens in buckets C and P, respectively. The initial values of
Tc and Tp are respectively the CBS and PBS.
In Color-Blind mode, the following rules apply when a packet of size B arrives at time t:
l If Tp(t) – B < 0, the packet is marked red, and the Tc and Tp values remain unchanged.
l If Tp(t) – B ≥ 0 but Tc(t) – B < 0, the packet is marked yellow, and Tp is decremented
by B.
l If Tc(t) – B ≥ 0, the packet is marked green and both Tp and Tc are decremented by B.
In Color-Aware mode, the following rules apply when a packet of size B arrives at time t:
l If the packet has been pre-colored as green, and Tp(t) – B < 0, the packet is re-marked
red, and neither Tp nor Tc is decremented.
l If the packet has been pre-colored as green and Tp(t) – B ≥ 0 but Tc(t) – B < 0, the
packet is re-marked yellow, and Tp is decremented by B, and Tc remains unchanged.
l If the packet has been pre-colored as green and Tc(t) – B ≥ 0, the packet is re-marked
green, and both Tp and Tc are decremented by B.
l If the packet has been pre-colored as yellow and Tp(t) – B < 0, the packet is re-marked
red, and neither Tp nor Tc is decremented.
l If the packet has been pre-colored as yellow and Tp(t) – B ≥ 0, the packet is re-marked
yellow, and Tp is decremented by B and Tc remains unchanged.
l If the packet has been pre-colored as red, the packet is re-marked red regardless of what
the packet length is. The Tp and Tc values remain unchanged.
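These Color-Aware rules can be condensed into a small Python function (a sketch only; all names are illustrative, and token replenishment at the CIR and PIR is omitted):

def trtcm_color_aware(pre_color, b, tp, tc):
    """Re-mark a pre-colored packet of b bytes; tp/tc are the tokens (bytes) in buckets P and C."""
    if pre_color == "red" or tp - b < 0:
        return "red", tp, tc                 # neither Tp nor Tc is decremented
    if pre_color == "yellow" or tc - b < 0:
        return "yellow", tp - b, tc          # only Tp is decremented
    return "green", tp - b, tc - b           # both Tp and Tc are decremented

print(trtcm_color_aware("green", 1500, 2000, 1000))   # -> ('yellow', 500, 1000)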
7.1.3 CAR
What Is CAR
In traffic policing, committed access rate (CAR) is used to control traffic. CAR uses token
buckets to measure traffic and determines whether a packet is conforming to the specification.
CAR has the following two functions:
l Rate limit: Only packets allocated enough tokens are allowed to pass so that the traffic
rate is restricted.
l Traffic classification: Packets are marked internal priorities, such as the scheduling
precedence and drop precedence, based on the measurement performed by token
buckets.
CAR Process
l When a packet arrives, the device matches the packet against matching rules. If the
packet matches a rule, the router uses token buckets to meter the traffic rate.
l The router marks the packet red, yellow, or green based on the metering result. Red
indicates that the traffic rate exceeds the specifications. Yellow indicates that the traffic
rate exceeds the specifications but is within an allowed range. Green indicates that the
traffic rate is conforming to the specifications.
l The device drops packets marked red, re-marks and forwards packets marked yellow,
and forwards packets marked green.
CAR supports srTCM with single bucket, srTCM with two buckets, and trTCM. This section
provides examples of the three marking methods in Color-Blind mode. The implementation in
Color-Aware mode is similar to that in Color-Blind mode.
l TrTCM
This example uses a CIR of 1 Mbit/s, a PIR of 2 Mbit/s, and a CBS and PBS both of 2000
bytes. Buckets C and P are initially full of tokens.
– If the first packet arriving at the interface is 1500 bytes long, the packet is marked
green because the number of tokens in both buckets P and C is greater than the
packet length. Then the number of tokens in both buckets P and C decreases by
1500 bytes, with 500 bytes remaining.
– Assume that the second packet arriving at the interface after a delay of 1 ms is 1500
bytes long. Additional 250-byte tokens are put into bucket P (PIR x time period = 2
Mbit/s x 1 ms = 2000 bits = 250 bytes) and 125-byte tokens are put into bucket C
(CIR x time period = 1 Mbit/s x 1 ms = 1000 bits = 125 bytes). Bucket P now has
750-byte tokens, which are not enough for the 1500-byte second packet. Therefore,
the second packet is marked red, and the number of tokens in buckets P and C
remain unchanged.
– Assume that the third packet arriving at the interface after a delay of 1 ms is 1000
bytes long. Additional 250-byte tokens are put into bucket P (PIR x time period = 2
Mbit/s x 1 ms = 2000 bits = 250 bytes) and 125-byte tokens are put into bucket C
(CIR x time period = 1 Mbit/s x 1 ms = 1000 bits = 125 bytes). Bucket P now has
1000-byte tokens, which equals the third packet length. Bucket C has only 750-byte
tokens, which are not enough for the 1000-byte third packet. Therefore, the third
packet is marked yellow. The number of tokens in bucket P decreases by 1000
bytes, with 0 bytes remaining. The number of tokens in bucket C remains
unchanged.
– Assume that the fourth packet arriving at the interface after a delay of 20 ms is 1500
bytes long. Additional 5000-byte tokens are put into bucket P (PIR x time period =
2 Mbit/s x 20 ms = 40000 bits = 5000 bytes), but excess tokens over the PBS (2000
bytes) are dropped. Bucket P has 2000-byte tokens, which are enough for the 1500-
byte fourth packet. Bucket C has 750-byte tokens left, and additional 2500-byte
tokens are put into bucket C (CIR x time period = 1 Mbit/s x 20 ms = 20000 bits =
2500 bytes). This time 3250-byte tokens are destined for bucket C, but excess tokens
over the CBS (2000 bytes) are dropped. Bucket C then has 2000-byte tokens, which
are enough for the 1500-byte fourth packet. Therefore, the fourth packet is marked
green. The number of tokens in both buckets P and C decreases by 1500 bytes, with
500 bytes remaining.
packet length always exceeds the CBS, causing the packets to be marked red or yellow
even if the traffic rate is lower than 100 Mbit/s. This leads to an inaccurate CAR
implementation.
The Bucket depth (CBS, EBS or PBS) is set based on actual rate limit requirements. In
principle, the bucket depth is calculated based on the following conditions:
1. Bucket depth must be greater than or equal to the MTU.
2. Bucket depth must be greater than or equal to the allowed burst traffic volume.
Condition 1 is easy to meet. Condition 2 is difficult to evaluate directly, so the following formula is introduced:
Bucket depth (bytes) = Bandwidth (kbit/s) x RTT (ms)/8, where RTT refers to the round trip time and is set to 200 ms.
The following formulas are used for NE40Es:
l When the bandwidth is lower than or equal to 100 Mbit/s: Bucket depth (bytes) =
Bandwidth (kbit/s) x 1500 (ms)/8.
l When the bandwidth is higher than 100 Mbit/s: Bucket depth (bytes) = 100,000 (kbit/s) x
1500 (ms)/8.
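For illustration, the two formulas can be combined into one hypothetical Python helper:

def bucket_depth_bytes(bandwidth_kbps):
    """Bucket depth per the NE40E formulas above: the bandwidth term is capped at 100 Mbit/s."""
    return min(bandwidth_kbps, 100_000) * 1500 / 8

print(bucket_depth_bytes(2_000))    # 2 Mbit/s   -> 375,000 bytes
print(bucket_depth_bytes(400_000))  # 400 Mbit/s -> 18,750,000 bytes (capped)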
NOTICE
CAR calculates the bandwidth of packets based on the entire frame. For example, CAR counts the lengths of the frame header and CRC field, but not the preamble, inter-frame gap (IFG), or SFD of an Ethernet frame, in the bandwidth. A complete Ethernet frame consists of the following fields (lengths in bytes):
IFG (minimum 12), preamble (7), SFD (1), destination MAC address (6), source MAC address (6), type/length (2), payload (46 to 1500), and CRC (4).
As shown in Figure 7-7, a NE40E connects a wide area network (WAN) and a local area
network (LAN). The LAN bandwidth (100 Mbit/s) is higher than the WAN bandwidth (2
Mbit/s). When a LAN user attempts to send a large amount of data to a WAN, the NE40E at
the network edge is prone to traffic congestion. Traffic policing can be configured on the
NE40E at the network edge to restrict the traffic rate, preventing traffic congestion.
[Figures: a LAN (high-speed link) connects to a WAN (low-speed link) through an interface with a bandwidth of 2 Mbit/s; a user network accesses the network at an SLA rate of 256 kbit/s.]
As shown in Figure 7-9, traffic from the three users at 1.1.1.1, 1.1.1.2, and 1.1.1.3 converges on a NE40E. The SLA defines that each user can send traffic at a maximum rate of
256 kbit/s. However, burst traffic is sometimes transmitted. When a user sends a large amount
of data, services of other users may be affected even if they send traffic at a rate lower than
256 kbit/s. To resolve this problem, configure traffic classification and traffic policing based
on source IP addresses on the inbound interface of the device to control the rate of traffic sent
from different users. The device drops excess traffic when the traffic rate of a certain user
exceeds 256 kbit/s.
[Figure 7-9: users 1.1.1.1, 1.1.1.2, and 1.1.1.3 connect to the device, where traffic policing is applied before traffic enters the IP backbone and the Internet.]
NOTE
Multiple traffic policies must be configured on the inbound interface to implement different rate limits for data flows sent from different source hosts. The traffic policies take effect in the configuration order: the traffic policy configured first is the first to take effect after data traffic reaches the interface.
Figure 7-10 shows how traffic policing works with congestion avoidance to control traffic. In
this networking, four user networks connect to a NE40E at the ISP network edge. The SLA
defines that each user can send FTP traffic at a maximum rate of 256 kbit/s. However, burst
traffic is sometimes transmitted at a rate even higher than 1 Mbit/s. When a user sends a large
amount of FTP data, FTP services of other users may be affected even if they send traffic at a
rate lower than 256 kbit/s. To resolve this problem, configure class-based traffic policing on
each inbound interface of the NE40E to monitor the FTP traffic and re-mark the DSCP values
of packets. The traffic at a rate lower than or equal to 256 kbit/s is re-marked AF11. The
traffic at a rate ranging from 256 kbit/s to 1 Mbit/s is re-marked AF12. The traffic at a rate
higher than 1 Mbit/s is re-marked AF13. Weighted Random Early Detection (WRED) is
configured as a drop policy for these types of traffic on outbound interfaces to prevent traffic
congestion. WRED drops packets based on the DSCP values. Packets in AF13 are first
dropped, and then AF12 and AF11 in sequence.
[Figure 7-10: multiple user networks connect to the device at the ISP network edge, where interface-based traffic policing is applied before traffic enters the Internet.]
l For upstream traffic, only statistics about packets that have undergone a CAR operation can be collected. Statistics about the actual offered traffic and the packets dropped during CAR are not provided.
l For downstream traffic, only statistics about packets that have undergone a CAR operation can be collected. Statistics about the forwarded and dropped packets are not provided.
Carriers require statistics about traffic that has undergone CAR to analyze user traffic beyond the specifications, which provides a basis for persuading users to purchase higher bandwidth. Using the interface-based CAR statistics collection function, NE40Es can collect and record statistics about the upstream traffic before a CAR operation (the actual access traffic of an enterprise user or an Internet bar), as well as statistics about the forwarded and dropped downstream packets after a CAR operation.
Figure 7-11 Data transmission from the high-speed link to the low-speed link (traffic shaping is performed between a 1 Gbit/s LAN and a 2 Mbit/s WAN)
As shown in Figure 7-12, traffic shaping can be configured on the outbound interface of an upstream device to smooth irregular traffic so that it is transmitted at an even rate, preventing traffic congestion on the downstream device.
[Figure 7-12: shaped traffic is smoothed so that its rate does not exceed the CIR over time.]
On the router, tokens for traffic shaping are added at an interval calculated using the formula CBS/CIR, and the quantity of tokens added each time equals the CBS.
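For example, with assumed values of CIR = 1 Mbit/s and CBS = 2000 bytes, tokens are added every (2000 x 8)/1,000,000 = 16 ms, and 2000 bytes' worth of tokens are added each time.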
NOTE
On the router, the lengths of the frame header and CRC field are counted in the bandwidth for packets to which CAR applies, but are not counted in the bandwidth for packets that have undergone traffic shaping. For example, if the traffic shaping value is set to 23 Mbit/s for IPoE packets, the IP packets are transmitted at a rate of 23 Mbit/s with the lengths of the frame header and CRC field not counted.
In addition, whether the CBS can be modified in traffic shaping is determined by the product model, product version, and board type.
Traffic shaping is implemented for packets that have undergone queue scheduling and are leaving the queues. For details about queues and queue scheduling, see Congestion Management and Avoidance.
There are two traffic shaping modes: queue-based traffic shaping and interface-based traffic
shaping.
l Queue-based traffic shaping applies to each queue on an outbound interface.
– When packets have undergone queue scheduling and are leaving queues, the packets that do not need traffic shaping are forwarded, and the packets that need traffic shaping are measured against token buckets.
– After the measurement, if packets in a queue are transmitted at a rate conforming to the specifications, the packets in the queue are marked green and forwarded. If packets in a queue are transmitted at a rate exceeding the specifications, the packet that is leaving the queue is forwarded, but the queue is marked unscheduled and can be scheduled again only after new tokens are added to the token bucket. While the queue is marked unscheduled, more packets can still be put into the queue, but excess packets over the queue capacity are dropped. Therefore, traffic shaping allows traffic to be sent at an even rate but does not provide a zero-packet-loss guarantee.
– Figure 7-13 Queue-based traffic shaping (after scheduling, packets leaving a queue that require traffic shaping are measured against the token bucket; complying packets are forwarded)
NOTE
Table 7-1 illustrates this process with the following columns: No., time, packet length, tokens in bucket C before and after packet processing, processing result, and queue status before and after packet processing.
l Interface-based traffic shaping, also called line rate (LR), is used to restrict the rate at
which all packets (including burst packets) are transmitted. Interface-based traffic
shaping takes effect on the entire outbound interface, regardless of packet priorities.
Figure 7-14 shows how interface-based traffic shaping is implemented:
– When packets have undergone queue scheduling and are leaving queues, all queues are measured together against the token bucket.
– After the measurement, if the total packet rate conforms to the specifications, the packets are forwarded. If the packet rate on the interface exceeds the specification, the interface stops packet scheduling and resumes scheduling when enough tokens are available.
[Figure 7-14: packets leaving queues 1 to N are scheduled and measured together against a token bucket; complying packets are forwarded. If the packet rate on the interface exceeds the specification, the interface stops packet scheduling and resumes scheduling when enough tokens are available.]
NOTE
The principle of traffic shaping on an interface is the same as that of traffic shaping for queues and is
not described here.
[Figures: in a hub-spoke network, spoke branches connect to the ISP network at 1 Gbit/s, and the hub (headquarters) connects at 1 Gbit/s (or at 100 Mbit/s in the second scenario), making the hub link a congestion point.]
[Figure: last-mile access — STBs, PCs, and IP phones in residential/enterprise networks connect through a home gateway (HG) or modem and an Ethernet DSLAM to the BRAS/SR on the IP MAN backbone network using IPoE/PPPoE; on the last mile, port scheduling applies a shaping rate and share-shaping to the ef (VoIP), af2 (IPTV multicast), and af3 (HSI) queues.]
Even if the link connecting the user and the DSLAM is also an Ethernet link, the encapsulation overhead of the packets sent between the user and the DSLAM can exceed that on the user side of the BRAS or SR. For example, the Ethernet packet encapsulated on the BRAS or SR does not carry a VLAN tag, but the packet sent between the user and the DSLAM carries a single VLAN tag or double VLAN tags due to VLAN or QinQ encapsulation.
To resolve this problem, last-mile QoS can be configured on the BRAS or SR. Last-mile QoS
allows a device to calculate the length of headers to be added to packets based on the
bandwidth purchased by users and the bandwidth of the downstream interface on the DSLAM
for traffic shaping.
The BRAS or SR cannot automatically infer the total length of the packets encapsulated on the DSLAM, so compensation bytes must be configured. After compensation bytes are configured, if the DSLAM connects to the CPE through an Ethernet link, the BRAS or SR can infer the total length of the packet encapsulated on the DSLAM based on the length of the forwarded packet and the configured compensation bytes, and adjust the shaping rate accordingly.
The header lengths used for compensation are as follows (in bytes): PPP header: 2; Ethernet header: 14; VLAN header: 4; QinQ header: 8.
Difference
The following table lists the differences between traffic policing and traffic shaping.
Traffic policing: drops excess traffic over the specifications or re-marks such traffic with a lower priority; packet loss may result in packet retransmission.
Traffic shaping: buffers excess traffic over the specifications; packet loss, and therefore packet retransmission, rarely occurs.
[Figure: traffic congestion scenarios — bandwidth mismatching, where a 10 Mbit/s Ethernet feeds a 2 Mbit/s E1 WAN link and creates a congestion point, and the aggregation problem, where multiple data flows converge onto one link.]
Traffic congestion results not only from limited link bandwidth but also from any resource shortage, such as a shortage of available processing time, buffer space, or memory. Traffic congestion also occurs when traffic is not properly controlled and exceeds the capacity of the available network resources.
Location
As shown in Figure 8-3, traffic can be classified into the following based on the device
location and traffic forwarding direction:
l Upstream traffic on the user side
l Downstream traffic on the user side
l Upstream traffic on the network side
l Downstream traffic on the network side
Figure 8-3 Upstream and downstream traffic on the user and network sides
Generally, upstream traffic is not congested because it is not subject to traffic rate mismatch, traffic aggregation, or forwarding resource shortage. Downstream traffic, in contrast, is prone to traffic congestion.
Impacts
Traffic congestion has the following adverse impacts on network traffic:
l Traffic congestion intensifies delay and jitter.
l Overlong delays lead to packet retransmission.
l Traffic congestion reduces the throughput of networks.
l Intensified traffic congestion consumes a large number of network resources (especially storage resources). Improper resource allocation may cause resources to be locked and the system to go down.
Therefore, traffic congestion is the main cause of service deterioration. Since traffic
congestion prevails on the PSN network, traffic congestion must be prevented or effectively
controlled.
Solutions
A solution to traffic congestion is a must on every carrier network. A balance between limited
network resources and user requirements is required so that user requirements are satisfied
and network resources are fully used.
Congestion management and avoidance are commonly used to relieve traffic congestion.
l Congestion management provides means to manage and control traffic when traffic
congestion occurs. Packets sent from one interface are placed into multiple queues that
are marked with different priorities. The packets are sent based on the priorities.
Different queue scheduling mechanisms are designed for different situations and lead to
different results.
l Congestion avoidance is a flow control technique used to relieve network overload. By
monitoring the usage of network resources in queues or memory buffer, a device
automatically drops packets on the interface that shows a sign of traffic congestion.
NOTE
The Traffic Manager (TM) on the forwarding plane houses high-speed buffers for which all interfaces compete. To prevent traffic interruptions caused by an interface failing to obtain buffer resources for a long time, the system allocates a small buffer to each interface and ensures that each queue on each interface can use the buffer.
The TM puts received packets into the buffer and allows these packets to be forwarded in time when traffic is not congested. In this case, the period during which packets are stored in the buffer is at the μs level, and the delay can be ignored.
When traffic is congested, packets accumulate in the buffer and wait to be forwarded, and the delay increases greatly. The delay is determined by the buffer size for a queue and the output bandwidth allocated to the queue. The formula is as follows:
Delay of a queue = Buffer size for the queue/Output bandwidth for the queue
Each interface on a NE40E stores eight downstream queues, which are called class queues
(CQs) or port queues. The eight queues are BE, AF1, AF2, AF3, AF4, EF, CS6, and CS7.
The first in first out (FIFO) mechanism is used to transfer packets in a queue. Resources used
to forward packets are allocated based on the arrival order of packets.
Scheduling Algorithms
The commonly used scheduling algorithms are as follows:
FIFO
FIFO does not need traffic classification. As shown in Figure 8-4, FIFO allows the packets
that come earlier to enter the queue first. On the exit of a queue, FIFO allows the packets to
leave the queue in the same order as that in which the packets enter the queue.
SP
SP schedules packets strictly based on queue priorities. Packets in queues with a low priority
can be scheduled only after all packets in queues with a high priority have been scheduled.
As shown in Figure 8-5, three queues with a high, medium, and low priority respectively are
configured with SP scheduling. The number indicates the order in which packets arrive.
[Figure 8-5: packets 1 to 6 arrive in the high-, medium-, and low-priority queues; when leaving the queues, all packets in the high-priority queue are sent first, then those in the medium-priority queue, and finally those in the low-priority queue.]
When packets leave queues, the device forwards the packets in the descending order of
priorities. Packets in the higher-priority queue are forwarded preferentially. If packets in the
higher-priority queue come in between packets in the lower-priority queue that is being
scheduled, the packets in the high-priority queue are still scheduled preferentially. This
implementation ensures that packets in the higher-priority queue are always forwarded
preferentially. As long as there are packets in the high-priority queue, no other queue is served.
The disadvantage of SP is that the packets in lower-priority queues are not processed until all
the higher-priority queues are empty. As a result, a congested higher-priority queue causes all
lower-priority queues to starve.
RR
RR schedules multiple queues in ring mode. If the queue on which RR is performed is not
empty, the scheduler takes one packet away from the queue. If the queue is empty, the queue
is skipped, and the scheduler does not wait.
WRR
Compared with RR, WRR can set the weights of queues. During the WRR scheduling, the
scheduling chance obtained by a queue is in direct proportion to the weight of the queue. RR
scheduling functions the same as WRR scheduling in which each queue has a weight 1.
WRR configures a counter for each queue and initializes the counter based on the weight
values. Each time a queue is scheduled, a packet is taken away from the queue and transmitted, and the counter decreases by 1. When the counter becomes 0, the device stops
scheduling the queue and starts to schedule other queues with a non-0 counter. When the
counters of all queues become 0, all these counters are initialized again based on the weight,
and a new round of WRR scheduling starts. In a round of WRR scheduling, the queues with
the larger weights are scheduled more times.
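The counter mechanism can be sketched in Python as follows (illustrative only; each queue is a list of packets, and a weight here is the number of scheduling chances per round):

def wrr_cycle(queues, weights):
    """Run one full WRR cycle and return the packets in scheduling order."""
    sent = []
    counters = list(weights)                 # counters initialized from the weights
    while any(c > 0 and q for c, q in zip(counters, queues)):
        for i, q in enumerate(queues):
            if counters[i] > 0 and q:        # empty queues are simply skipped
                sent.append(q.pop(0))        # one packet is taken per visit
                counters[i] -= 1
    return sent

# Weights 2:1:1 -- the first queue is scheduled twice as often per cycle.
print(wrr_cycle([["a1", "a2", "a3"], ["b1", "b2"], ["c1"]], [2, 1, 1]))
# -> ['a1', 'b1', 'c1', 'a2']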
In an example, three queues with the weight 50%, 25%, and 25% respectively are configured
with WRR scheduling.
Packet 3 is taken from queue 1, with Count[1] = 1. Packet 6 is taken from queue 2, with
Count[2] = 0. Packet 9 is taken from queue 3, with Count[3] = 0.
l Fourth round of WRR scheduling:
Packet 4 is taken from queue 1, with Count[1] = 0. Queues 2 and 3 do not participate in
this round of WRR scheduling since Count [2] = 0 and Count[3] = 0.
Then, Count[1] = 0; Count[2] = 0; Count[3] = 0. The counters are initialized again:
Count[1] = 2; Count[2] = 1; Count[3] = 1.
Statistically, the number of times the packets in each queue are scheduled is in direct proportion to the weight of the queue: the higher the weight, the more times the queue is scheduled. If the interface bandwidth is 100 Mbit/s, the queue with the lowest weight can obtain a minimum bandwidth of 25 Mbit/s, preventing packets in the lower-priority queue from being starved, which could occur if SP scheduling were implemented.
During the WRR scheduling, the empty queue is directly skipped. Therefore, when the rate at
which packets arrive at a queue is low, the remaining bandwidth of the queue is used by other
queues based on a certain proportion.
l WRR schedules packets based on the number of packets. Therefore, each queue has no
fixed bandwidth. With the same scheduling chance, a long packet obtains higher
bandwidth than a short packet. Users are sensitive to the bandwidth. When the average
lengths of the packets in the queues are the same or known, users can obtain expected
bandwidth by configuring WRR weights of the queues; however, when the average
packet length of the queues changes, users cannot obtain expected bandwidth by
configuring WRR weights of the queues.
l Services that require a short delay cannot be scheduled in time.
DRR
The scheduling principle of DRR is similar to that of RR: RR schedules packets based on the packet number, whereas DRR schedules packets based on the packet length.
DRR configures a counter, Deficit, for each queue. The counters are initialized to the maximum number of bytes (the Quantum, generally the MTU of the interface) allowed in a round of DRR scheduling. Each time a queue is scheduled, if the length of the head packet is smaller than or equal to the Deficit, the packet is taken away from the queue, and the Deficit decreases by the packet length. If the packet length is greater than the Deficit, the packet is not sent, the Deficit value remains unchanged, and the system continues to schedule the next queue. After each round of scheduling, the Quantum is added to the Deficit of each queue, and a new round of scheduling starts. Unlike SP scheduling, DRR scheduling prevents packets in low-priority queues from being starved. However, DRR scheduling cannot set weights of queues and cannot schedule services requiring a low delay (such as voice services) in time.
MDRR
Modified Deficit Round Robin (MDRR) is an improved DRR algorithm. MDRR and DRR
implementations are similar. Unlike MDRR, DRR allows the Deficit to be a negative so that
long packets can be properly scheduled. In the next round of scheduling, however, this queue
will not be scheduled. When the counter becomes 0 or a negative, the device stops scheduling
the queue and starts to schedule other queues with a positive counter.
In an example, the MTU of an interface is 150 bytes. Two queues, Q1 and Q2, use MDRR scheduling. Multiple 200-byte packets are buffered in Q1, and multiple 100-byte packets are buffered in Q2. Figure 8-8 shows how MDRR schedules packets in these two queues.
[Figure 8-8: across six rounds, Q1 sends a 200-byte packet whenever its Deficit is positive; the Deficit then becomes negative (for example -150, -100, or -50 after scheduling), so Q1 stops being scheduled in the following round. Q2 sends a 100-byte packet in each round while its Deficit (150, 100, or 50 before scheduling) remains positive. When all Deficits in a round are smaller than or equal to 0, the initial value is added to every Deficit.]
As shown in Figure 8-8, after six rounds of MDRR scheduling, three 200-byte packets in Q1 and six 100-byte packets in Q2 are scheduled. The output bandwidth ratio of Q1 to Q2 is actually 1:1.
DWRR
Compared with DRR, Deficit Weighted Round Robin (DWRR) can set the weights of queues. DRR scheduling functions the same as DWRR scheduling in which each queue has a weight of 1.
DWRR configures a counter (Deficit) for each queue, which records the number of bytes the queue may still send in a round. The counters are initialized to Weight x MTU. Each time a queue is scheduled, a packet is taken away from the queue, and the counter decreases by the packet length. When the counter reaches 0 or below, the device stops scheduling the queue and starts to schedule other queues with a positive counter. When the counters of all queues reach 0 or below, all these counters are increased by Weight x MTU, and a new round of DWRR scheduling starts.
In an example, the MTU of an interface is 150 bytes. Two queues, Q1 and Q2, use DWRR scheduling. Multiple 200-byte packets are buffered in Q1, and multiple 100-byte packets are buffered in Q2. The weight ratio of Q1 to Q2 is 2:1. Figure 8-9 shows how DWRR schedules packets.
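A rough Python sketch of one DWRR round (illustrative only; packet lengths are in bytes, and the counter decreases by the packet length):

def dwrr_round(queues, weights, mtu=150):
    """One DWRR round over queues of packet lengths (bytes); weights are integers."""
    sent = []
    deficits = [w * mtu for w in weights]    # counters initialized to Weight x MTU
    for i, q in enumerate(queues):
        while q and deficits[i] >= q[0]:     # enough deficit for the head packet
            deficits[i] -= q[0]              # the counter decreases by the packet length
            sent.append((i, q.pop(0)))
    return sent

# Q1 (weight 2) buffers 200-byte packets; Q2 (weight 1) buffers 100-byte packets.
print(dwrr_round([[200, 200, 200], [100, 100, 100]], [2, 1]))
# One round: Q1 sends one 200-byte packet, Q2 sends one 100-byte packet (2:1 in bytes).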
WFQ
WFQ allocates bandwidths to flows based on the weight. In addition, to allocate bandwidths
fairly to flows, WFQ schedules packets in bits. Figure 8-10 shows how bit-by-bit scheduling
works.
[Figure 8-10: packets from weighted queues (for example, queue 2 and queue 3, each with a 25% weight) are scheduled bit by bit and then reassembled into packets before leaving the queues.]
The bit-by-bit scheduling mode shown in Figure 8-10 allows the device to allocate
bandwidths to flows based on the weight. This prevents long packets from preempting
bandwidths of short packets and reduces the delay and jitter when both short and long packets
wait to be forwarded.
The bit-by-bit scheduling mode, however, is an ideal model. A NE40E performs WFQ scheduling at a certain granularity, such as 256 bytes or 1 KB. Different boards support different granularities.
Advantages of WFQ:
l Different queues obtain the scheduling chances fairly, balancing delays of flows.
l Short and long packets obtain the scheduling chances fairly. If both short and long
packets wait in queues to be forwarded, short packets are scheduled preferentially,
reducing jitters of flows.
l The lower the weight of a flow is, the lower the bandwidth the flow obtains.
In the actual application, best effort (BE) flows can be put into the LPQ queue. When the
network is overloaded, BE flows can be limited so that other services can be processed
preferentially.
WFQ, PQ, and LPQ can be used separately or jointly for eight queues on an interface.
Scheduling Order
SP scheduling is implemented between PQ, WFQ, and LPQ queues. PQ queues are scheduled
preferentially, and then WFQ queues and LPQ queues are scheduled in sequence, as shown in
Figure 8-11. Figure 8-12 shows the detailed process.
[Figure 8-11: PQ queues 1 to i, WFQ queues 1 to j, and LPQ queues 1 to k are each shaped; SP scheduling is applied within the PQ group and the LPQ group, WFQ scheduling within the WFQ group, and SP scheduling among the three groups before packets reach the destination port.]
[Figure 8-12: if the PQ queues are not empty, a round of PQ scheduling is performed; otherwise, if the WFQ queues are not empty, a round of WFQ scheduling is performed; otherwise, if the LPQ queues are not empty, a round of LPQ scheduling is performed.]
l Packets in PQ queues are preferentially scheduled, and packets in WFQ queues are
scheduled only when no packets are buffered in PQ queues.
l When all PQ queues are empty, WFQ queues start to be scheduled. If packets are added
to PQ queues afterward, packets in PQ queues are still scheduled preferentially.
l Packets in LPQ queues start to be scheduled only after all PQ and WFQ queues are
empty.
Bandwidths are preferentially allocated to PQ queues to guarantee the peak information rate
(PIR) of packets in PQ queues. The remaining bandwidth is allocated to WFQ queues based
on the weight. If the bandwidth is not fully used, the remaining bandwidth is allocated to
WFQ queues whose PIRs are higher than the obtained bandwidth until the PIRs of all WFQ
queues are guaranteed. If any bandwidth is remaining at this time, the bandwidth resources
are allocated to LPQ queues.
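The allocation order just described can be sketched in Python (an illustrative model only: each queue's demand is capped by its PIR, bandwidths are in Mbit/s, and the WFQ input values in the usage line are assumptions):

def allocate(total, pq, wfq):
    """pq: list of (input_mbps, pir_mbps); wfq: list of (input_mbps, pir_mbps, weight)."""
    pq_out = [min(inp, pir) for inp, pir in pq]        # PQ queues are served first, up to their PIRs
    remaining = total - sum(pq_out)
    wfq_out = [0.0] * len(wfq)
    active = list(range(len(wfq)))
    while remaining > 1e-9 and active:
        wsum = sum(wfq[i][2] for i in active)
        grants = {i: remaining * wfq[i][2] / wsum for i in active}
        remaining = 0.0
        for i, grant in grants.items():
            cap = min(wfq[i][0], wfq[i][1])            # demand is capped by the queue's PIR
            taken = min(grant, cap - wfq_out[i])
            wfq_out[i] += taken
            remaining += grant - taken                 # surplus is redistributed by weight
        active = [i for i in active if wfq_out[i] + 1e-9 < min(wfq[i][0], wfq[i][1])]
    return pq_out, wfq_out, remaining                  # leftover bandwidth goes to LPQ queues

# Third example below: 100 Mbit/s interface, CS7/CS6 in PQ, EF..AF1 in WFQ with
# weights 5..1 and PIRs of 10 Mbit/s (inputs of 40 Mbit/s assumed for illustration).
print(allocate(100, [(15, 25), (30, 10)],
               [(40, 10, 5), (40, 10, 4), (40, 10, 3), (40, 10, 2), (40, 10, 1)]))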
CS7 (PQ): input bandwidth 65 Mbit/s, PIR 55 Mbit/s
CS6 (PQ): input bandwidth 30 Mbit/s, PIR 30 Mbit/s
l Then the first round of WFQ scheduling starts. The remaining bandwidth after PQ
scheduling is allocated to WFQ queues. The bandwidth allocated to a WFQ queue is
calculated using this formula: Bandwidth allocated to a WFQ queue = Remaining bandwidth x Weight of this queue/Sum of weights = 15 Mbit/s x Weight/15.
– Bandwidth allocated to the EF queue = 15 Mbit/s x 5/15 = 5 Mbit/s = PIR. The
bandwidth allocated to the EF queue is fully used.
– Bandwidth allocated to the AF4 queue = 15 Mbit/s x 4/15 = 4 Mbit/s < PIR. The
bandwidth allocated to the AF4 queue is exhausted.
– Bandwidth allocated to the AF3 queue = 15 Mbit/s x 3/15 = 3 Mbit/s < PIR. The
bandwidth allocated to the AF3 queue is exhausted.
– Bandwidth allocated to the AF2 queue = 15 Mbit/s x 2/15 = 2 Mbit/s < PIR. The
bandwidth allocated to the AF2 queue is exhausted.
– Bandwidth allocated to the AF1 queue = 15 Mbit/s x 1/15 = 1 Mbit/s < PIR. The
bandwidth allocated to the AF1 queue is exhausted.
l The bandwidth is exhausted, and BE packets are not scheduled. The output BE
bandwidth is 0.
The output bandwidth of each queue is as follows:
CS7 (PQ): input bandwidth 65 Mbit/s, PIR 55 Mbit/s, output bandwidth 55 Mbit/s
CS6 (PQ): input bandwidth 30 Mbit/s, PIR 30 Mbit/s, output bandwidth 30 Mbit/s
CS7 (PQ): input bandwidth 15 Mbit/s, PIR 25 Mbit/s
CS6 (PQ): input bandwidth 30 Mbit/s, PIR 10 Mbit/s
l Packets in the PQ queue are scheduled preferentially to ensure the PIR of the PQ queue.
After PQ scheduling, the remaining bandwidth is 75 Mbit/s (100 Mbit/s - 15 Mbit/s - 10
Mbit/s).
l Then the first round of WFQ scheduling starts. The remaining bandwidth after PQ
scheduling is allocated to WFQ queues. The bandwidth allocated to a WFQ queue is
calculated using this formula: Bandwidth allocated to a WFQ queue = Remaining bandwidth x Weight of this queue/Sum of weights = 75 Mbit/s x Weight/15.
– Bandwidth allocated to the EF queue = 75 Mbit/s x 5/15 = 25 Mbit/s < PIR. The
bandwidth allocated to the EF queue is fully used.
– Bandwidth allocated to the AF4 queue = 75 Mbit/s x 4/15 = 20 Mbit/s > PIR. The
AF4 queue actually obtains the bandwidth 10 Mbit/s (PIR). The remaining
bandwidth is 10 Mbit/s.
– Bandwidth allocated to the AF3 queue = 75 Mbit/s x 3/15 = 15 Mbit/s > PIR. The AF3 queue actually obtains the bandwidth 10 Mbit/s (PIR). The remaining bandwidth is 5 Mbit/s.
– Bandwidth allocated to the AF2 queue = 75 Mbit/s x 2/15 = 10 Mbit/s < PIR. The
bandwidth allocated to the AF2 queue is exhausted.
– Bandwidth allocated to the AF1 queue = 75 Mbit/s x 1/15 = 5 Mbit/s < PIR. The
bandwidth allocated to the AF1 queue is exhausted.
l The remaining bandwidth is 15 Mbit/s, which is allocated to the queues, whose PIRs are
higher than the obtained bandwidth, based on the weight.
– Bandwidth allocated to the EF queue = 15 Mbit/s x 5/8 = 9.375 Mbit/s. The sum of
bandwidths allocated to the EF queue is 34.375 Mbit/s, which is also lower than the
PIR. Therefore, the bandwidth allocated to the EF queue is exhausted.
– Bandwidth allocated to the AF2 queue = 15 Mbit/s x 2/8 = 3.75 Mbit/s. The sum of
bandwidths allocated to the AF2 queue is 13.75 Mbit/s, which is also lower than the
PIR. Therefore, the bandwidth allocated to the AF2 queue is exhausted.
– Bandwidth allocated to the AF1 queue = 15 Mbit/s x 1/8 = 1.875 Mbit/s. The sum
of bandwidths allocated to the AF1 queue is 6.875 Mbit/s, which is also lower than
the PIR. Therefore, the bandwidth allocated to the AF1 queue is exhausted.
l The bandwidth is exhausted, and the BE queue is not scheduled. The output BE
bandwidth is 0.
CS7 (PQ): input bandwidth 15 Mbit/s, PIR 25 Mbit/s, output bandwidth 15 Mbit/s
CS6 (PQ): input bandwidth 30 Mbit/s, PIR 10 Mbit/s, output bandwidth 10 Mbit/s
CS7 (PQ): input bandwidth 15 Mbit/s, PIR 25 Mbit/s
CS6 (PQ): input bandwidth 30 Mbit/s, PIR 10 Mbit/s
l Packets in the PQ queue are scheduled preferentially to ensure the PIR of the PQ queue.
After PQ scheduling, the remaining bandwidth is 75 Mbit/s (100 Mbit/s - 15 Mbit/s - 10
Mbit/s).
l Then the first round of WFQ scheduling starts. The remaining bandwidth after PQ
scheduling is allocated to WFQ queues. The bandwidth allocated to a WFQ queue is
calculated using this formula: Bandwidth allocated to a WFQ queue = Remaining bandwidth x weight of this queue/sum of weights = 75 Mbit/s x weight/15.
– Bandwidth allocated to the EF queue = 75 Mbit/s x 5/15 = 25 Mbit/s > PIR. The EF
queue actually obtains the bandwidth 10 Mbit/s (PIR). The remaining bandwidth is
15 Mbit/s.
– Bandwidth allocated to the AF4 queue = 75 Mbit/s x 4/15 = 20 Mbit/s > PIR. The
AF4 queue actually obtains the bandwidth 10 Mbit/s (PIR). The remaining
bandwidth is 10 Mbit/s.
– Bandwidth allocated to the AF3 queue = 75 Mbit/s x 3/15 = 15 Mbit/s > PIR. The AF3 queue actually obtains the bandwidth 10 Mbit/s (PIR). The remaining bandwidth is 5 Mbit/s.
– Bandwidth allocated to the AF2 queue = 75 Mbit/s x 2/15 = 10 Mbit/s = PIR. The
bandwidth allocated to the AF2 queue is exhausted.
– Bandwidth allocated to the AF1 queue = 75 Mbit/s x 1/15 = 5 Mbit/s < PIR. The
bandwidth allocated to the AF1 queue is exhausted.
l The remaining bandwidth is 30 Mbit/s, which is allocated, based on the weight, to the AF1 queue, the only queue whose PIR is higher than the bandwidth it has obtained. Therefore, the AF1 queue is allocated an additional 5 Mbit/s, reaching its PIR.
l The remaining bandwidth is 25 Mbit/s, which is allocated to the BE queue.
CS7 (PQ): input bandwidth 15 Mbit/s, PIR 25 Mbit/s, output bandwidth 15 Mbit/s
CS6 (PQ): input bandwidth 30 Mbit/s, PIR 10 Mbit/s, output bandwidth 10 Mbit/s
Tail Drop
Tail drop is the traditional congestion avoidance mechanism used to drop all newly arrived
packets when congestion occurs.
Tail drop causes TCP global synchronization. When TCP detects packet loss, it enters the slow-start state and probes the network by sending packets at a lower rate, which speeds up until packet loss is detected again. In the tail drop mechanism, all newly arrived packets are dropped when congestion occurs, causing all TCP sessions to simultaneously enter the slow-start state and slow down their transmission. All TCP sessions then restart their transmission at roughly the same time, congestion occurs again, causing another burst of packet drops, and all TCP sessions enter the slow-start state again. This behavior cycles constantly, severely reducing network resource usage.
WRED
WRED is a congestion avoidance mechanism used to drop packets before the queue
overflows. WRED resolves TCP global synchronization by randomly dropping packets to
prevent a burst of TCP retransmission. If a TCP connection reduces the transmission rate
when packet loss occurs, other TCP connections still keep a high rate for sending packets. The
WRED mechanism improves the bandwidth resource usage.
WRED sets lower and upper thresholds for each queue and defines the following rules:
l When the length of a queue is lower than the lower threshold, no packet is dropped.
l When the length of a queue exceeds the upper threshold, all newly arrived packets are
tail dropped.
l When the length of a queue ranges from the lower threshold to the upper threshold,
newly arrived packets are randomly dropped, but a maximum drop probability is set. The
maximum drop probability refers to the drop probability when the queue length reaches
the upper threshold. Figure 8-13 is a drop probability graph. The longer the queue, the
larger the drop probability.
As shown in Figure 8-14, the maximum drop probability is a%, the length of the current queue is m, and the drop probability of the current queue is x%. WRED assigns a random value i to each arriving packet (0 < i < 100) and compares the random value with the drop probability of the current queue. If the random value ranges from 0 to x, the newly arrived packet is dropped; if the random value ranges from x to 100, the newly arrived packet is not dropped.
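A minimal Python sketch of this drop decision (illustrative only; the queue length is in packets, probabilities are fractions, and the linear ramp between the thresholds is one common choice):

import random

def wred_should_drop(queue_len, lower, upper, max_p):
    """Decide whether a newly arrived packet is dropped."""
    if queue_len < lower:
        return False                       # below the lower threshold: never drop
    if queue_len >= upper:
        return True                        # above the upper threshold: tail drop
    # The drop probability grows from 0 at `lower` to max_p at `upper`.
    drop_p = max_p * (queue_len - lower) / (upper - lower)
    return random.random() < drop_p        # compare a random value with the probability

print(wred_should_drop(75, 50, 100, 0.20))  # dropped with probability 10%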
[Figure 8-14: drop probability plotted against actual queue length — the probability rises from 0 at the lower threshold to the maximum drop probability a% at the upper threshold; at queue length m the drop probability is x%, and an arriving packet is dropped when the random value i (a random value in the range [0, a]) is smaller than x.]
As shown in Figure 8-15, the drop probability of the queue with the length m (lower
threshold < m < upper threshold) is x%. If the random value ranges from 0 to x, the newly
arrived packet is dropped. The drop probability of the queue with the length n (m < n < upper
threshold) is y%. If the random value ranges from 0 to y, the newly arrived packet is dropped.
The range of 0 to y is wider than the range of 0 to x. There is a higher probability that the
random value falls into the range of 0 to y. Therefore, the longer the queue, the higher the
drop probability.
[Figure 8-15: drop probability plotted against actual queue length, where x% is the drop probability when the queue length is m, y% is the drop probability when the queue length is n (m < n), and a% is the configured maximum drop probability, which determines the random value range.]
As shown in Figure 8-16, the maximum drop probabilities of two queues, Q1 and Q2, are a% and b%, respectively. When the lengths of Q1 and Q2 are both m, the drop probabilities of Q1 and Q2 are x% and y%, respectively. If the random value ranges from 0 to x, the newly arrived packet in Q1 is dropped; if the random value ranges from 0 to y, the newly arrived packet in Q2 is dropped. The range of 0 to y is wider than the range of 0 to x, so there is a higher probability that the random value falls into the range of 0 to y. Therefore, when the queue lengths are the same, the higher the maximum drop probability, the higher the drop probability.
Figure 8-16 Drop probability change with the maximum drop probability (Q1 with maximum drop probability a% and Q2 with maximum drop probability b%; at the same queue length, their drop probabilities are x% and y%, with random value ranges [0, x] and [0, y], respectively)
You can configure WRED for each flow queue (FQ) and class queue (CQ) on Huawei routers.
WRED allows the configuration of lower and upper thresholds and drop probability for each
drop precedence. Therefore, WRED can allocate different drop probabilities to service flows
or even packets with different drop precedences in a service flow.
WRED applies to WFQ queues. WFQ queues share bandwidth based on the weight and are
prone to traffic congestion. Using WRED for WFQ queues effectively resolves TCP global
synchronization when traffic congestion occurs.
[Figure: WRED thresholds per drop precedence — red packets have the lowest lower and upper thresholds, yellow packets higher ones, and green packets the highest; the drop probability of each color rises toward 100% as the actual queue length approaches that color's upper threshold.]
The queue length cannot be set too small. If the length of a queue is too small, the buffer is
not enough even if the traffic rate is low. As a result, packet loss occurs. The shorter the
queue, the less the tolerance of burst traffic.
The queue length cannot be set too large either. If the length of a queue is too large, the delay increases along with it, especially when a TCP connection is set up: one end sends a packet to the peer end and waits for a response, and if no response is received within the timeout period, the TCP sender retransmits the packet. If a packet is buffered for a long time, it is no different from a dropped one.
Setting the queue length to 10 ms x output queue bandwidth is recommended for high-priority
queues (CS7, CS6, and EF); setting the queue length to 100 ms x output queue bandwidth is
recommended for low-priority queues.
When traffic is congested, packets accumulate in the buffer and wait to be forwarded, and the delay increases greatly. The interval from the time when a packet enters the buffer to the time when the packet is forwarded is called the buffer delay or queue delay.
The buffer delay is determined by the buffer size for a queue and the output bandwidth allocated to the queue. The formula is as follows:
Buffer delay = Buffer size for the queue/Output bandwidth for the queue
The buffer size is expressed in bytes, and the output bandwidth (also called the traffic shaping rate) is expressed in bit/s. Therefore, the preceding formula can also be expressed as follows:
Buffer delay = (Buffer size for the queue x 8)/Traffic shaping rate for the queue
As the format indicates, the larger the buffer size, the longer the buffer delay.
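The formula is easy to check numerically. The following Python sketch (illustrative values only) computes the buffer delay for the recommended buffer size of 10 ms x output bandwidth for a high-priority queue:

def buffer_delay_ms(buffer_bytes, shaping_rate_bps):
    # Buffer delay = (buffer size x 8) / traffic shaping rate, in milliseconds.
    return buffer_bytes * 8 / shaping_rate_bps * 1000

# Recommended buffer for a 100 Mbit/s high-priority queue: 10 ms x bandwidth.
ef_buffer_bytes = 0.010 * 100_000_000 / 8             # 125,000 bytes
print(buffer_delay_ms(ef_buffer_bytes, 100_000_000))  # 10.0 (ms)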
Severe jitters are mainly caused by the following two scenarios: 1. Route status on IP
networks frequently changes, causing packets to be transmitted through different routes. 2.
Packets are buffered on various nodes during traffic congestion, resulting in different delays.
Scenario 2 is commonly seen on live networks.
Jitters increase when packet delays become increasingly varied. If packet delays are
controlled at lower levels, jitters are then also controlled. Therefore, you can control jitters by
controlling delays. For example, if delays are controlled below 5 us, delay variations (jitters)
are definitely below 5 us.
As described in Impact of Queue Buffer on Delay, large buffer sizes increase buffer delays.
Controlling buffer sizes means control over packet delays.
8.5 HQoS
Hierarchical Quality of Service (HQoS) is a technology that uses a queue scheduling
mechanism to guarantee the bandwidth of multiple services of multiple users in the DiffServ
model.
Traditional QoS performs 1-level traffic scheduling. The device can distinguish services on an
interface but cannot identify users. Packets of the same priority are placed into the same
queue on an interface and compete for the same queue resources.
HQoS uses multi-level scheduling to distinguish user-specific or service-specific traffic and
provide differentiated bandwidth management.
[Figure: Scheduler and scheduled objects.]
Scheduler attribute: scheduling algorithm based on priority or weight. Scheduler behavior: selecting queues.
Queue (scheduled object) attributes: 1) priority/weight; 2) traffic shaping rate (PIR); 3) drop policy (tail drop/WRED). Queue behaviors: 1) entering a queue: based on tail drop or WRED, packets are dropped or allowed to enter the queue tail; 2) leaving a queue: packets are shaped and sent.
[Figure: HQoS scheduling tree. A root scheduler schedules branch/transit-node schedulers, which in turn schedule leaf-node queues.]
A scheduler can schedule multiple queues or schedulers. The scheduler can be considered a
parent node, and the scheduled queue or scheduler can be considered a child node. The parent
node is the traffic aggregation point of multiple child nodes.
Traffic classification rules and control parameters can be specified on each node to classify
and control traffic. Traffic classification rules based on different user or service requirements
can be configured on nodes at different layers. In addition, different control actions can be
performed for traffic on different nodes. This ensures multi-layer/user/service traffic
management.
HQoS Hierarchies
In HQoS scheduling, a single layer of transit nodes can be used to implement a three-layer scheduling architecture, or multiple layers of transit nodes can be used to implement a multi-layer scheduling architecture. In addition, two or more hierarchical scheduling models can be combined by mapping the packets output from one scheduling model to a leaf node in another scheduling model, as shown in Figure 8-19. This provides flexible scheduling options.
[Figure 8-19: Combining scheduling models. On the left, a three-level scheduling model (a scheduler over transit nodes over leaf-node queues); its output is mapped to a leaf node of a second model consisting of a root node, transit nodes, and leaf-node queues.]
[Figure: HQoS queue and scheduler hierarchy from FQs up to the destination port.]
l Destination port: scheduler attribute: scheduling algorithm (SP/WFQ).
l CQ (CS7, CS6, EF, AF4, AF3, AF2, AF1, BE): attributes: 1) priority/weight; 2) traffic shaping rate (PIR); 3) drop policy (tail drop/WRED).
l GQ: scheduler attributes: 1) scheduling algorithm (DRR+SP); 2) PIR.
l SQ: virtual queue attributes: 1) CIR; 2) PIR; EIR = PIR - CIR. Scheduler attribute toward FQs: scheduling algorithm (SP/WFQ).
l FQ (FQ1 to FQ8 per SQ): attributes: 1) priority/weight; 2) traffic shaping rate (PIR); 3) drop policy (tail drop/WRED).
– Three FQs are configured for the three services (VoIP, IPTV, and HSI).
– Altogether 20 SQs are configured for the 20 residential users. A CIR and a PIR are configured for each SQ.
– One GQ is configured for the whole building and corresponds to the 20 residential users. The PIR of the GQ is the total bandwidth of the 20 residential users. Each of the 20 residential users uses services individually, but their total bandwidth is restricted by the PIR of the GQ.
The hierarchy model is as follows:
– FQs are used to distinguish services of a user and control bandwidth allocation
among services.
– SQs are used to distinguish users and restrict the bandwidth of each user.
– GQs are used to distinguish user groups and control the traffic rate of twenty SQs.
FQs enable bandwidth allocation among services. SQs distinguish each user. GQs enable
the CIR of each user to be guaranteed and all member users to share the bandwidth.
Bandwidth exceeding the CIR is not guaranteed because users have not paid for it. The CIR must be guaranteed because users have purchased it. As shown in Figure 8-21, the CIR of each user is marked, and bandwidth is preferentially allocated to guarantee the CIR. Therefore, the CIR bandwidth cannot be preempted by burst traffic exceeding the committed service rates.
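The CIR-first behavior can be sketched in code. The following Python fragment is a simplification under stated assumptions: every SQ first receives its CIR (capped by its demand), and the leftover bandwidth under the GQ PIR is then granted in list order up to each SQ's PIR (the router itself distributes the excess by DRR among SQs rather than in order).

def allocate(gq_pir, sqs):
    """sqs: list of dicts with 'cir', 'pir', and 'demand' in Mbit/s."""
    # Pass 1: guarantee each SQ its CIR (bounded by actual demand).
    alloc = [min(sq["cir"], sq["demand"]) for sq in sqs]
    remaining = gq_pir - sum(alloc)
    # Pass 2: hand out the excess (EIR) up to each SQ's PIR.
    for i, sq in enumerate(sqs):
        if remaining <= 0:
            break
        extra = min(sq["pir"], sq["demand"]) - alloc[i]
        grant = min(extra, remaining)
        alloc[i] += grant
        remaining -= grant
    return alloc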
On NE40Es, HQoS uses different architectures to schedule upstream and downstream queues.
[Figure: Upstream HQoS scheduling model toward the switch fabric unit (SFU).]
l CQ: COS0 (CS7, CS6, EF), COS1 (AF4, AF3), COS2 (AF2, AF1), and COS3 (BE); scheduler attribute: SP/WFQ; default configuration, not to be changed.
l VOQs per target blade (TB): scheduler attribute: DRR; queue attributes: default configuration, not to be changed.
l SQ: virtual queue attributes: 1) CIR; 2) PIR; EIR = PIR - CIR; SQs are scheduled by DRR.
l FQ (FQ1 to FQ8): scheduler attribute: scheduling algorithm (SP/WFQ); queue attributes: 1) priority/weight; 2) traffic shaping rate (PIR); 3) drop policy (tail drop/WRED).
The scheduling path of upstream HQoS traffic is FQ -> SQ -> GQ, and then joins the non-
HQoS traffic for the following two-layer scheduling:
l Target Blade (TB) scheduling
TB scheduling is also called Virtual Output Queue (VOQ) scheduling.
At the crossroads shown in the following figure, three vehicles (a car, a pallet truck, and a carriage truck) arrive at crossing A, bound for crossings B, C, and D respectively. If crossing B is jammed at this time, the car cannot move ahead and blocks the pallet truck and carriage truck, even though crossings C and D are clear.
[Figure: A single lane at crossing A. Congestion at crossing B blocks the car, which in turn blocks the pallet truck and carriage truck bound for C and D.]
If three lanes, one for each of crossings B, C, and D, are set up at crossing A, the problem is resolved.
[Figure: Three destination-specific lanes at crossing A. The jam at crossing B delays only the car; the pallet truck and carriage truck pass freely to C and D.]
Upstream multicast traffic has not yet been replicated and therefore has no single destination. For this reason, multicast traffic is put into a separate VOQ. For unicast traffic, one VOQ is configured per destination board.
DRR is implemented among unicast VOQs and then between unicast and multicast VOQs.
l Class queue (CQ) scheduling
Four CQs, COS0 for CS7, CS6, and EF services, COS1 for AF4 and AF3 services,
COS2 for AF2 and AF1 services, and COS3 for BE services, are used for upstream
traffic.
SP scheduling applies to COS0, which is preferentially scheduled. WFQ scheduling
applies to COS1, COS2, and COS3, with the WFQ weight of 1, 2, and 4 respectively.
Users cannot modify the attributes of upstream CQs and schedulers.
Non-HQoS traffic directly enters four upstream CQs, without passing FQs. HQoS traffic
passes FQs and CQs.
The process of upstream HQoS scheduling is as follows:
1. Entering a queue: An HQoS packet enters an FQ. When a packet enters an FQ, the
system checks the FQ status and determines whether to drop the packet. If the packet is
not dropped, it enters the tail of the FQ.
2. Applying for scheduling: After entering the FQ, the packet reports the queue status
change to the SQ scheduler and applies for scheduling. The SQ scheduler reports the
queue status change to the GQ scheduler and applies for scheduling. Therefore, the
scheduling request path is FQ -> SQ -> GQ.
3. Hierarchical scheduling: After receiving a scheduling request, a GQ scheduler selects an
SQ, and the SQ selects an FQ. Therefore, the scheduling path is GQ -> SQ -> FQ.
4. Leaving a queue: After an FQ is selected, the packet at the front of the FQ leaves the queue and enters the VOQ tail. The VOQ reports the queue status change to the CQ scheduler and applies for scheduling. After receiving the request, the CQ scheduler selects a VOQ. The packet at the front of the VOQ leaves the queue and is sent to the switch fabric.
Therefore, the scheduling process is (FQ -> SQ -> GQ) + (VOQ -> CQ).
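The four steps can be condensed into a toy model. The following Python sketch (a hypothetical structure, not router code) represents the GQ -> SQ -> FQ hierarchy as nested dictionaries; scheduling walks down the tree and pops the packet at the head of the first eligible FQ, which would then join a VOQ:

# GQ -> SQ -> FQ -> list of buffered packets.
hqos = {"GQ1": {"SQ1": {"FQ_EF": [], "FQ_BE": []}}}

def enqueue(pkt, gq, sq, fq):
    # Steps 1-2: the packet joins the FQ tail; this status change is what
    # the SQ and GQ schedulers react to when selecting queues.
    hqos[gq][sq][fq].append(pkt)

def schedule():
    # Steps 3-4: the GQ selects an SQ, the SQ selects an FQ, and the head
    # packet leaves the FQ. Here "first non-empty" stands in for the real
    # SP/WFQ/DRR decisions made at each level.
    for sqs in hqos.values():
        for fqs in sqs.values():
            for q in fqs.values():
                if q:
                    return q.pop(0)  # next stop: the VOQ tail
    return None

enqueue("voice-pkt", "GQ1", "SQ1", "FQ_EF")
print(schedule())  # voice-pkt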
This function applies only to incoming traffic of user queues. The application scenario is the
same as that of common HQoS.
This function is implemented by replacing FQ queues with CAR token buckets. The
scheduling at other layers is the same as that of common HQoS. In this case, the forwarding
chip notifies the eTM chip of the status of the token bucket, and then the eTM chip adds
tokens and determines whether to discard or forward the packets.
[Figure: Downstream HQoS scheduling model (GQ -> SQ -> FQ).]
l GQ: scheduler attributes: 1) scheduling algorithm (DRR+SP); 2) PIR.
l SQ: virtual queue attributes: 1) CIR; 2) PIR; EIR = PIR - CIR.
l Scheduler toward FQs: scheduling algorithm (SP/WFQ).
l FQ (FQ1 to FQ8 per SQ): attributes: 1) priority/weight; 2) traffic shaping rate (PIR); 3) drop policy (tail drop/WRED).
Downstream TM scheduling includes the scheduling paths FQ -> SQ -> GQ and CQ ->
port. There are eight CQs for downstream traffic, CS7, CS6, EF, AF4, AF3, AF2, AF1,
and BE. Users can modify the queue parameters and scheduling parameters.
The process of downstream TM scheduling is as follows:
a. Entering a queue: An HQoS packet enters an FQ.
b. Applying for scheduling: The downstream scheduling application path is (FQ -> SQ
-> GQ) + (GQ -> destination port).
c. Hierarchical scheduling: The downstream scheduling path is (destination port ->
CQ) + (GQ -> SQ -> FQ).
d. Leaving a queue: After an FQ is selected, the packet at the front of the FQ leaves the queue and enters the CQ tail. The CQ forwards the packet to the destination port.
Non-HQoS traffic directly enters eight downstream CQs, without passing FQs.
[Figure: Mapping from FQs to CQs. HQoS traffic passes through FQs (FQ1 to FQ8, with a PIR per queue) and then enters CQs; non-HQoS traffic enters the CQs directly. FQ/CQ attributes: 1) priority/weight; 2) traffic shaping rate (PIR); 3) drop policy (tail drop/WRED).]
SQ bandwidth share:
l TM scheduling: Multiple SQs in the same GQ on different physical interfaces share the bandwidth.
l eTM scheduling: Multiple SQs in the same GQ on different subinterfaces, but not on different physical interfaces, share the bandwidth.
interface gigabitethernet1/0/0
 port-queue ef shaping 100M
interface gigabitethernet1/0/0.1
 user-queue cir 50m pir 50m flow-queue FQ
 //The flow-queue template FQ is assumed to shape EF traffic to 10 Mbit/s (template not shown).
interface gigabitethernet1/0/0.2
 //Note: user-queue and qos-profile are not configured on gigabitethernet1/0/0.2.
– For downstream TM scheduling, the traffic shaping rate configured using the port-
queue command determines the sum bandwidth of both HQoS and non-HQoS
traffic. Based on the preceding configuration:
n The rate of EF traffic sent from GE 1/0/0.1 does not exceed 10 Mbit/s.
n The rate of EF traffic sent from GE 1/0/0 (including GE 1/0/0, GE 1/0/0.1, and
GE 1/0/0.2) does not exceed 100 Mbit/s.
– For downstream eTM scheduling, the traffic shaping rate configured using the port-
queue command determines the sum bandwidth of non-HQoS traffic (default SQ
bandwidth). Based on the preceding configuration:
n The rate of EF traffic sent from GE 1/0/0 and GE 1/0/0.2 (non-HQoS traffic)
does not exceed 100 Mbit/s.
n The rate of EF traffic sent from GE 1/0/0.1 does not exceed 10 Mbit/s.
n The rate of EF traffic sent from GE 1/0/0 can reach a maximum of 110 Mbit/s.
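The contrast boils down to which traffic the port-queue shaping rate covers. The following Python lines restate the arithmetic; the 10 Mbit/s figure assumes that the flow-queue template FQ shapes EF traffic on GE 1/0/0.1 to 10 Mbit/s, as noted in the configuration above.

port_queue_ef = 100   # port-queue ef shaping, Mbit/s
sq_ef         = 10    # assumed EF shaping of the HQoS SQ on GE 1/0/0.1, Mbit/s

tm_max_ef  = port_queue_ef          # TM: caps HQoS and non-HQoS EF traffic together
etm_max_ef = port_queue_ef + sq_ef  # eTM: port-queue caps only non-HQoS traffic
print(tm_max_ef, etm_max_ef)        # 100 110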
Packets enter an FQ based on the service class. The packet at the front of the FQ then leaves the queue and enters a CQ based on the mapping.
The mapping from FQs to CQs can be in Uniform or Pipe mode.
l Uniform: The system defines a fixed mapping. Upstream scheduling uses the uniform
mode.
l Pipe: Users can modify the mapping. The original priorities carried in packets will not be
modified in pipe mode.
By default, in the downstream HQoS, the eight priority queues of an FQ and eight CQs are in
one-to-one mapping. In the upstream HQoS, COS0 corresponds to CS7, CS6, and EF, COS1
corresponds to AF4 and AF3, COS2 corresponds to AF2 and AF1, and COS3 corresponds to
BE.
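Expressed as lookup tables (a minimal Python sketch of the default mappings just described; the dictionary names are illustrative):

# Upstream: fixed Uniform-mode mapping onto the four upstream CQs.
UPSTREAM_CQ = {
    "CS7": "COS0", "CS6": "COS0", "EF": "COS0",
    "AF4": "COS1", "AF3": "COS1",
    "AF2": "COS2", "AF1": "COS2",
    "BE":  "COS3",
}

# Downstream: by default the eight FQ priorities map one-to-one to CQs.
DOWNSTREAM_CQ = {c: c for c in
                 ("CS7", "CS6", "EF", "AF4", "AF3", "AF2", "AF1", "BE")}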
Share Shaping
Share shaping, also called flow group queue (FGQ) shaping, implements traffic shaping for a group formed by two or more flow queues (FQs) within a subscriber queue (SQ). This ensures that the other services in the SQ can still obtain bandwidth.
For example, a user has HSI, IPTV, and VoIP services, and the IPTV services include IPTV unicast and multicast services. To guarantee the CIR of IPTV services and prevent them from preempting the bandwidth reserved for HSI and VoIP services, you can configure four FQs, one each for HSI, IPTV unicast, IPTV multicast, and VoIP services. As shown in Figure 8-25, share shaping is implemented for the IPTV unicast and multicast services as a group, and HQoS is then implemented for all services.
[Figure 8-25: Share shaping. The IPTV unicast and IPTV multicast FQs form a group shaped at a shared PIR; the VoIP and HSI FQs are scheduled separately.]
Currently, a maximum of two share shaping configurations can be configured for eight FQs
on each SQ, as shown by the first two modes in Figure 8-26. The third share shaping mode
shown in Figure 8-26 is not available on the NE40E.
[Figure 8-26: Share shaping modes. Example: SQ PIR = 100 Mbit/s; the SP-scheduled AF2 and AF1 queues form a share shaping group.]
Example 1: Assume that SP scheduling applies to all queues and that the PIR (100 Mbit/s) is guaranteed for the SQ. The input rate of the EF queue is 10 Mbit/s, and that of each other queue is 70 Mbit/s. Share shaping allocates bandwidth to the queues in either of the following modes:
Queue | Scheduling Algorithm | Input Bandwidth (Mbit/s) | PIR (Mbit/s) | Output, Mode A (Mbit/s) | Output, Mode B (Mbit/s)
EF | SP | 10 | 90 | 10 | 10
AF3 | SP | 70 | 90 | 70 | 70
AF2 | SP | 70 | 90 | 20 | 10
AF1 | SP | 70 | 90 | 0 | 10
BE | SP | 70 | Not configured | 0 | 0
Example 2: Assume that, in the setup of example 1, WFQ scheduling applies to the EF, AF3, AF2, and AF1 queues with a weight ratio of 1:1:1:2 (EF:AF3:AF2:AF1), and LPQ scheduling applies to the BE queue. The PIR of 100 Mbit/s is guaranteed for the SQ. The input rate of the EF and AF3 queues is 10 Mbit/s each, and that of each other queue is 70 Mbit/s. Share shaping allocates bandwidth to the queues in either of the following modes:
l Mode A: The WFQ scheduling applies to all queues.
First-round WFQ scheduling:
– The bandwidth allocated to the EF queue is calculated as follows: 1/(1 + 1 + 1 + 2) x 100 Mbit/s = 20 Mbit/s. The input rate of the EF queue, however, is only 10 Mbit/s, so the EF queue obtains 10 Mbit/s, and the remaining bandwidth is 90 Mbit/s.
– The bandwidth allocated to the AF3 queue is calculated as follows: 1/(1 + 1 + 1 + 2) x 100 Mbit/s = 20 Mbit/s. The input rate of the AF3 queue, however, is only 10 Mbit/s, so the AF3 queue obtains 10 Mbit/s, and the remaining bandwidth is 80 Mbit/s.
– The bandwidth allocated to the AF2 queue is calculated as follows: 1/(1 + 1 + 1
+ 2) x 100 Mbit/s=20 Mbit/s. Therefore, the AF2 queue obtains the 20 Mbit/s
bandwidth, and the remaining bandwidth becomes 60 Mbit/s.
– The bandwidth allocated to the AF1 queue is calculated as follows: 2/(1 + 1 + 1 + 2) x 100 Mbit/s = 40 Mbit/s. Therefore, the AF1 queue obtains the 40 Mbit/s bandwidth, and the remaining bandwidth becomes 20 Mbit/s.
Second-round WFQ scheduling:
– The bandwidth allocated to the AF2 queue is calculated as follows: 1/(1 + 2) x 20
Mbit/s = 6.7 Mbit/s.
– The bandwidth allocated to the AF1 queue is calculated as follows: 2/(1 + 2) x 20
Mbit/s = 13.3 Mbit/s.
No bandwidth is remaining, and the BE queue obtains no bandwidth.
l Mode B: The AF3 and AF1 queues, as a whole, are scheduled with the EF and AF2
queues using the WFQ scheduling. The weight ratio is calculated as follows: EF:
(AF3+AF1):AF2 = 1:(1+2):1 = 1:3:1.
First-round WFQ scheduling:
– The bandwidth allocated to the EF queue is calculated as follows: 1/(1 +3 + 1) x
100 Mbit/s=20 Mbit/s. The input rate of the EF queue, however, is only 10 Mbit/s.
Therefore, the EF queue actually obtains the 10 Mbit/s bandwidth, and the
remaining bandwidth is 90 Mbit/s.
– The bandwidth allocated to the AF3 and AF1 queues, as a whole, is calculated as
follows: 3/(1 + 3 + 1) x 100 Mbit/s = 60 Mbit/s. Therefore, the remaining
bandwidth becomes 30 Mbit/s. The 60 Mbit/s bandwidth allocated to the AF3 and
AF1 queues as a whole are further allocated to each in the ratio of 1:2. The 20
Mbit/s bandwidth is allocated to the AF3 queue. The input rate of the AF3 queue,
however, is only 10 Mbit/s. Therefore, the AF3 queue actually obtains the 10 Mbit/s
bandwidth, and the remaining 50 Mbit/s bandwidth is allocated to the AF1 queue.
– The bandwidth allocated to the AF2 queue is calculated as follows: 1/(1 +3 + 1) x
100 Mbit/s=20 Mbit/s. Therefore, the AF2 queue obtains the 20 Mbit/s bandwidth,
and the remaining bandwidth becomes 10 Mbit/s.
Second-round WFQ scheduling:
– The bandwidth allocated to the AF3 and AF1 queues as a whole is calculated as
follows: 3/(3 + 1) x 10 Mbit/s=7.5 Mbit/s. The 7.5 Mbit/s bandwidth, not exceeding
the share shaping bandwidth, can be all allocated to the AF3 and AF1 queues as a
whole. The PIR of the AF3 queue has been ensured. Therefore, the 7.5 Mbit/s
bandwidth is allocated to the AF1 queue.
– The bandwidth allocated to the AF2 queue is calculated as follows: 1/(3 + 1 ) x 10
Mbit/s = 2.5 Mbit/s.
No bandwidth is remaining, and the BE queue obtains no bandwidth.
The following table shows the bandwidth allocation results.
Queue | Scheduling Algorithm | Input Bandwidth (Mbit/s) | PIR (Mbit/s) | Output, Mode A (Mbit/s) | Output, Mode B (Mbit/s)
EF | SP | 10 | 90 | 10 | 10
AF3 | SP | 70 | 90 | 10 | 70
BE | SP | 70 | Not configured | 0 | 0
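The round-by-round WFQ arithmetic above generalizes to a simple water-filling loop. The following Python sketch (illustrative, with hypothetical names) repeats WFQ rounds over the queues that still have unmet demand and reproduces the Mode A figures; for Mode B, treat AF3 and AF1 as a single entry with weight 3 and split its grant 1:2 internally.

def wfq_allocate(total, queues):
    """queues: {name: (weight, demand)}; returns {name: allocation}."""
    alloc = {q: 0.0 for q in queues}
    active = {q for q, (w, d) in queues.items() if d > 0}
    remaining = total
    while remaining > 1e-9 and active:
        wsum = sum(queues[q][0] for q in active)
        still_hungry = set()
        for q in active:
            share = queues[q][0] / wsum * remaining
            take = min(share, queues[q][1] - alloc[q])
            alloc[q] += take
            if queues[q][1] - alloc[q] > 1e-9:
                still_hungry.add(q)
        remaining = total - sum(alloc.values())
        active = still_hungry
    return alloc

# Mode A: weights EF:AF3:AF2:AF1 = 1:1:1:2, demands 10/10/70/70 Mbit/s.
print(wfq_allocate(100, {"EF": (1, 10), "AF3": (1, 10),
                         "AF2": (1, 70), "AF1": (2, 70)}))
# {'EF': 10.0, 'AF3': 10.0, 'AF2': 26.7, 'AF1': 53.3} (approximately)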
[Figure: Mobile bearer network. 2G, 3G, and LTE base stations connect through RTN devices to an L3 gateway, which connects through interface GE1/0/0 to the IP backbone and the Internet.]
[Figure: Class-based HQoS hierarchy on GE1/0/0. For each base station (2G, 3G, or LTE), three FQs (video, voice, and other) feed one SQ; the SQs of the three base stations behind one RTN feed one GQ (for example, GQ: RTN1); the GQs are scheduled on port GE1/0/0.]
You can configure class-based HQoS to meet the preceding requirements. An interface serves two user groups (two RTNs), each user group has three users (three base stations), and each base station runs multiple services. The hierarchical architecture is configured as port -> RTN -> base station -> base station services, corresponding to the scheduling path port -> GQ -> SQ -> FQ.
l Profile-based HQoS
Traffic that enters through different interfaces can be scheduled in the same SQ.
Profile-based HQoS implements QoS scheduling management for access users by
defining various QoS profiles and applying the QoS profiles to interfaces. A QoS profile
is a set of QoS parameters (such as the queue bandwidth and flow queues) for a specific
user queue.
Profile-based HQoS supports upstream and downstream scheduling.
As shown in Figure 8-29, the router, as an edge device on an ISP network, accesses a local area network (LAN) through Eth-Trunk 1. The LAN houses 1000 users that have VoIP, IPTV, and common Internet services. Eth-Trunk 1.1000 accesses VoIP services; Eth-Trunk 1.2000 accesses IPTV services; Eth-Trunk 1.3000 accesses other services. The 802.1p value in the outer VLAN tag identifies the service type (802.1p value 5 for VoIP services and 802.1p value 4 for IPTV services). The VID in the inner VLAN tag, ranging from 1 to 1000, identifies the user. The outer VIDs of Eth-Trunk 1.1000 and Eth-Trunk 1.2000 are 1000 and 2000, respectively. It is required that the total bandwidth of each user be restricted to 120 Mbit/s, that the CIR be 100 Mbit/s, and that the bandwidth allocated to the VoIP and IPTV services of each user be 60 Mbit/s and 40 Mbit/s, respectively. Other services are not provided with any bandwidth guarantee.
[Figure 8-29: The router connects the LAN through Eth-Trunk 1 to the IP backbone and the Internet.]
You can configure profile-based HQoS to meet the preceding requirements. Only traffic
with the same inner VLAN ID enters the same SQ. Therefore, 1000 SQs are created.
Traffic with the same inner VLAN ID but different outer VLAN IDs enter different FQs
in the same SQ.
9 MPLS QoS
When the traffic rate exceeds the specification, requirements for services that are sensitive to QoS are not satisfied. Therefore, MPLS TE alone cannot provide the QoS guarantee.
Scheme 1: E-LSP
The EXP-Inferred-PSC LSP (E-LSP) scheme uses the 3-bit EXP value in an MPLS header to
determine the PHB of the packets. Figure 9-1 shows an MPLS header.
The EXP value can be copied from the DSCP or IP precedence in an IP packet or be set by
MPLS network carriers.
The label determines the forwarding path, and the EXP determines the PHB.
The E-LSP is applicable to networks that support no more than eight PHBs. The precedence field in an IP header also has three bits, the same length as the EXP field, so each precedence value in an IP header corresponds exactly to one EXP value in an MPLS header. The DSCP field in an IP header, however, has six bits, so multiple DSCP values map to a single EXP value. As the IETF standards define, the three left-most bits of the DSCP field (the CSCP value) correspond to the EXP value, regardless of what the three right-most bits are.
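The bit relationship can be stated in one line of code. A minimal Python sketch of the default DSCP-to-EXP mapping just described (the function name is illustrative):

def dscp_to_exp(dscp):
    # The three left-most DSCP bits (the CSCP) become the 3-bit EXP value;
    # the three right-most bits are ignored.
    return (dscp >> 3) & 0b111

print(dscp_to_exp(46))  # EF, 101110b  -> 5
print(dscp_to_exp(40))  # CS5, 101000b -> 5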
During traffic classification, the EXP value in an MPLS packet is mapped to a scheduling precedence and a drop precedence. Apart from traffic classification, QoS operations on an MPLS network, such as traffic shaping, traffic policing, and congestion avoidance, are implemented in the same manner as on an IP network.
[Figure: E-LSP. Packets on the same LSP enter different queues (for example, the BE and EF queues) according to their EXP values.]
When the MPLS packet is leaving the LSR, the scheduling precedence and drop precedence
are mapped back to the EXP value for further EXP-based operations on the network.
NOTE
For more details about the default mapping between the EXP value, service class, and color on NE40Es,
see 6.3.2 QoS Priority Mapping.
Scheme 2: L-LSP
The Label-Only-Inferred-PSC LSP (L-LSP) scheme uses labels to transmit PHB information.
The EXP field has only three bits, and therefore cannot be used alone to identify more than
eight PHBs. Instead, only the 20-bit label in an MPLS header can be used to identify more
than eight PHBs. The L-LSP is applicable to networks that support more than eight PHBs.
During packet forwarding, the label determines the forwarding path and scheduling behaviors
of the packets; the EXP carries the drop precedence. Therefore, the label and EXP both
determine the PHB. PHB information needs to be transmitted during LSP establishment. An L-LSP can carry a single-PHB flow, or a multi-PHB flow whose packets share the same scheduling behavior but have different drop precedences.
E-LSP: The EXP determines the PHB (including the drop precedence). Each LSP supports up to eight behavior aggregates (BAs).
L-LSP: The label and EXP determine the PHB. Each LSP supports only one BA.
[Figure: At the edge between IP and MPLS networks, the carrier decides whether to trust the IP DSCP (for example, DSCP = 40) or the MPLS EXP (for example, EXP = 2) of an IP over MPLS packet.]
Carriers need to determine whether to trust the CoS information in an IP or MPLS packet that is entering or leaving an MPLS network. Relevant standards define three modes for processing the CoS: Uniform, Pipe, and Short Pipe.
Uniform Mode
When carriers determine to trust the CoS value (IP precedence or DSCP) in a packet from an
IP network, the Uniform mode can be used. The MPLS ingress LSR copies the CoS value in
the packet to the EXP field in the MPLS outer header to ensure the same QoS on the MPLS
network. When the packet is leaving the MPLS network, the egress LSR copies the EXP
value back to the IP precedence or DSCP in the IP packet.
[Figure: Uniform mode. At the ingress node, the IP->MPLS push copies the CoS (IP DSCP = 40) to the EXP fields (EXP = 5). The EXP value is preserved across push and swap operations and at the penultimate node's pop. At the egress node, the MPLS->IP pop copies the EXP value back, leaving IP DSCP = 40.]
As its name implies, Uniform mode ensures the same priority of packets on the IP and MPLS
networks. Priority mapping is performed for packets when they are entering or leaving an
MPLS network. Uniform mode has disadvantages. If the EXP value in a packet changes on an
MPLS network, the PHB for the packet that is leaving the MPLS network changes
accordingly. In this case, the original CoS of the packet does not take effect.
[Figure: Uniform mode with EXP remarking. The EXP is changed from 5 to 6 inside the MPLS network; at the egress node, the new EXP (6) is copied back to the IP header, changing the DSCP from 40 to 48.]
Pipe Mode
When carriers determine not to trust the CoS value in a packet from an IP network, the Pipe
mode can be used. The MPLS ingress delivers a new EXP value to the MPLS outer header,
and the QoS guarantee is provided based on the newly-set EXP value from the MPLS ingress
to the egress. The CoS value is used only after the packet leaves the MPLS network.
[Figure: Pipe mode. The ingress node sets a new EXP value (EXP = 1) instead of copying the DSCP; the EXP is carried across the MPLS network, and the IP DSCP (46) remains unchanged end to end.]
In Pipe mode, the MPLS ingress does not copy the IP precedence or DSCP to the EXP field
for a packet that enters an MPLS network. Similarly, the egress does not copy the EXP value
to the IP precedence or DSCP for a packet that leaves an MPLS network. If the EXP value in
a packet changes on an MPLS network, the change takes effect only on the MPLS network.
When a packet leaves an MPLS network, the original CoS continues to take effect.
NOTE
In Pipe mode, the egress implements QoS scheduling for packets based on the CoS value defined by
carriers. The CoS value defined by carriers is relayed to the egress using the outer MPLS header.
Short Pipe Mode
[Figure: Short Pipe mode. The ingress node sets new EXP values (outer EXP = 1, inner EXP = 2) instead of copying the DSCP; the IP DSCP (40) remains unchanged end to end.]
In Pipe or Short Pipe mode, carriers can define a desired CoS value for QoS implementation
on the carriers' own network, without changing the original CoS value of packets.
The difference between Pipe mode and Short Pipe mode lies in the QoS marking for the
outgoing traffic from a PE to a CE. In Pipe mode, outgoing traffic is scheduled based on a
CoS value defined by carriers, whereas outgoing traffic uses the original CoS value in Short
Pipe mode, as shown in Figure 9-8.
[Figure 9-8: QoS marking direction. Flow direction: CE -> PE (IP network) -> MPLS network -> PE -> CE (IP network).]
The tunnel to which a PW is iterated may change after the PW bandwidth is configured. If the PW bandwidth does not meet requirements or SQ resources are insufficient, the PW may fail to be iterated to the tunnel and goes Down.
Implementing HQoS (L2VPN) at the Public Network Side Based on VPN + Peer PE
In an MPLS VPN, a bandwidth agreement may need to be reached between an operator's PE devices so that traffic between two PEs is restricted or guaranteed according to the agreement. To this end, HQoS at the public network side based on VPN + peer PE can be adopted.
As shown in Figure 9-9, bandwidth and class of service are specified for traffic between PEs
at the MPLS VPN side. For example, in VLL1, the specified bandwidth for traffic between
PE1 and PE2 is 30 Mbit/s, and higher priority services are given bandwidth ahead of lower
priority services.
NOTE
If, however, you need to implement bandwidth restriction rather than bandwidth guarantee at the network side, simply set the CIR to 0 and the PIR to the desired bandwidth.
Figure 9-9 Implementing HQoS at the public network side based on VLL + peer PE
[Figure 9-9: Flows 1 to 8 from CE1 are classified at PE1 and scheduled onto the port based on VLL + peer PE. VLL1 (PW1, PE1 to PE2 via P2) carries 30 Mbit/s toward CE2; VLL2 (PW2, PE1 to PE3 via P3) carries 20 Mbit/s toward CE3, CE4, and CE5.]
[Figure: Queue hierarchy. Service flows 1 to 8 enter flow queues 1 to 8; flow queues are mapped to user queues based on the PW (user queue 1 = VLL1 + PE1, user queue 2 = VLL2 + PE2); user queues are mapped to user group queues; user group queues are mapped to port queue 1 based on the outbound interface.]
If traffic is load-balanced among TE tunnels on peer PEs, all traffic that is load-balanced
undergoes priority scheduling and bandwidth restriction according to the traffic
scheduling procedure as shown in Figure 9-11.
NOTE
In this scenario, it is recommended that the TE tunnel that is configured with bandwidth resources be
adopted to achieve PE-to-PE bandwidth guarantee for traffic.
9.3.2 Application
End-to-End MPLS HQoS Solution
Figure 9-12 shows the procedure for implementing end-to-end MPLS HQoS.
[Figure 9-12: End-to-end MPLS HQoS. CE1 and CE3 connect to PE1; PW1 (VLL1) runs from PE1 through P2 to PE2, and PW2 (VLL2) runs from PE1 through P3 to PE3 toward CE4. Interface-based QoS attributes for incoming/outgoing packets are configured on the CE-side interfaces of the PEs.]
On the CE-side interfaces of PEs, interface-based QoS policies are configured to implement
QoS enforcement on packets that are received from CEs or sent to CEs.
On the ingress PE (PE1), QoS policies are configured based on VLL/VLL instance + peer PE for packets sent to the public network side. In addition, to deliver end-to-end QoS guarantees, a TE tunnel with allocated bandwidth can be used to carry VLL traffic. On the PEs, QPPB can be configured to propagate QoS policies, and the MPLS DiffServ model can be configured so that, for MPLS VPN services, the private network and the public network, both configured with the DiffServ QoS model, can interwork.
On the P nodes, QoS policies are enforced based on the interface or TE tunnel, without distinguishing between VLL and non-VLL services.
10 Multicast Virtual Scheduling
10.1 Introduction
10.2 Principles
10.3 Applications
10.1 Introduction
Definition
Multicast virtual scheduling is a traffic scheduling mechanism for subscribers who demand
multicast programs. After a subscriber joins a multicast group, if multicast traffic needs to be
copied based on the multicast VLAN or if the replication point is a downstream device, the
bandwidth for the unicast traffic of the subscriber is adjusted accordingly. As a result,
bandwidths for the unicast traffic and multicast traffic of the subscriber are adjusted in a
coordinated manner.
Purpose
Multicast virtual scheduling is a subscriber-level traffic scheduling mechanism. It adjusts the bandwidths
for the unicast traffic and multicast traffic of a subscriber in a coordinated manner without
changing the total bandwidth of the subscriber, thus ensuring the quality of BTV services of
the subscriber.
As shown in Figure 10-1, a family views multicast programs (multicast data) through a Set
Top Box (STB) and browses the Internet (unicast data) through a PC. For example, the
maximum bandwidth for the family is 3 Mbit/s. The Internet service occupies all the 3 Mbit/s
bandwidth, and then the user demands a multicast program requiring a bandwidth of 2 Mbit/s
through the STB.
As the multicast data and unicast data require the bandwidth of 5 Mbit/s in total, data
congestion will occur in the access network, and some packets will be discarded. Therefore,
the quality of the multicast program cannot be ensured.
The multicast virtual scheduling can solve the problem shown in Figure 10-1. The router is
configured with the multicast virtual scheduling feature. When the sum of the multicast traffic
and unicast traffic received by a user is greater than the bandwidth assigned to the user, the
router reduces the bandwidth for unicast traffic of the user to 1 Mbit/s to meet the requirement
of bandwidth for multicast traffic. Therefore, the multicast program can be played normally.
10.2 Principles
As shown in Figure 10-2, the maximum bandwidth for traffic from the DSLAM to the subscriber is 3 Mbit/s. Assume that the subscriber uses up the 3 Mbit/s of bandwidth for the unicast traffic service and then demands a multicast program that requires 2 Mbit/s of bandwidth. In this case, the total traffic required by the subscriber is 5 Mbit/s, much higher than the allowed 3 Mbit/s. As a result, the link between the DSLAM and the LAN switch is congested, and packets begin to be dropped. Because the DSLAM does not provide QoS treatment, packets are discarded at random; multicast packets are among those discarded, so the subscriber cannot receive quality service for the requested multicast program.
To ensure quality service for the requested multicast program, the BRAS needs to be configured to dynamically adjust the bandwidth for unicast traffic according to the bandwidth for multicast traffic. The DSLAM sends the subscriber's IGMP Report message through the subscriber's VLAN to the BRAS. After receiving the IGMP Report message, the BRAS reduces the bandwidth for the subscriber's unicast traffic to 1 Mbit/s, leaving the remaining 2 Mbit/s for the subscriber's multicast traffic. In this manner, quality service is ensured for the requested multicast program.
[Figure 10-2: Multicast virtual scheduling setup. An STB and an Internet user (PC) connect through a DSLAM and LAN switch to the device (BRAS) at interface1 (100.1.1.1/24); interface2 connects to the Internet; a RADIUS server attaches to the BRAS.]
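The adjustment itself is simple arithmetic. A minimal Python sketch of the behavior described above (names and signature are illustrative):

def adjust_unicast(total_bw, unicast_bw, multicast_bw):
    # If the requested multicast bandwidth no longer fits beside the
    # current unicast bandwidth, shrink the unicast share so that the
    # total stays within the subscriber's bandwidth.
    if unicast_bw + multicast_bw > total_bw:
        unicast_bw = max(total_bw - multicast_bw, 0)
    return unicast_bw

print(adjust_unicast(3, 3, 2))  # unicast reduced from 3 Mbit/s to 1 Mbit/s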
10.3 Applications
When the replication point of multicast traffic is not on the BRAS, multicast virtual
scheduling can be applied in the following two typical scenarios.
[Figure: Typical scenario. The STB and Internet user connect through the DSLAM and LAN switch to the device (interface1, 100.1.1.1/24), whose interface2 connects to the Internet.]
bandwidth of the subscriber. Then, Device B forwards the requested multicast traffic through
the multicast VLAN, and the downstream device copies the multicast traffic to the subscriber.
In addition to forwarding multicast data to the subscriber, Device B also forwards the
multicast data to Device A. Device A measures the received multicast data, and implements
multicast virtual scheduling based on the measurement result.
11 L2TP QoS
Purpose
In the L2TP service wholesale scenario, an L2TP Access Concentrator (LAC) is responsible for service wholesale, whereas an L2TP Network Server (LNS) is the service control point. An L2TP tunnel runs between the LAC and the LNS. Traffic transmitted over the L2TP tunnel needs to be controlled in a refined manner on the LNS, to minimize the impact on service quality caused by uncoordinated resource competition between the LAC and the LNS. In addition, an ISP can control the traffic that enters different service tunnels on the ISP network, preventing bursts from different users from affecting one another.
L2TP HQoS provides QoS scheduling for the traffic of LNS-connected users so as to control traffic in the L2TP tunnel in a refined manner.
11.2 Principles
11.2.1 Principles
Figure 11-1 L2TP networking diagram
[Remote users and a remote branch connect over PSTN/ISDN to the NAS (acting as the LAC); an L2TP tunnel runs from the NAS to the LNS, which connects to the Internet-side server.]
As shown in Figure 11-1, users need to log in to a private network through a Layer 2
network.
The LAC is a Layer 2 network device that can process PPP packets and support L2TP
functions. Usually it is an access device on the local Internet Service Provider (ISP) network.
The LAC is deployed between an LNS and a remote system (a remote user or a remote
branch).
The LAC performs traffic rate limiting, flow queue mapping, and traffic scheduling.
The LNS is the receiving end of a PPP session. Users authenticated by the LNS can log in to
the private network to access resources.
The LNS can also perform traffic rate limiting, flow queue mapping, and traffic scheduling.
The LNS supports the following QoS scheduling modes:
l Tunnel-specific scheduling
In this mode, services of each user are not differentiated, and therefore there is no need
to allocate a Subscriber Queue (SQ) for each user. Instead, an L2TP tunnel is allocated
an SQ.
l Session-specific scheduling
In this mode, each user is allocated an SQ and an L2TP tunnel is allocated a Group
Queue (GQ). Each user has one to eight Priority Queues (PQs) on which Strict Priority
(SP) scheduling or Weighted Fair Queue (WFQ) scheduling can be performed.
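The difference between the two modes is essentially how queues are allocated. A minimal Python sketch under stated assumptions (the identifiers are illustrative, not LNS internals):

def allocate_queues(mode, tunnel_id, session_ids):
    if mode == "tunnel":
        # Tunnel-specific: all sessions in the tunnel share one SQ.
        return {"SQ": [tunnel_id]}
    # Session-specific: one SQ per session, one GQ for the whole tunnel;
    # each session's PQs are scheduled by SP or WFQ inside its SQ.
    return {"GQ": [tunnel_id],
            "SQ": [(tunnel_id, s) for s in session_ids]}

print(allocate_queues("session", "tunnel-1", ["user-a", "user-b"]))
# {'GQ': ['tunnel-1'], 'SQ': [('tunnel-1', 'user-a'), ('tunnel-1', 'user-b')]}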
3G 3rd Generation
BA Behavior Aggregation
BC Bandwidth Control
BE Best-Effort
CE Customer Edge
CQ Class Queue
CR Core Router
CS Class Selector
CT Class Type
DF Don't Fragment
DS Differentiated Service
EF Expedited Forwarding
FL Flow Label
FQ Flow Queue
FR Frame Relay
GQ Group Queue
HG Home Gateway
IP Internet Protocol
IPinIP IP in IP Encapsulation
LR Line Rate
MF Multiple Field
MP Merge Point
PC Personal Computer
PE Provider Edge
PQ Priority Queuing
RR Round Robin
SP Strict Priority
SQ Subscriber Queue
SR Service Router
TB Target Blade
TC Traffic Class
TE Traffic Engineering
TLV Type-Length-Value
TM Traffic Manager
TP Traffic Policing
VC Virtual Circuit
VE Virtual Ethernet
VI Virtual Interface
VP Virtual Path