NORTEL, NORTEL NETWORKS, NORTEL NETWORKS LOGO, the GLOBEMARK, BAYSTACK, CALLPILOT, CONTIVITY,
DMS, MERIDIAN, MERIDIAN 1, NORSTAR, OPTERA, OPTIVITY, PASSPORT, SUCCESSION and SYMPOSIUM are trademarks
of Nortel Networks.
ALTEON is a trademark of Alteon WebSystems, Inc.
ARIN is a trademark of American Registry of Internet Numbers, Ltd.
APACHE is a trademark of Apache Micro Peripherals, Inc.
APPLE, APPLETALK, MAC OS and QUICKTIME are trademarks of Apple Computer Inc.
CAPEX is a trademark of Solyman Ashrafi
CABLELABS, DOCSIS and PACKETCABLE are trademarks of Cable Television Laboratories, Inc.
C7 and CALIX are trademarks of Calix Networks Inc.
CANAL + is a trademark of Canal + Corporation
KEYMILE is a trademark of Datentechnik Aktiengesellschaft
MPEGABLE is a trademark of Dicas Digital Image Coding GmbH
CINEPAK is a trademark of Digital Origin, Inc.
ECI is a trademark of ECI Telecom Limited
DIGICIPHER and GENERAL INSTRUMENT are trademarks of General Instrument Corporation
OPEX is a trademark of Gensym Corporation
INFOTECH is a trademark of Infotech, Inc.
ESCON and LOTUS NOTES are trademarks of International Business Machines Corporation (dba IBM Corporation).
IANA and ICANN are trademarks of Internet Corporation for Assigned Names and Numbers.
NAGRA and NAGRAVISION are trademarks of Kudelski A.B.
INDEO is a trademark of Ligos Corporation
ENHYDRA is a trademark of Lutris Technologies, Inc.
SIP is a trademark of Merrimac Industries, Inc.
FORE SYSTEMS is a trademark of Marconi Communications, Inc.
ACTIVEX, NETMEETING, MICROSOFT WINDOWS, OUTLOOK, WINDOWS, and WINDOWS MEDIA are trademarks of Microsoft
Corporation
NETIQ is a trademark of NetIQ Corporation
TIMBUKTU is a trademark of Netopia, Inc.
OPNET is a trademark of OPNET Technologies, Inc.
ECAD is a trademark of Pentek, Inc.
REALAUDIO, REALNETWORKS, REALPLAYER, REALPROXY, and REALVIDEO are trademarks of RealNetworks, Inc.
PESQ is a trademark of Psytechnics Limited
POWERTV and SCIENTIFIC ATLANTA are trademarks of Scientific-Atlanta, Inc.
SILKROAD is a trademark of SilkRoad Technology, Inc.
SPRINT is a trademark of Sprint Communications Company L.P.
CDMA2000 is a trademark of Telecommunications Industry Association
NETBSD is a trademark of The NetBSD Foundation
THE YANKEE GROUP is a trademark of The Yankee Group
VERIZON is a trademark of Verizon Trademark Services LLC
BSD is a trademark of Wind River Systems, Inc.
ZENITH is a trademark of Zenith Electronics Corporation
Trademarks are acknowledged with an asterisk (*) at their first appearance in the document.
Contents
Author Biographies .................................................................................. ix
Acknowledgments ................................................................................................ xiv
Video .....................................................................................................................67
Video Impairments ................................................................................................68
Digital video impairments ......................................................................................68
Causes of video signal impairments .....................................................................69
Digital video ..........................................................................................................70
Sequences of frames ............................................................................................74
What you should have learned ..............................................................................79
Glossary .................................................................................................723
Author Biographies
Dave Anderson is a Senior Manager of Nortel Networks Wireless
Engineering and is responsible for the engineering aspects of Nortel’s
responses to global Wireless proposals and network designs. Dave has
been with Nortel since earning his B.S.E.E. in 1986 and has held a number
of Engineering positions within the company, including customer support
engineering roles for DMS switching and, more recently,
Wireless Network Engineering including aspects of Radio as well as core
network. He is familiar with the evolving global standards for wireless
systems including CDMA, EV-DO, GSM, UMTS and Wireless LAN.
Elwyn Davies is currently leading the CTO Office team setting the
strategy for the introduction of IPv6 into Nortel products. His background
includes an M.A. in Mathematics and research into aspects of Noise
Reduction in electronic and other systems. He is a regular attendee at and
contributor to the IETF in a number of areas, including network layer
signaling, and to the IRTF in routing research.
Stéphane Duval is a Product Line Manager for OME Data. He has twelve
years of experience with data infrastructure solutions design for private and
public sector organizations and extensive customer interaction, which
helped him develop his skills to deliver reliable and secure data
infrastructures.
Acknowledgments
The writing, editing, and assembly of a large textbook is a formidable task.
We are fortunate that the corporate culture at Nortel encourages
collaboration and teamwork. Many people contributed to the successful
completion of this project, whether by direct contributions to the text, by
supporting the steering committee or individual authors, by removing
obstacles, or by championing this project to the senior executives. Thank
you to everyone who helped us move forward.
A big thank you to our many reviewers. Whether they read one chapter or
many, whether they focused on technical accuracy or clarity and
readability, their feedback and suggestions have greatly improved the
quality of the published version.
Shardul Joshi, Leigh Thorpe, Steve Dudley, and Tim Mendonca, Editors
Steering Committee:
Lorelea Moore – Certification
Leigh Thorpe – Editor, CTO's Office
Shardul Joshi – Editor, Wireline Engineering
Stephen Dudley – Editor, Wireline Engineering
Tim Mendonca – Editor, Enterprise Engineering
Carelyn Monroe – Wireline Engineering
Joe King – Wireline Engineering
Michelle Bigham – Marketing communications
Ann-Marie Bishop – Project Manager
Contributors:
The following made significant contributions to the contents of this book:
Mark Armstrong
Benedict Bauer
Roger Britt
Peter Bui
James Chanco
Paul Coverdale
Steve Elliott
Matt Michels
Mustapha Moussa
Tom Taylor
Andrew Timms
Chapter 1
Introduction
Joseph King
You have an emergency. You know you can dial three digits from any
phone, anywhere, at any time, and within milliseconds you have help. Most
people don't understand how it happens, and frankly they don't care. They
just count on it to work. The network design engineer both knows and cares
how it works. Would you have the same confidence dialing that number if it
was being routed across an IP core today? Not if that IP network is the
Internet or one of about 85% of the IP networks out there today.
Consumers and engineers alike have heard the technology hype:
convergence, VoIP, triple play, interactive applications. Voice, data, and
video networks are finally becoming one. Why? Because consumers
demand it, want it and need it. It is becoming a way of life. Technology is a
driving force for the way people communicate. The paradigm has shifted.
Consumers are driving this change in technology to support the way they
want to work and live.
Convergence is occurring between real-time and non–real-time data
networks. Voice over IP is being deployed on networks that were originally
designed with a router architecture and best-effort delivery philosophy.
Many of these networks are not capable of meeting the performance quality
requirements of real-time services such as voice. Voice services are critical.
As convergence proceeds and networks begin to carry voice and other real-
time services, these networks must adapt to the mission critical nature of
those services. The people who design and operate these networks must
meet a new set of constraints. Best-effort cannot guarantee the performance
of mission critical real-time applications. Throwing more bandwidth at the
problem is not sufficient. What is needed is proper network planning and
design, which in turn requires a thorough understanding of the operation
and constraints of real-time networking and how that interacts with the
operation and constraints of IP networking.
Because convergence of real-time applications with data networking is new
ground for so many people, a group of Nortel subject matter experts has
created a real-time networking manual to serve as a shared foundation for
engineers and other professionals from various areas of the industry. As part
of this effort, Nortel has also developed a certification that is focused
purely on real-time networking: Nortel Certified Technology Specialist
(NCTS)—Real-Time Networks. It is a baseline certification in real-time
networking, intended to be as applicable to the managers of engineers as it
is to the engineers themselves.
The first step will be to introduce the concepts of convergence and real
time. Network, service, and application convergence are discussed.
Examples of real-time services are presented, and the constraints around
operating real-time services in a packet environment are discussed. The
concept of Quality of Experience is introduced as the fundamental
performance requirement for all services and applications.
To design a real-time network and assure your customers excellent Quality
of Experience, you need to understand the concepts of real-time
applications. As discussed earlier, real-time challenges network
performance. What are the performance requirements for the major real-
time applications and services, and what are the protocols and mechanisms
we use to control network behavior?
Most will agree that if the video freezes for a few seconds while you are
watching the news but the audio continues, the interruption is a mere annoyance.
However, if the reverse occurs, and the audio is lost for seconds while the
video continues, your comprehension of the news will be severely
degraded. For other content, such as sporting events, loss of audio may be
tolerable, while video loss is not. That said, interactive voice is one of the
most demanding communications services. Consequently, a significant
portion of Section I is focused on the quality of voice services and voice
codecs.
For conversational voice services on a converged network, there are many
contributing factors to the final quality. You, as a convergence engineer,
need to understand the contributions of various parameters such as delay,
packet loss, and echo. Network planning for voice is essential, and tools
like the E-Model and its associated quality metric R are invaluable in
designing and provisioning a network. Other metrics such as MOS are also
used to quantify voice performance.
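Planning tools based on the E-Model can be sketched in a few lines. The following simplified calculation uses the delay and loss impairment approximations published with ITU-T G.107 and G.113 (basic rating 93.2, G.711 loss robustness Bpl = 4.3) and the standard R-to-MOS conversion; it is an illustration of how the pieces combine, not a planning-grade implementation:

```python
def delay_impairment(d_ms: float) -> float:
    """Simplified Id term: delay begins to hurt noticeably past ~177 ms."""
    idd = 0.024 * d_ms
    if d_ms > 177.3:
        idd += 0.11 * (d_ms - 177.3)
    return idd

def loss_impairment(ppl: float, ie: float = 0.0, bpl: float = 4.3) -> float:
    """Effective equipment impairment Ie-eff for random packet loss ppl (percent)."""
    return ie + (95.0 - ie) * ppl / (ppl + bpl)

def e_model_r(d_ms: float, ppl: float, ie: float = 0.0, bpl: float = 4.3) -> float:
    """Transmission rating R, starting from the default basic rating of 93.2."""
    return 93.2 - delay_impairment(d_ms) - loss_impairment(ppl, ie, bpl)

def r_to_mos(r: float) -> float:
    """Standard mapping from the rating R to an estimated Mean Opinion Score."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6

# Example: G.711 (Ie = 0) with 150 ms one-way delay and 1% packet loss
r = e_model_r(d_ms=150, ppl=1.0)
print(round(r, 1), round(r_to_mos(r), 2))
```

At 150 ms of one-way delay and 1% random loss, G.711 comes out near R = 72 and a MOS of about 3.7, showing how quickly modest network impairments consume the quality budget.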
As with voice, there are aspects of video signals that need to be understood.
Understanding the concepts of video is critical to a convergence engineer.
Impairments such as noise (luminance and chrominance), loss of
synchronization signals, co-channel interference, and RF interference are
all critical factors for video.
Real-time applications are often concerned with the transmission of signals
originating in analog mode. Sound signals because of their wave nature
necessarily begin as analog signals. An NTSC (ordinary TV) video signal
captures the information needed to reconstruct the visual display as an
analog stream. To be transported across a digital network, analog signals
must be converted to digital information by means of a codec. We discuss
the basic characteristics of codecs used for telephony (speech), those for
general audio signals, and codecs used for video signals. Also covered are
the parameters that underlie the performance of the codec from both a
human and technical perspective, common coding standards defined for
each of these areas, and the boundary conditions for the effective use of
compression codecs in real-time communications systems.
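As a concrete illustration of encoding for telephony, the sketch below implements μ-law companding, the logarithmic curve behind North American G.711. Real G.711 uses a segmented 8-bit approximation of this curve, so treat this as the principle rather than a bit-exact codec:

```python
import math

MU = 255  # companding parameter used by North American G.711

def mulaw_compress(x: float) -> float:
    """Compress a normalized sample in [-1, 1] onto a logarithmic scale."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y: float) -> float:
    """Invert the companding curve."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize_8bit(y: float) -> float:
    """Uniformly quantize the companded value to 8 bits (256 levels)."""
    return round(y * 127) / 127

# Quiet and loud samples survive 8-bit quantization with similar
# relative error, thanks to the logarithmic compression step:
for x in (0.01, 0.5):
    xq = mulaw_expand(quantize_8bit(mulaw_compress(x)))
    print(x, round(abs(x - xq) / x, 4))
```

The logarithmic curve spends more of the 256 quantizer levels on quiet samples, which is why the relative error stays roughly constant across signal levels instead of swamping soft speech.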
From the other side of the street, Service Providers need to understand
what their Enterprise customers are working with if they are going to serve
them well. Frame relay continues to be used extensively in Enterprise
networks. Packets crossing the Enterprise boundary encounter NAT. How
are these things going to affect your SLA and the final user quality? Both
Service Provider and Enterprise networks today are quite complex.
Designing a real-time network to work across the combined domain is
doubly complex.
IP was designed to be a simple protocol. A few entries in a routing table,
connect your cables, and you're up and running. It's a great concept, but the
complexities of convergence will not allow us to maintain that simplicity.
In Section IV, access, WAN, and core technologies are discussed. What are
the drivers to move to an MPLS network? What QoS mechanisms are
available in ATM and how are they invoked? What are the important things
to know about ATM, frame relay, MPLS, SONET, and Optical Ethernet
with respect to real-time networking? A convergence engineer needs to
understand these network technologies, to be able to comprehend the
concepts, and to understand their influence on real-time operation.
Convergence is happening in many places. It's already happened at Layer 1.
We are now seeing convergence at Layers 2 and 3, and VoIP is just one of
the driving factors. The characteristics of the LAN, the WAN, and of
course, the access network will come into play in the determination of the
final network performance.
Survivability can be designed in at the hardware level, but how can you
carry this through to the logical layer? You need to know, or suffer the
consequences.
Chapters 17 and 18 cover survivability at the network level. Together, let's
explore the concepts of Network Reconvergence and MPLS Recovery. In
other words, how can you build in survivability at the logical layer? This is
a key factor in successful network convergence.
The previous sections have discussed the relationship of applications,
protocols, and technologies to real-time networking. Once you get to this
point, you will have been introduced to some basic real-time services, how
packet transport affects them, and some techniques for controlling and
enhancing the performance of the packet network to meet the demands of
these real-time services. You will understand the concepts of core network
technologies and how they need to work on your existing LAN.
In Chapters 19 and 20, the concepts are all brought together. Now is the
time for the Converged Network Engineer to shine. Managing the
complexity of the converged network takes planning. This section of the
book helps consolidate the concepts you've learned, and shows how
network planning can polish real time over IP to a brilliant shine. These
chapters consider network planning for real-time voice and data, and how
to translate Quality of Service settings from one network technology to
another. In these chapters, potential issues related to real-time networking
will be described along with mitigations and best-practice engineering
guidelines.
The Nortel solution can help both Enterprise and Carrier customers move to the
converged network of tomorrow.
Many readers will come to this book with strong expertise in one or more
of the areas covered, but may have little or no familiarity with other areas.
While we assume that readers will have basic knowledge of data
networking and TCP/IP, we cover a range of topics associated with
convergence and real time. The reader can pick and choose sections and
chapters, and does not necessarily need to read the various parts in order.
If you're not planning to take the certification, it is our hope that you can
use this as a guide as you embark on the journey to convergence. We hope
that everyone will be able to get something out of this book, regardless of
their background or the environment they work in today.
The arrows used on the diagram are not meant to indicate that a process is
strictly one-way, but to illustrate the perspective of application-level data or
decision-making looking into the network. In general, the arrows point
down to indicate that the real-time application looks towards the Wide Area
Network through these layers. In many cases, there are different ways that
a real-time application could reach the Wide Area Network. Some
technologies, such as Cable and xDSL, have special Layer 2 relationships
between the Local Area Network and the Wide Area Network. Not shown
on the diagram are the encapsulation mechanisms used by these types of
applications to bridge Local Area Network level traffic through Cable or
xDSL transport and back into the core IP/ATM networks.
The diagram highlights some of the different aspects of real-time issues
that are addressed in the book, including transport protocols, session
control protocols, Quality of Service (QoS) protocols, and reliability-
related protocols. The latter have been included because real-time
applications can significantly increase the requirements for network
reliability. The braces on the right side illustrate where in the transport path
the QoS and reliability features are applied.
Conclusion
It is no longer enough to be solely a data or voice engineer. The networks
of today carry essential applications. These applications are not just data
anymore. It is about a real-time interactive world. To support convergence,
the underlying network must support real-time applications and services
with delays of less than 250 ms. Convergence demands “One Network” that
brings together all the threads into one fabric. It is no longer about pieces of
knowledge; convergence is all about how to weave all the pieces together. It
is all about building your engineering toolkit. The NCTS – Real-Time
Networks certification is part of that kit.
The Nortel Certified Technology Specialist (NCTS) – Real-Time Networks
is just the first step toward becoming a Convergence Engineer. This
certification was created not only to assist you, the IP-certified engineer,
but also to help us at Nortel. Convergence is part of our culture and our
everyday life. Nortel has built real-time capability into our converged
networks. There are advantages to the converged and real-time world, and
you, as a certified engineer, will be ready to embrace them.
I know you will find value in this study guide as you get ready for your
certification. The subject matter experts who created this book hope you
enjoy reading this guide as much as we enjoyed creating it.
Thank you and best of luck in building your tool kit.
Section I:
Real-Time Applications and Services
Let's begin by looking at the applications that run on real-time networks.
Section I examines the characteristics of real-time applications and the
issues that arise when we run them over IP. This section looks at the
applications as the user sees them, and the implications of IP transport
performance on the quality the user experiences. These chapters will give
you a detailed view of how network design and implementation decisions
can affect the user's experience.
Chapter 2, The Real-Time Paradigm Shift, defines what we mean by
Real-Time networking. Real time is defined, along with convergence in
general and some specific types of convergence. Familiar applications are
sorted into real-time and non–real-time categories, and some potentially
distinguishing features of real-time packet traffic are described. Quality of
Experience (QoE) is introduced, and its relationship to network and
application performance is discussed in detail. The chapter concludes with
a discussion of the difference between QoE and QoS (Quality of Service).
Chapters 3 and 4 take a look at issues around quality in two popular
applications, voice and video. Voice Telephony continues to be the “Killer
Application” of telecommunications. Examining voice impairments
provides a good reference point for us to understand both the implications
of IP network behavior on real-time applications, and how to interface IP
networks with the existing TDM network. Chapter 4 looks at video and the
impairments to the image that can result from IP transport.
Chapter 5, Codecs for Voice and Other Applications, introduces
digitization and encoding, which make it possible to put analog signals
over digital networks. The discussion here provides some background on
(1) voice and video analog signals and how characteristics of those signals
translate into digital mode, (2) codecs that are used to remove redundancy
to reduce the amount of data needed to carry the signal information, and (3)
how various errors and disturbances of the compressed digital signal affect
the reconstituted analog output. Common coding standards for telephony,
audio streaming, and video streaming and conferencing are summarized,
and guidelines for selecting a codec for VoIP are provided.
Chapter 2
The Real-Time Paradigm Shift
Concepts Covered
Telecommunications convergence
Types of convergence: network convergence, service convergence,
and application convergence.
Convergence removes constraints for users but adds constraints
for network operators
Real-time telecommunications
How to separate real-time and non–real-time applications and
services
Service quality and performance
Quality of Experience (QoE)
Measuring QoE
Quality of Service (QoS)
Introduction
For more than a decade, telecom scholars have been publicizing the
advantages of convergence to multiservice networks; anticipated benefits
range from cost savings from operating a single network infrastructure to
productivity gains and/or new revenue from advanced services. Depending
on who you talk to, next generation networks are expected to reduce capital
expenditures, reduce operating expenses, increase revenues, decrease user
cost of telecom services, improve quality, reduce quality, increase the
reliability and survivability of the network, increase competition among
carriers, and reduce churn in the customer base. Only time will tell which
of these predictions are correct, but one thing is certain: meeting user
requirements and expectations on converged packet networks is an
enormous challenge. The crucial component of this challenge is making
real-time services operate over a packet infrastructure.
This chapter introduces the concepts of convergence, real-time operation,
and Quality of Experience (QoE). No matter what network they run on,
real-time services like voice telephony1 and video conferencing require
careful engineering to deliver acceptable performance. Convergence of
applications and services means that real-time and non–real-time functions
must share a common network environment and/or run side-by-side within
the same application. Mixed traffic types from services and applications
with differing requirements can end up bumping heads as the traffic moves
across the network.
Successful deployment of converged networks depends on careful planning
and engineering. The building blocks of this success include an
appreciation of the characteristics and constraints of the services and
applications the network will support, as well as a solid understanding of
the network environment. A solid understanding includes knowledge of the
transport technologies, the protocols, the available choices for
implementation, and issues around interconnection with other networks.
The details of network implementation will affect network performance
and resiliency. The balance of this book reviews these building blocks, the
choices and options available around network architecture, deployment of
services on IP infrastructure, and design guidelines for achieving the
performance and reliability users and network operators need from real-
time services.
The criteria for successful performance are based on user QoE. It doesn’t
matter how smoothly packets move through your network if the users find
that services and applications don’t meet expectations. Planning must
address the factors that underlie QoE for each service that runs on the
network, as well as any interactions or inconsistencies between them.
This book introduces real-time networking and many of the real-time
concepts.
What is convergence?
Narrowly defined, the term convergence refers to the merging of traffic
from two or more separate networks onto a single network. At present, we
are witnessing the convergence of traditional voice traffic (consisting
mostly of standard voice telephony) and LAN-based data traffic (consisting
mostly of computer communications such as e-mail and file transfer) onto a
common packet-based infrastructure. More broadly, convergence is used to
describe the fusion of function across all aspects of communications. Three
main kinds of convergence have been defined:
Network convergence–Combining network traffic from different
services (for example, voice, video, data) on one infrastructure
Service convergence–Combining previously distinct services (for
example, wireline and wireless voice; wireless voice and short
message service) into a single service
Application convergence–Merging of previously distinct
applications into a coordinated suite (for example, multimedia,
collaboration, and the integrated desktop)
Although convergence has recently gained prominence and notoriety, it is
not a recent phenomenon. The telecommunications industry recognized
over thirty years ago that the hierarchical network architecture of the voice
network could not continue to grow indefinitely. The introduction of digital
networking in the 1970s made it possible for the network to carry non-
voice data along with digitized voice. This trend continued in the 1980s
with frame relay and ATM. On the data side, private and then public
multiprotocol best-effort networks followed, and forerunners to the IP
protocols emerged. On the voice side, FAX and voice-band data services
ran on a common infrastructure with traditional voice. More recently,
convergence continues with the introduction of Storage Area Networking
(SAN) and IP telephony; these bring with them the more stringent
requirements of real-time operation and business-critical reliability.
Network convergence
Network convergence brings all types of traffic (voice, audio, video, and
data; bearer and signaling) onto a single network infrastructure.
Such convergence may occur at the level of the transfer protocol (for
example, IP), the data link protocol (for example, Ethernet), and/or the
physical medium (for example, optical fiber).
The overwhelming presence of the Internet has led to the choice of IP as
the converged transmission environment for both Service Providers and
Enterprises. The advantages of this choice include the economics of a
single network platform as well as seamless connectivity with existing
Internet infrastructure. The IP protocol suite now includes higher level
protocols for all forms of data applications, for audio and video streaming,
and for real-time applications such as telephony and conferencing.
Service convergence
Convergence enriches telecommunications by bringing together familiar
features and services that traditionally operate on different systems. For
example, a user’s wired phone and cell phone can share a single number.
Paging, voice messaging, instant messaging, e-mail, and FAX can be
managed by a single agent device. User mobility can be enhanced by
wireless LAN and follow-me services; various media (voice, pager, PDA,
e-mail) converge onto a single user device. At the same time, the total cost
of ownership of the supporting network may be reduced.
Service convergence brings with it new client devices, communications
servers, and media gateways. It may be realized in a fully distributed
system, running on top of an IP network (service provider, large
Enterprise), or through an integrated office-in-a-box (small Enterprise). It
may be realized as an evolution from an existing installed base, as a stand-
alone system, or as a managed or hosted solution. Users want service
convergence without compromising their familiar telephone operation.
They want the same features/functionality, voice quality, security, and
reliability they are getting from their current individual services. Services
converged on a carefully engineered packet infrastructure can deliver this,
and more.
Service convergence enables a highly mobile and distributed work force.
You can use any IP desktop phone: register, and your desktop is where you
are. You can work at home or in a wireless LAN hot spot while you run a
Session Initiation Protocol (SIP) client on your laptop, have your phone
number and telephony features with you, and make secure calls over the
Internet. Or you can have system-wide roaming for your IP wireless
telephone or telephony-enabled PDA. Service convergence will ultimately
allow voice and data roaming across the WAN, bringing down the
boundaries between Enterprise wireless LAN systems and public wireless
services.
Application convergence
The full potential of IP multimedia networking will bring significant
changes in how people communicate and collaborate. Application
convergence can do for person-to-person communications what browsers,
HTML, and Domain Name System (DNS) have done for information
access and transaction services. It will put the end-user back in control of
the communications space, enhance how users collaborate with colleagues,
and enrich how Enterprises communicate with their customers.
Application convergence is realized through the development of
anticipatory, media-adaptive, and time-sensitive applications. Employee-
facing converged applications allow Enterprises to create distributed teams
to address business opportunities and challenges more effectively and
dynamically. Customer-facing converged applications serve to strengthen
Enterprise/customer relationships and leverage investments in contact
centers and self-service applications with integrated databases and back-
office systems. Converged applications will form one facet of the new
revenue-generating services that Service Providers are anticipating.
Real-time processes
You may be familiar with the term real time with respect to computing
operations. There, real time is used to describe processes that take more or
less continuous input and run fast enough to keep up with the rate at which
new input arrives. If the process does not run fast enough, the input backs
up. The execution time of the process is not critical, but the rate at which
new input arrives determines the minimum throughput rate. Take the
digitization and compression of an analog video signal as an example: the
codec must be able to accept and process video frames at the given rate of
the analog input. Variability in input rate can be buffered out, but the
process needs to keep abreast of the average rate to perform as a real-time
computing process.
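The distinction between average throughput and per-frame timing can be seen in a toy simulation. In the sketch below, the 30 frame-per-second input rate and the per-frame processing times are invented for illustration; the simulation tracks how much work is still queued as frames arrive at a fixed rate:

```python
def backlog_after(arrival_interval_ms, proc_times_ms):
    """Work (in ms) still queued when the final frame arrives, given one
    new frame every arrival_interval_ms and the listed per-frame times."""
    finish = 0.0  # time at which the codec next becomes free
    for i, proc in enumerate(proc_times_ms):
        arrive = i * arrival_interval_ms
        finish = max(finish, arrive) + proc
    last_arrival = (len(proc_times_ms) - 1) * arrival_interval_ms
    return max(0.0, finish - last_arrival)

interval = 1000 / 30  # ~33.3 ms between frames at 30 frames per second

# Mean processing time (30.4 ms) below the frame interval: the buffer
# absorbs the per-frame jitter and the backlog stays bounded.
keeping_up = [20, 45, 25, 40, 22] * 100

# Mean processing time (41.2 ms) above the frame interval: input backs
# up without limit, so this is not a real-time process.
falling_behind = [40, 45, 38, 42, 41] * 100

print(backlog_after(interval, keeping_up))
print(backlog_after(interval, falling_behind))
```

Individual frames may take longer than one frame interval without harm; what breaks real-time operation is an average processing time that exceeds the input rate.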
Real-time networking
As a real-time process, real-time networking requires a minimum
throughput rate defined by the operation of the application. In networking,
we refer to the throughput capacity of a transport path as the bandwidth2. In
contrast to real-time computing processes, however, the execution time is
also a key factor. The “execution time” of a real-time networking service or
application is the end-to-end (one-way) delay. This delay is made up of
both processing (for example, time needed to parse input, execution time
for computations, and other operations), and transport delays (mostly
queuing, buffering, and propagation time).
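One practical way to work with that end-to-end delay is as an explicit budget over its components. The component values below are illustrative assumptions for a single VoIP call over a long continental path; the 150 ms comparison point is the one-way delay that ITU-T G.114 recommends for highly interactive voice:

```python
# Hypothetical one-way delay budget for a VoIP call (all values in ms)
budget = {
    "encoding (20 ms frame + lookahead)": 25.0,
    "packetization and serialization":     2.0,
    "access and core queuing":            15.0,
    "propagation (~5000 km of fiber)":    25.0,
    "jitter buffer at the receiver":      40.0,
    "decoding and playout":                5.0,
}

total = sum(budget.values())
for name, ms in budget.items():
    print(f"{name:38s} {ms:6.1f} ms")
print(f"{'total one-way delay':38s} {total:6.1f} ms (target: 150 ms)")
```

With these assumptions the budget closes at 112 ms, leaving some headroom, but notice how much of it is consumed before the packet ever reaches the wide area network.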
Among other functions, networking processes mediate information transfer
between the endpoints. As well as bandwidth, sufficient to keep up with the
2. The term bandwidth is derived from the relationship between the frequency bandwidth of an analog
carrier and the maximum rate at which the carrier can be modulated to signal one bit of information.
The broader the bandwidth, the faster the maximum modulation rate, and so the more bits can be sent
per unit time.
3. As usual, in reality, things are more complex. Increasing bandwidth in the network does have some effect
on the total delay. First, there may be alternatives available at higher bandwidth (such as higher-rate
codecs that can reduce processing time). Second, where congestion is occurring, increasing the total
bandwidth available can reduce queuing in the network, which in turn decreases the delay.
[Figure 2-1 appears here: a chart mapping applications onto axes of one-way
delay (100 ms to 100 sec) and packet loss (0% to 10%), grouped into four
delay classes: Interactive (conversational voice and video; command/control
such as Telnet and interactive games), Responsive (voice and video
messaging; transactions such as e-commerce, web browsing, e-mail access),
Timely (streaming audio/video; fax), and Non-critical (background traffic
such as Usenet, e-mail delivery, paging, downloads).]
Figure 2-1: Sensitivity of applications to delay and loss of data (from Rec.
G.1010, End User Multimedia QoS; figure reproduced with the kind
permission of ITU)
Aside from delay and bandwidth requirements, there are differences
between real-time and non–real-time flows. Real-time applications often
use small, more regularly generated packets, and for many, the flows may
last minutes or even hours. Simultaneous two-way traffic is common. Voice
traffic, for example, consists of small packets carrying the speech signal
that are generated on a regular schedule. Voice packet generation is
predictable either deterministically (where both speech and silence are
sent), or statistically, according to the distribution of conversational
utterances (where silence suppression is used). Video conferencing
services will have larger packets, but these are still generated regularly,
compared to the short, bursty traffic associated with data communications
such as file transfer or e-mail delivery. The characteristics of interactive
command-and-control games show bursty traffic of small packets containing
the joystick movements or mouse clicks associated with rapid-fire play.
The packet generation statistics depend on the characteristics of the
particular game.
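The regular packet schedule for voice can be quantified directly: a constant-bit-rate codec with a fixed packetization interval produces a fixed payload size at a fixed packet rate. For example, G.711 at 64 kb/s with 20-ms packetization yields 160-byte payloads at 50 packets per second:

```python
def voice_packet_stats(codec_rate_bps: int, packet_interval_ms: int):
    """Payload bytes per packet and packets per second for a
    constant-bit-rate codec with a fixed packetization interval."""
    payload_bytes = codec_rate_bps * packet_interval_ms // (1000 * 8)
    packets_per_sec = 1000 // packet_interval_ms
    return payload_bytes, packets_per_sec

print(voice_packet_stats(64_000, 20))  # G.711 @ 20 ms -> (160, 50)
print(voice_packet_stats(8_000, 20))   # G.729 @ 20 ms -> (20, 50)
```

Note that these are payload figures only; RTP, UDP, IP, and link-layer headers add a fixed overhead per packet.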
In contrast, most non–real-time computing data traffic is highly bursty and
consists of large packets. Flows are generally shorter-lived and have a
back-and-forth nature (one direction, then the other) and the amount of data
transferred may be highly asymmetric, where the return traffic consists
mostly of TCP acknowledgements, user commands, and so on. Streaming
flows consist of regularly generated packets, more like telephony flows,
than like typical data flows. However, streaming flows are unidirectional
and use larger packets than are usually found in interactive applications.
Network signaling traffic is different again. Signaling traffic is usually
time-sensitive, and is often associated with session setup or another system
function. Similar to other real-time traffic, signaling is comprised of small
packets. However, flows are short, and the pattern is back-and-forth, rather
than simultaneous traffic on the two paths.
Table 2-1 provides a point-by-point comparison of the characteristics of
real-time and non–real-time traffic through a packet network.
Real-time services
We can use the preceding definitions to categorize common services and
applications. Figure 2-2 classifies many applications as real-time (right,
shaded background) versus non–real-time (left, white background). In
addition, the diagram also differentiates applications by whether a human
or an automated application is at each end of the exchange.
[Figure 2-2 appears here: a classification of applications such as password
delivery, screening, process monitoring, file transfer, network computing,
security monitoring, remote command and control, move/response games,
eCommerce, SMS, and video, arranged by real-time versus non-real-time
behavior and by human versus application endpoints.]
What is QoE?
Quality of Experience (QoE) is the user’s perception of the performance of
a device, a service, or an application. User perception of quality is a
fundamental determinant of acceptability and performance for any service
platform. The design and engineering of telecommunications networks
must address the perceptual, physical, and cognitive abilities of the humans
that will use them; otherwise, the performance of any service or application
that runs on the network is likely to be unacceptable4. Successful design
4. Without proper understanding of user requirements, there is a risk of both under-engineering, where the
network fails to meet the needs of the users, and over-engineering, where the specifications go
beyond the user’s needs, needlessly driving up the cost to provide the device or service.
[Figure 2-3 appears here: QoE at the center of a ring of contributing
factors, including ease of use, reliability, usability, availability,
progress indicators, dexterity, responsiveness, security, clear sound and
picture, and efficiency of use.]
Figure 2-3: Some of the factors influencing the QoE of a service, application,
or device
Efforts are more successful where QoE is an integral part of the design
process. Retrofitting to improve low QoE is likely to be difficult,
expensive, or inadequate. For example, external echo cancellers are more
expensive than integrated echo control. Tweaking the network to reduce
delay may achieve some minor improvement, but many sources of delay
will be hard-coded and therefore inaccessible to tuning. What does this
mean for buyers of real-time converged networks? Vendors whose
performance targets are derived from a comprehensive set of QoE
parameters, and whose design intent begins with these targets are likely to
achieve better overall QoE. Vendor selection criteria should include the
vendor’s attention to QoE, as well as system reliability and cost.
Measuring QoE
Aside from the obvious grossly malfunctioning cases and user complaints,
how can we determine the level of QoE our network or service provides?
Quality of Experience is a subjective quantity and can be measured directly
using behavioral science techniques. QoE can be measured in a laboratory
[A figure appears here: a rating scale running from Unacceptable through
Acceptable to Excellent.]
When user needs are identified and the device is properly engineered to
meet them, the resulting device will have high QoE.
What is QoS?
Quality of Service (QoS) refers to a set of technologies (protocols and
other mechanisms) that enable the network administrator to manage
network behavior by avoiding or relieving congestion, expediting time
sensitive packets, limiting access to congested links, and so on.
The aim of QoS mechanisms is to ensure efficient use of network
resources. The alternative is overprovisioning capacity, which may not
solve the problem of contention for specific resources, and, as we have
discussed earlier, may not improve the performance of real-time
applications. Quality of Service mechanisms do not create bandwidth but
instead manage available bandwidth more efficiently, especially during
peak congestion periods. Congestion occurs when a node or a link reaches
its maximum capacity, that is, when the sum of ingress traffic at a given
node exceeds the egress port capacity. QoS mechanisms may not be
sufficient or effective in a network that is continually congested; to address
this, redimensioning may be necessary.
An important aspect of QoS is assigning packet priorities corresponding to
specific service classes (for example, with specific payload types) or within
specific flows (for example, User X vs. User Y). QoS mechanisms can raise the priority
of a given class of packets or a given flow or limit the priority of competing
flows. Providing differentiated services requires first determining the
desired services and user performance requirements, and second defining
and evaluating the appropriate QoS mechanisms required to balance the
resulting traffic. QoS mechanisms allow us to manage network
performance (for example, bandwidth, delay, jitter, loss rate, and response
time) to maintain stable, predictable network behavior.
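The priority treatment described above can be sketched with a strict-priority scheduler: the highest-priority queued packet is always transmitted first. The class numbering and traffic names below are illustrative assumptions, not taken from any QoS standard:

```python
import heapq

class PriorityScheduler:
    """Strict-priority scheduler sketch: lower class number means
    higher priority (here, 0 = voice, 2 = best-effort data)."""
    def __init__(self):
        self._q = []
        self._seq = 0  # FIFO tie-break within a class

    def enqueue(self, service_class: int, packet: str) -> None:
        heapq.heappush(self._q, (service_class, self._seq, packet))
        self._seq += 1

    def dequeue(self) -> str:
        return heapq.heappop(self._q)[2]

s = PriorityScheduler()
s.enqueue(2, "email-1")
s.enqueue(0, "voice-1")
s.enqueue(2, "email-2")
s.enqueue(0, "voice-2")
order = [s.dequeue() for _ in range(4)]
print(order)  # time-sensitive voice packets are expedited ahead of data
```

Real routers typically combine such expedited queues with weighted fair queuing so that low-priority classes are not starved.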
Conclusion
The main challenge of converged networks is to create a network
environment that allows all the services and applications it carries to
perform well, regardless of whether they are real-time or non–real-time.
The combination of real-time applications with traditional non–real-time
computing data applications on a single network can result in a widely
variable packet traffic characteristic. The network must be able to
comfortably carry many different types of traffic without degrading any of
the applications riding on them. Combined with the delay requirements for
individual real-time applications, the design challenge is formidable
indeed. Table 2-2 summarizes the demands of converging real-time and
non–real-time services and applications onto a common packet
infrastructure.
References
ITU-T Recommendation G.1010, End-user Multimedia QoS Categories,
Geneva: International Telecommunication Union Telecommunication
Standardization Sector (ITU-T), 2001.
ITU-T Recommendation G.114, One-way transmission time, Geneva: ITU-
T, 2003.
ITU-T Recommendation G.107, The E-Model, a computational model for
use in transmission planning, Geneva: ITU-T, 1998.
ITU-T Recommendation P.800, Methods for subjective determination of
transmission quality, Geneva: ITU-T, 1996.
Chapter 3
Voice Quality
Leigh Thorpe
[Chapter-opening diagram: the protocol stack viewed from a real-time
application perspective, spanning session and gateway control (SIP, H.323,
RTSP, H.248/MGCP/NCS), media transport (voice, audio, and video over codec,
RTP, and RTCP), network QoS and resiliency (MPLS), and underlying
transports (AAL1/2, AAL5, ATM, frame relay, Ethernet, DOCSIS cable,
SONET/TDM).]
Concepts covered
Characteristics of conversational voice services (“voice”)
Steps involved in transporting voice over an IP network
The main factors affecting VoIP conversation quality
Effects of delay and jitter
Effects of packet loss, and its mitigation using packet loss
concealment
Effects of echo, and its mitigation through control of signal level and
echo cancellation
Quality metrics including MOS, PESQ, and E-Model R
Introduction
Everyone uses the telephone every day and has well established
expectations about how it should work. What users may not realize is that
the standard voice call consists of two narrow-band (300-3400 Hz) sound
channels, one in each direction, and that these operate independently. This
means that even if the two users talk simultaneously, each will be heard at
the other end. This operating mode is called full-duplex. The situation is
more complex for advanced features like handsfree (speakerphone) or
conferencing, but for now we will limit our discussion to the simple desk-
to-desk handset call.
Traditional voice networks have provided very high quality voice, setting a
high benchmark for IP voice services. In this chapter, we will discuss the
main factors that contribute to voice quality on converged networks. Voice
is a demanding real-time application. How will typical IP network behavior
affect Quality of Experience (QoE) for conversational voice?1 What are
the challenges we face in making voice meet user expectations, and what
can be done to ensure we meet them?
1. The subjective quality of a voice call is based on many parameters, some of which involve how the
output sounds (for example, level, distortion) and some of which involve the conversation dynamics
(for example, echo, round-trip delay). For the purposes of the present discussion, these are combined
under the designation voice quality.
2. An echo canceller is used in this example, but it is not the only option. Other methods of echo control
may be used to control local echo in the phone. This is discussed in “Echo control for VoIP”.
network routers. The signal may pass through several routing nodes before
reaching the packet receive side.
[Figure 3-2 appears here: a block diagram of the voice path with reference
points A through G, spanning the packet side (terminal, packet transport,
core network) and the synchronous, non-packet side (media gateway, TDM
network).]
Figure 3-2: Block diagram of the processes making up the VoIP voice path
In this example, the packet receive side is a media gateway. There, the
packets are delivered and stored in a buffer called the jitter buffer. The jitter
buffer applies a short delay to the data to ensure that a steady stream of data
can be sent to the decoder. Packets are unbundled (D), and the data
reassembled into a synchronous stream. If compression was used, the
signal is decoded. The output of this process is a G.711 bit stream (E),
which is handed over to the TDM network. The echo canceller shown is a
network canceller, and it is essential at the interface between an IP network
and a TDM network where analog access lines may be in use. A loss pad
(F) at the output may be needed to match the loss plan of the packet
network to that of the TDM network. When the G.711 stream reaches the
end of the TDM network (detail not shown), it is converted back to analog,
and the analog signal travels over the local line to the telephone at the far
end (G).
Certain special features are included in the diagram. Speech Activity
Detection (SAD) indicates a silence suppression function, which may be
used to determine whether the data contains speech, which is sent across
the packet network, or only silence, which is not sent. PLC refers to Packet
Loss Concealment. PLC is a process by which the output G.711 bit stream
is repaired to smooth over the gaps left by any missing data.
Note that all Digital Signal Processing (DSP) components (codecs, echo
cancellers, silence suppression, and packet loss concealment) are situated
in the synchronous (non-packet) portion of the path. The speech data are
not read or modified in the packet portion of the network.
Although Figure 3-2 shows a particular connection from an IP phone to a
conventional phone, other connections (for example, IP-to-IP, wireless-to-
wireless over IP) are similar. Only minor changes to the diagram would be
needed to describe most alternate VoIP scenarios.
[A figure appears here: an end-to-end VoIP connection from an IP phone
through the enterprise access network, L2 switch, edge router, packet core,
and media gateway (TDM handoff) to the PSTN. Non-controllable parameters:
transmission delay, network jitter. Controllable parameters: source jitter,
queue size, voice/data loading.]
Speech codec
The speech codec chosen will have a strong influence on the final obtained
quality, both because of the baseline quality of the codec (that is, the
quality of the codec without other impairments) as well as the response of
the codec to other factors, such as presence of background noise, packet
loss, and transcoding with itself or another codec. The choice of codec is an
important determinant of the overall performance of VoIP. See Chapter 5
for additional discussion of the contribution of various telephony speech
codecs to VoIP service quality.
End-to-end delay
The end-to-end delay of a voice signal is the time taken for the sound to
enter the transmitter at one end of the call, be encoded into a digital signal,
travel through the network, and be regenerated by the receiver at the other
end. Delay is sometimes called latency. When delay is too long, it may
cause disruptions in conversation dynamics. As well, increasing delay
makes echo more noticeable.
Variation in delay, caused by differences in the time taken for packets to
cross the network, is called jitter. Jitter is a concern because the decoding
of the digital signal is a synchronous process and must proceed at the same
constant pace that was used during encoding. The data must be fed to the
decoder at a constant rate. Variation in packet arrival times is smoothed out
by the jitter buffer, which adds to the end-to-end (mouth-to-ear) delay.
Jitter is not considered a separate impairment because the effects of jitter in
the packet network are realized in the output either as delay or as distortion
from packet loss.
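The trade-off above can be made concrete with a minimal fixed-delay jitter-buffer sketch: packet i is played at the first packet's arrival time plus the buffer delay plus i times the packetization interval, and any packet arriving after its playout instant counts as lost. The timings below are illustrative assumptions:

```python
def playout(arrivals_ms, interval_ms=20, buffer_ms=40):
    """Fixed jitter-buffer sketch: packet i plays at
    first_arrival + buffer_ms + i * interval_ms; packets arriving
    after their playout instant are discarded (late loss)."""
    base = arrivals_ms[0] + buffer_ms
    played, lost = [], []
    for i, t in enumerate(arrivals_ms):
        deadline = base + i * interval_ms
        (played if t <= deadline else lost).append(i)
    return played, lost

# packet 2 is delayed well past its 80-ms playout slot and is lost;
# a larger buffer_ms would save it, at the cost of added delay
arrivals = [0, 21, 130, 61, 80]
print(playout(arrivals))  # -> ([0, 1, 3, 4], [2])
```

Raising `buffer_ms` converts late loss into added mouth-to-ear delay, which is exactly the trade-off an adaptive jitter buffer tries to optimize.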
Packet loss
In VoIP, packets sometimes get lost. Packets may be dropped during their
journey across the network, or more commonly, they are late in arriving at
the destination and miss their turn to be played out. The missing
information degrades the voice quality, and a Packet Loss Concealment
(PLC) algorithm may be needed to smooth over the gaps in the signal.
Echo control
Because of the longer delay introduced by VoIP, echo control is a major
concern. A given level of echo sounds much worse when the delay is
longer. Echo control at the appropriate places in the connection will protect
the users at both ends. Echo control relies on the correct signal levels (see
Signal Level, below) as well as on echo cancellers and other techniques
that prevent or remove echo from the connection.
Signal level
The level or amplitude of the transmitted speech signal is determined by
amplitude gains and loss across the network. There are a number of
contributors to the final signal level, and most are defined in the loss plan
(sometimes called the loss/level plan) of the network. The loss plan for
TDM ensures that the output speech is heard at the proper level and
contributes to the control of echo. The loss plan for VoIP is reasonably
simple; the sensitivities of the sending device (say, an IP phone) and the
receiving device (say, a media gateway) are defined by standards, and there
is no gain or loss in the packet portion of the network.
[A figure appears here: R plotted against one-way (mouth-to-ear) delay from
0 to 500 ms, with regions marking the progression from no degradation from
delay to significant impairment from delay.]
3. The jitter buffer for VoIP was once described with a single value (delay in ms), and the recommended
size was twice the packet size. This formulation does not adequately specify either the buffer capacity
(which may need to be higher than two packets to prevent packet loss through buffer overflow
following a congestion event) or the wait time (which should be much lower where jitter behavior
allows).
Sources of delay
The end-to-end delay is the total of all delays incurred in the voice path.
The principal sources of delay are summarized in Table 3-1. The four main
categories of delay are shown in the left-hand column.
Processing delay is an inevitable part of VoIP. Voice packet payloads
contain speech associated with a chunk of time, and the system must wait
for that speech to accumulate before it can be put in a packet. The packet
can not be loaded and sent until all the speech for that chunk is collected.
Where speech compression is used, the time needed for coding is added as
well. The speed of any processors (DSP, CPU) involved also contributes to
the final delay.
Serialization delay (the time needed to push a packet onto the wire) is a
small but predictable contribution. It is determined by the channel speed
(bits/sec) and the number of bits in the packet. On high speed links (> T1)
serialization delay becomes negligible compared to other sources of delay.
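Serialization delay is a simple calculation: the number of bits in the packet divided by the channel speed. A sketch (the 200-byte packet size is an illustrative assumption for payload plus headers):

```python
def serialization_delay_ms(packet_bytes: int, link_bps: int) -> float:
    """Time to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 * 1000 / link_bps

# 200-byte voice packet (payload plus headers, illustrative size)
print(round(serialization_delay_ms(200, 1_544_000), 3))    # T1: ~1.036 ms
print(round(serialization_delay_ms(200, 100_000_000), 4))  # 100 Mb/s: 0.016 ms
```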
Queuing delay accumulates at network nodes (routers and switches)
across the network. Congestion can increase packet waiting times in
buffers. Variation in queuing and buffering delays in the network account
for most of the variation in packet transport times (that is, jitter). The jitter
buffer wait time is another instance of queuing delay.
Propagation delay is the time taken for the signal to travel through a cable
or fiber. In the conventional public network, propagation delay is the
largest contributor to end-to-end delay. For international calls, propagation
delay through terrestrial circuits can exceed 100 ms, so it remains an
important contributor to VoIP delay.
Propagation delay across a fixed distance is not a controllable parameter,
since it is determined by the speed of the signal through the transmission
medium (usually light through a fiber). However, it is possible to ensure
that packets take the most direct route through the network to minimize
queuing and propagation delay. Note that where the shortest route is
congested, queuing delays on that route may exceed the additional time
needed to take an alternate route.
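Since end-to-end delay is the sum of these components, a simple budget can be sketched. Propagation in fiber runs at roughly two-thirds the speed of light, about 5 microseconds per kilometer; the other component values below are illustrative assumptions, not measurements:

```python
def one_way_delay_ms(packetization_ms, coding_ms, serialization_ms,
                     queuing_ms, jitter_buffer_ms, distance_km):
    """End-to-end delay budget sketch: total is the sum of the
    component delays; propagation is ~5 us/km in fiber (~2/3 c)."""
    propagation_ms = distance_km * 5.0 / 1000.0
    return (packetization_ms + coding_ms + serialization_ms +
            queuing_ms + jitter_buffer_ms + propagation_ms)

# Illustrative national call: 20-ms packets, 10 ms coding, 1 ms
# serialization, 5 ms queuing, 40 ms jitter buffer, 4000 km of fiber
print(one_way_delay_ms(20, 10, 1, 5, 40, 4000))  # -> 96.0 ms
```

A budget like this makes it easy to see which components are tunable (jitter buffer, queuing, packetization) and which are not (propagation over a fixed route).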
Distortion
The remaining VoIP impairments to the conversation quality are different
types of distortion. These are summarized in Table 3-2. Codecs are
included in the table, but the details are discussed in
Chapter 5. Signal level is included with echo, since signal level through the
network plays an important part in the control of audible echo.
Packet loss
Packet loss can be a significant source of distortion to VoIP. Lost packets
create gaps in the voice data, which can result in clicks, muting, and
artifacts associated with attempts at smoothing and repair. Non–real-time
data transmission is robust to packet loss because packets can be resent.
Delay-sensitive applications such as interactive voice can not wait for the
time it takes to resend.
Generally, there are two ways that packets can be “lost.” The first way is
that some packets never make it to the destination. They may be lost at
network nodes either through a buffer overflow at a congested network
node (insufficient memory to store packets waiting for forwarding), or
because a congested router deliberately discarded them to reduce packet
load. These packets are truly gone, and will never arrive at the destination.
Disabled devices or fiber cuts can also result in lost packets, until the
network responds by establishing an alternate route. Packets lost in these
ways will be spread across all the flows being handled at the time, so losses
on individual channels are likely to be small.
The second type of loss is that packets arrive too late. Queuing and other
network delays can cause variability in packet arrival time at the receiving
end. The jitter buffer smooths out the variability by holding packets for a
fixed wait time relative to the expected arrival time before they are sent to
the decoder. The jitter buffer waiting time determines the longest time that
a packet can take to arrive. Packets delayed longer than this lose their turn
in line, and are as good as lost since the voice playout can not wait for the
late data to show up. The total packet loss will be a combination of losses
from these two sources.
When significant congestion occurs at a transmission node, packets may be
held up long enough that the decoder uses up all the data waiting in the
jitter buffer, resulting in underflow. When the congestion clears, the several
packets that have been backed up are forwarded quickly, one after another,
with the possibility that there may be more packets arriving at the jitter
buffer than the memory can hold. When this happens, packets are lost
because there is no room to store them (jitter buffer overflow).
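Total packet loss is therefore the sum of the two mechanisms: packets dropped inside the network plus packets that arrive too late to be played out. A sketch (the counts and transit times below are illustrative assumptions):

```python
def total_loss_pct(sent, dropped_in_network, transit_times_ms, max_transit_ms):
    """Total loss sketch: network drops plus late arrivals, as a
    percentage of packets sent. A packet whose transit time exceeds
    max_transit_ms misses its playout turn and counts as lost."""
    late = sum(1 for t in transit_times_ms if t > max_transit_ms)
    return 100.0 * (dropped_in_network + late) / sent

# 100 packets sent, 2 dropped en route; the jitter buffer tolerates
# up to 60 ms of transit time, so 3 of the 98 arrivals are too late
transits = [30] * 95 + [70, 75, 80]
print(total_loss_pct(100, 2, transits, 60))  # -> 5.0 (%)
```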
FAX is generally affected during the FAX handshake, where a lost packet
may result in a failed call attempt. Modem calls can have similar setup
difficulties, and may be subject to data rate downspeeding or call drop if
packet loss is encountered during data transfer.
Controlling echo
Talker Listener
A B
Talker Echo A's voice
(Delay > 5 ms)
A's voice, delayed
Talker Listener
A B
Listener Echo A's voice
Talker Echo
A's voice, delayed
Figure 3-5: The type of echo is named after who hears the echo.
Because echo cannot be generated in the digital portion of the path, the
only sources of echo in all-digital networks such as ISDN, cellular, and
packet networks (for example, IP, ATM, frame relay) are audio and
acoustic coupling in the end device. These echoes are best controlled in the
end device itself, and TIA-810-A gives requirements for the maximum
allowable coupling in an end device (TCLw), which applies to any
handsets, headsets, and speakerphones used on wireline digital networks,
including IP.
The degree to which echo impairs a conversation depends on two main
factors: the level (loudness) of the echo and the time it takes the echo to
come back (delay). Other sound (such as the talker's own voice, the far
user's voice, room noise, circuit noise) may mask the echo and change the
threshold of audibility. Figure 3-6 shows the quantitative relationship
between the level of the echo (measured in dB TELR) and delay (expressed
as the mean one-way delay of the echo path), for a talker in a quiet location.
The echo delay refers to the delay between the talker and the reflection
point. Since the reflection point is usually a hybrid in the access circuit at
the other end of the call, the echo path delay is typically the same as the
end-to-end delay.
Echo level is expressed using the Talker Echo Loudness Rating (TELR),
which is a measure of how much attenuation is applied to an echo along the
echo path, weighted for the perceptibility of the frequency components of
the echo. TELR accounts for
all gains and losses in the echo path (including those supplied by an echo
canceller or echo suppressor). The computation of TELR also takes into
account the sensitivity of human hearing to the sound frequencies making
up the echo. TELR measures the loss (attenuation) of an echo rather than
its absolute level, and is thus independent of the level of the talker's voice.
This means that a single TELR requirement applies to all talker levels.
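Because TELR accumulates the gains and losses around the echo path, it can be sketched as a simple sum of the attenuations encountered. The component breakdown below is an illustrative assumption, not the formal ITU definition:

```python
def telr_db(echo_path_losses_db) -> float:
    """TELR sketch: total attenuation of the echo accumulated around
    the echo path. The components passed in are illustrative (for
    example, send loudness rating, hybrid loss, echo-canceller
    attenuation, receive loudness rating)."""
    return sum(echo_path_losses_db)

# hypothetical path: 8 + 14 + 30 + 3 = 55 dB TELR; because it is a
# loss, the figure is independent of how loudly the talker speaks
print(telr_db([8, 14, 30, 3]))  # -> 55
```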
The color code in Figure 3-6 reflects the audibility and annoyance of echo.
There is no audible echo for TELR/delay combinations falling in the green
region. The contour between green and yellow shows the average threshold
of echo audibility, where echo is “just noticeable.” A given level of echo is
more easily detected at longer delays. Therefore, the “just noticeable” echo
is progressively quieter with increasing delay, that is, the TELR gets higher.
The yellow region represents combinations of TELR/delay for which the
echo is noticeable but is not loud enough to be annoying. TELR/delay
combinations where echo is loud enough to be irritating or annoying fall in
the red region. Limits on TELR are defined in terms of subjective
acceptability. Maintaining adequate TELR for the maximum expected
delay will ensure acceptable echo performance.
Figure 3-6: This graph shows the limit of audibility (green/yellow contour)
and the limit of acceptability (yellow/red contour) of echo as a function of
delay. The x-axis gives the one-way delay on the echo path, while the y-axis
gives the level of echo measured in dB TELR. (Higher TELR denotes quieter
echo.) Also shown are the positions of common types of telephone
connections, with an indication of the improvement associated with adding
echo cancellation to the call. These contours are taken from ITU Rec. G.131,
and are based on subjective ratings of echo in telephone conversations.
[A figure appears here: sources of echo in the access circuit, showing the
hybrid at the four-wire analog to four-wire digital conversion and
inductive coupling in the handset cord.]
Types of MOS
Mean Opinion Score began life as a subjective measure. Currently, it is
more often used to refer to one or another objective approximation of
subjective MOS. Although all “MOS” metrics are intended to quantify
QoE performance and they all look very similar (values between one and
five with one or two decimal places), the various metrics are not directly
comparable to one another. This can result in a fair amount of confusion,
since the particular metric used is almost never reported when “MOS”
values are cited. Appendix C provides more details on the distinction
between different types of MOS, and how to distinguish them. There are
fundamental differences between individual metrics, and numerical values
are not necessarily directly comparable just because they are both called
MOS.
Subjective MOS
Subjective MOS is a direct measure of user perception of voice quality (or
some other quality of interest), and is thus a direct measure of QoE.
Subjective MOS is the mean (average) of ratings assigned by subjects to a
specific test case using methods described in ITU-T P.800 and P.830.
Subjective MOS can be obtained from listening tests (where people rate the
quality of recorded samples) or conversation tests (where people rate the
quality of live conversations held over the system under test).
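Computationally, subjective MOS is just the arithmetic mean of the panel's ratings; reporting it with a confidence interval makes the reliability explicit. A sketch using an illustrative panel of ten P.800-style ACR ratings (1 to 5), rounded to one decimal place as is conventional for small panels:

```python
from math import sqrt
from statistics import mean, stdev

def subjective_mos(ratings):
    """Mean opinion score with an approximate 95% confidence
    half-width (normal approximation); MOS rounded to one decimal
    place, appropriate for a small number of independent ratings."""
    m = mean(ratings)
    ci = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return round(m, 1), round(ci, 2)

ratings = [4, 4, 3, 5, 4, 3, 4, 5, 4, 4]  # illustrative panel data
print(subjective_mos(ratings))  # -> (4.0, 0.41)
```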
PESQ (P.862)
Subjective studies take significant time and effort to carry out. MOS
estimators such as PESQ6 (Perceptual Evaluation of Speech Quality) can
provide a quick, repeatable estimate of distortion in the signal. However,
the score does not reflect the conversational voice quality, since listening
level, delay, and echo are excluded from the computation. Separate
measures of these characteristics must be considered along with a PESQ
score to appreciate the overall performance of a channel.
PESQ is an intrusive test, which means that the tester must commandeer a
channel and put a test signal through it. To perform a test, one or more
speech samples are put through a device or channel, and the output (test
signal) is compared to the input (reference signal). The more similar the
two waveforms, the less distortion there is, and the better the assigned
score. The algorithm does some preprocessing to equalize the levels, time
4. MOS are sometimes quoted to many decimal places. The appropriate number of decimal places
depends on the reliability, which in turn is determined by the number of independent ratings that
contribute to the mean. Usually, one decimal place is appropriate. Two places may be justified if a
large number of ratings is averaged (more than about fifty).
5. “Context” refers to things like the order in which the test cases are presented in the experiment, the
range of quality between the worst and best test cases used in the experiment, and whether the
subjects are asked to do a task before making a rating. If an experiment is repeated exactly (with
different subjects), similar scores will be obtained within a known margin of error. This is not the
case from one experiment to another. Consistency from test to test is found in the pattern of scores,
not in the absolute value of the scores. For example, the MOS-LQS for G.711 may be 4.1 in one
study, 3.9 in another, and 4.3 in a third, but whatever the value obtained, we expect to obtain a
higher score for G.711 than G.729, and approximately equal scores for G.729 and G.726 (32 kb/s).
6. Many objective quality algorithms have been defined. Aside from PESQ, the best-known are PSQM
(Perceptual Speech Quality Measure), standardized as P.861, and PAMS (Perceptual Analysis
Measurement System), a proprietary method developed by BT. As the current standard, PESQ is
preferred to the older measures.
align the signals, and remove any time slips (where some time has been
inserted or deleted). PESQ then applies perceptual and cognitive models
that represent an average listener's auditory and judgment processes. A
diagram of the process is shown in Figure 3-7.
The raw PESQ score is usually converted to a MOS estimate using one of
several available conversion rules, for example, PESQ-LQ7.
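The ITU-T conversion (P.862.1) has a logistic form. The sketch below uses the coefficients as commonly published for that mapping; verify against the Recommendation before relying on exact values:

```python
from math import exp

def pesq_to_mos_lqo(raw_pesq: float) -> float:
    """Raw PESQ (P.862) score mapped to MOS-LQO via the logistic
    function of P.862.1 (coefficients as commonly published)."""
    return 0.999 + 4.0 / (1.0 + exp(-1.4945 * raw_pesq + 4.6607))

for raw in (1.0, 2.5, 4.5):
    print(raw, round(pesq_to_mos_lqo(raw), 2))
```

The mapping is monotonic and compresses the raw score into the familiar 1-to-5 MOS range, saturating at the extremes.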
[Figure 3-7 appears here: block diagram of the PESQ algorithm. The
original (reference) signal and the output of the system under test are
each processed by a perceptual model; their difference is then evaluated
by a cognitive model to produce the score.]
7. A conversion defined by Psytechnics, a company holding intellectual property rights for PESQ.
There is now an ITU-T standard conversion defined in P.862.1.
8. The E-Model can also be used to compute a “MOS” estimate. Note that MOS computed with the E-
Model is not comparable to MOS computed with PESQ.
[Figure 3-8 appears here: R plotted against one-way delay (0-500 ms) for a
14,057-km call. Marked points: PSTN reference, R = 78.7 at 125 ms delay;
G.711 > DCME > G.711 with 20-ms packets, R = 75.8 at 191 ms; G.711 > DCME >
G.711 with 40-ms packets, R = 71.0 at 231 ms.]
Figure 3-8: R vs. delay for a particular class of terrestrial international calls.
G.711 is used in the national links with a Digital Circuit Multiplexing
Equipment (DCME), which generally uses G.726 speech coding at 32 kb/s in
the undersea cable. Specific points on the curve show R for the benchmark
(PSTN reference, TDM end-to-end), and for each of two calls using IP in the
national portions of the call (20-ms and 40-ms packets, respectively). Bars
under the curve indicate the sources for the cumulative delay associated
with each call. Since only one coding scenario is considered (G.711> G.726
> G.711), the model generates only one contour. The model assumes best
practices for any factors not specified.
[Figure 3-9 appears here: R plotted against one-way delay (0-500 ms) for
G.711, G.726 (Ie = 7), and G.729 (Ie = 10). Points mark listening quality
at 0 ms and the minimum delay for a 20-ms payload.]
Figure 3-9: R vs. delay for G.711, G.726 (32 kb/s), and G.729 (8 kb/s). The
model assumes best practices for any factors not specified. Note that
although R is plotted for all delays, there will be a non–zero minimum delay
(yellow points) for interactive calls. For these points, propagation delay is
zero. This is the lowest delay for the modeled call scenario (the minimum
delay will depend on the codec as well as the packetization selected). In this
chart, we have assumed similar equipment delays beyond those associated
with the codec; however, in actual network situations, these can change as
well. The blue points represent the quality differences heard when listening
to recorded speech samples.
[Chart: R (50-100) vs. one-way delay (0-500 ms) for contours ranging from no audible echo down through TELR = 65, 60, 55, 50, and 45 dB.]
One-Way Delay (ms)
Figure 3-10: R vs. delay for various levels of echo. Note how R drops off
more quickly with smaller values of TELR. The increasing rate of
degradation for louder echo reflects the interaction of delay and echo
discussed above. The model assumes best practices for any factors not
specified.
Combining Factors
Loss Plan, Speech Compression, & Packet Loss
[Chart: R vs. one-way delay (0-500 ms) for successive contours: ISDN (all-digital) with digital loss plan; POTS > POTS with analog loss plan; POTS > G.726 > POTS (analog loss plan + waveform compression); POTS > G.729 > POTS (analog loss plan + speech compression); and POTS > G.729 > POTS with 3% packet loss.]
One-Way Delay (ms)
Figure 3-11: R vs. delay for multiple distortion factors, showing the effect of
successive addition of non–ideal factors: loss plan, compression coding,
and packet loss. Since delay does not exacerbate any of these factors, each
contour has the same relative shape as the one above. The model assumes
best practices for any factors not specified.
Summary
Some commonly used quality metrics were discussed: MOS, PESQ, and the
E-Model R. Subjective MOS is a direct measure of user perception of
quality, and may be gathered in lab or field studies. PESQ is a method of
estimating subjective MOS with an objective algorithm, and has been
standardized as P.862. The E-Model is a standard network planning tool
(ITU-T G.107) that generates another overall quality metric, R. R combines
fifteen objective measures, including listening level, delay, and encoding
distortion. While both PESQ and R can be translated to a MOS value, such
MOS should be considered indicative only; MOS values derived from these
different sources should not be compared with each other.
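As a concrete sketch of how R is combined and translated, the E-Model sums its impairment factors (R = Ro - Is - Id - Ie + A), and G.107 defines a standard mapping from R to an estimated MOS. The code below is a simplification: the real G.107 computation derives Ro, Is, and Id from many underlying parameters, and the Ie values (such as 7 for G.726 and 10 for G.729, per Figure 3-9) come from subjective testing.

```python
def e_model_r(ro=93.2, is_=0.0, id_=0.0, ie=0.0, a=0.0):
    """Combine E-Model factors: R = Ro - Is - Id - Ie + A.
    Defaults assume best practice (Ro ~ 93.2, no other impairments);
    the full G.107 model derives Ro, Is, and Id from many inputs."""
    return ro - is_ - id_ - ie + a

def r_to_mos(r: float) -> float:
    """G.107 mapping from R to an estimated MOS. As noted in the
    text, this MOS is not comparable to a PESQ-derived MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

For example, adding only the G.729 equipment impairment gives R = 93.2 - 10 = 83.2, an estimated MOS of about 4.1.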
References
ITU-T Recommendation G.131, Talker Echo and its control, Geneva:
International Telecommunication Union Telecommunication
Standardization Sector (ITU-T), 1996.
ITU-T Recommendation G.711, Pulse code modulation (PCM) of voice
frequencies, Geneva: ITU-T, 1988.
ITU-T Recommendation G.726, 40, 32, 24, 16 kbit/s Adaptive Differential
Pulse Code Modulation (ADPCM), (includes Annex A: Extensions of
Recommendation G.726 for Use with Uniform-Quantized Input and
Output-General Aspects of Digital Transmission Systems), Geneva: ITU-T,
1990.
ITU-T Recommendation G.729, Coding of Speech at 8 kbit/s Using
Conjugate-Structure Algebraic-Code-Excited Linear-Prediction (CS-
ACELP), Rec. G.729, (includes Annex A, Reduced Complexity 8 kbit/s
CS-ACELP Speech Codec, and Annex B, Silence Compression Scheme for
G.729 Optimized for Terminals Conforming to Recommendation V.70),
Geneva: ITU-T, 1996.
ITU-T Recommendation P.800, Methods for subjective determination of
transmission quality, Geneva: ITU-T, 1996.
ITU-T Recommendation P.800.1, Mean Opinion Score (MOS) terminology,
Geneva: ITU-T, 2003.
ITU-T Recommendation P.830, Subjective performance assessment of
telephone-band and wideband digital codecs, Geneva: ITU-T, 1996.
ITU-T Recommendation P.861, Objective quality measurement of
telephone-band (300-3400 Hz) speech codecs, Geneva: ITU-T, 1998
(withdrawn).
ITU-T Recommendation P.862, Perceptual evaluation of speech quality
(PESQ): An objective method for end-to-end speech quality assessment of
narrowband telephone networks and speech codecs, Geneva: ITU-T, 2001.
9. TIA-810-A, TSB-116, and other VoIP standards are available free from the TIA at the following site:
http://www.tiaonline.org/standards/sfg/committee.cfm?comm=tr%2D41&name=User%20Premises
%20Telecommunications%20Requirements. Click the first link (TR-41 VoIP Standards) and answer
the questions to reach the page where these standards can be downloaded.
Chapter 4
Video Quality
Peter Chapman
[Transport path diagram: the codec viewed from a real-time application perspective, with audio, video, and voice carried over RTP (plus RTCP and RTSP), session and gateway control via SIP, H.323, H.248/MGCP, and NCS, and transport over QoS, MPLS, packet, cable (DOCSIS), ATM, FR, Ethernet, and SONET/TDM layers.]
Concepts covered
An overview of analog video systems
Impairments to analog systems
MPEG coding principles
A brief description of each of the common effects of coding
principles
Video
Video transmits images by sending the instantaneous brightness, color
hue, and color intensity of each picture element.
Video Impairments
Video impairments come from many sources, such as capturing, digitizing,
compressing, and distributing videos. Nortel is concerned primarily with
problems resulting from distribution or transmission. The analog standards
are designed to ensure that a picture is created even when there is
corruption of a broadcast signal. Because the eye and brain correlate
information from line to line (spatial redundancy), a picture can be
perceived even under extreme signal degradation, provided the structure of
the picture is maintained.
Impairments introduced by transmission include:
delay in transmission
artifacts introduced into the image
loss of some information
Luminance noise
Random noise manifests itself as random speckles on the picture. To
minimize the effect of noise, some versions of broadcast TV use negative
picture modulation, which means that the peak output of the transmitted RF
signal corresponds to a black signal and minimum amplitude corresponds
to a white signal. Negative picture modulation is effective against
impulsive interference, such as that generated by the motor vehicle
ignition systems it was designed to counter: such interference produces
random black spots in the picture, which are far less intrusive than white
spots. (Modern vehicle electrical systems have largely eliminated this
source of noise.)
Chrominance noise
The effect of noise on the chrominance channel is to change the hue of the
chrominance signal with no effect on the luminance. Due to the
uncorrelated nature of noise, this effect reduces the saturation and intensity
of the reproduced color, producing a washed-out effect. High-frequency
chrominance noise manifests itself as dots or specks of varying color, but
these are not well defined, due to the lower chrominance bandwidth. They
do not have well defined edges because there is no change in luminance at
the edges.
Loss of synchronization
Television receivers adjust the rate at which they create lines to be
exactly the same as the received line rate. They base this line rate on a
composite rate from a number of received
lines so that a single corrupted or missing synchronization pulse does not
cause loss of synchronization. Sustained loss of synchronization pulses
causes the receiver to revert to its default line generation rate, which is the
nominal rate specified in the standards.
Co-channel interference
Co-channel interference is possibly the most annoying of all the broadcast
video impairments. It manifests itself in one of two ways. If the
interfering signal comes from the same source as the primary signal but
travels a different propagation path, it arrives slightly delayed and
appears on the screen as a second image displaced horizontally from the
first. The delayed field/frame synchronization pulse appears as a darker
column on the left side of the screen, where the blacker-than-black
synchronization pulse is added to the primary signal. If the source of the
reflected signal is moving, as when the reflection comes from an aircraft,
the second image changes position on the screen as the path length varies.
If the co-channel interference is from a separate transmitter, it appears
as a second, different image on the screen. Because the two sources are
not perfectly synchronized, the two images move spatially in relation to
each other.
Electrical RF interference
RF interference that emanates from a source other than a video signal
manifests itself as diagonal lines or patterns superimposed on the picture. This type of
interference can be very annoying.
Digital video
Compression
Compression invariably uses one of the MPEG standards. MPEG does not
define the compression codec; instead, it defines the format of the
compressed information and suggests how it is reconverted to video. This
approach allows compression tools and techniques to be developed from
experience, and there is significant ongoing development in this area.
Encoding issues
MPEG provides a bit stream definition. To create this bit stream, encoders
use a toolbox of different techniques, which are proprietary and dependent
on the encoder used. Details of these techniques are beyond the scope of
this document. However, the video quality is dependent on the encoding
scheme and the tools used. There are wide variations in the perceived
quality of encoded video.
Decoding issues
B and P frames
Rather than send complete information for every frame, the MPEG standard
provides for two other frame types that carry only difference information
relative to the I frames. These frames, known as B and P frames, carry
considerably less information than I frames do. P frames are "predictive"
and carry only the difference between the previous frame and the current
frame. B frames are "bidirectional" and carry difference information based
on the immediate past and future frames. The MPEG decoder combines this
difference information with the most recent reference (an I frame, or a
frame reconstituted from an I frame and previous P or B frames) to predict
the next frame.
Packet loss in an I frame produces an error that is likely to propagate
until the next I frame. Typically, the error persists for ten or twelve
frames, though this is entirely at the discretion of the coding system.
Ten or twelve frames last about one third of a second, which is
noticeable.
Missing packets in received video streams manifest themselves as
displaced static blocks. A block of 8 x 8 pixels appears at the wrong
location in the picture frame and is static. This distortion is corrected when
the next complete sequence is received, normally an I frame.
These P and B frames use motion compensation to further reduce the
information being sent.
Motion detection works on a macro block, a 2 x 2 arrangement of four
blocks (16 x 16 pixels). The encoder determines the motion by
looking for a match within adjacent blocks on the subsequent frame. It then
sends a vector indicating direction and position of the matching macro
block. The first vector (left upper-most element) is sent as a complete
vector and this vector is followed by subsequent horizontal elements being
sent as differences from this first vector. This vector and group of
differences is known as a slice. The range over which the match is
attempted is determined by the encoder. Searching a wide area causes
significant encoding delay, so there is inevitably a trade-off between
encoding delay and compression efficiency. For non–real-time coding, for
example, when preparing a movie for DVD, the delay is not a problem, but
for live events it is.
Because of the resolution of the vectors and possible nonlinear motion of
the object, the resultant image is not necessarily accurate. Therefore, the
encoder calculates the new image from the calculated vector, and it then
compares this calculated image with the actual image. The encoder sends
the difference information, together with the vector, as part of the P
frame. Sending an approximate vector with the difference information is
more efficient than sending an accurate vector.
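The slice scheme just described (one complete vector, then differences from that first vector) can be sketched with the pair of helpers below; the function names are illustrative and not from any MPEG reference implementation.

```python
def encode_slice(vectors):
    """Encode a slice of motion vectors: the first vector is sent
    complete, the rest as differences from that first vector."""
    (x0, y0) = vectors[0]
    return [(x0, y0)] + [(x - x0, y - y0) for (x, y) in vectors[1:]]

def decode_slice(encoded):
    """Recover the original motion vectors from an encoded slice."""
    (x0, y0) = encoded[0]
    return [(x0, y0)] + [(dx + x0, dy + y0) for (dx, dy) in encoded[1:]]
```

Because neighboring macro blocks tend to move together, the differences are usually small and cheap to code.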
I frames are coded independently of each other, so they do not propagate
errors. Because B frames, when coded from future frames, need those frames
before the B frame itself can be decoded, the sequence of frames, usually
referred to as a Group of Pictures (GOP), is sent in other than purely
chronological order so that both the before and after images are available
to the decoder before the B frame is calculated. This requires a buffer
several frames long in the decoder. B frames can be calculated from the
previous frame, from the next frame, or by interpolation from both. When
both frames are used, MPEG supports simple linear interpolation but does
not support weighted interpolation to handle multiple B frames
interpolating between P and I frames.
Sequences of frames
I frames are much more critical than P frames or B frames; therefore,
loss of an I frame is much more serious than loss of a P or B frame.
A typical MPEG sequence is as follows:
IBBPBBPBBPBBI
The corresponding transmission (reordered) sequence is as follows:
IPBBPBBPBBIBB
Due to different error performance among DVD, broadcast, Digital Video
Broadcast (DVB), and Internet streaming, different sequences are suitable
for different applications. Material needs to be encoded for the medium
for which it is intended. This is a new area, and finding the best method
requires experience.
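The reordering can be sketched as a single pass over the display-order GOP: each B frame is held back until the I or P reference that follows it has been sent. This toy function assumes a closed GOP and ignores real MPEG stream syntax.

```python
def transmission_order(display_order: str) -> str:
    """Reorder a display-order GOP string (e.g. 'IBBP...') so every
    B frame is sent after the reference frame it depends on."""
    out, pending_b = [], []
    for frame in display_order:
        if frame in "IP":          # reference frame: send it first,
            out.append(frame)      # then the B frames that needed it
            out.extend(pending_b)
            pending_b = []
        else:                      # B frame: hold until next reference
            pending_b.append(frame)
    out.extend(pending_b)
    return "".join(out)
```

For the 13-frame display sequence IBBPBBPBBPBBI, this yields the transmission order IPBBPBBPBBIBB.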
Compression issues
The distortions resulting from compression are as follows:
blocking
blurring
shimmer
smearing
edge distortion
jerkiness
luminance noise
chrominance noise
contouring
Blocking
Blocking (Figure 4-2) is the effect of blocks of pixels appearing in the
wrong location on the image. Sometimes, blocks appear in the correct
position but in the wrong color because the color information has been
corrupted.
Smearing
Smearing typically occurs if the luminance does not significantly change at
an edge, but the color does. Because the chrominance resolution is less than
the luminance resolution, edges tend to blur. This effect is often seen on
colored titles superimposed on a colored background.
Edge distortion
Edge distortion (Figure 4-5) has a number of causes. A common cause is
the application of compression to an interlaced image: edge distortion, or
a comb effect, results when the two fields, each sampled at a different
time, are presented as a single frame. Horizontally moving objects appear
in different horizontal positions in the two fields. Another cause of edge
distortion is problems with the MPEG motion vectors.
Jerkiness
Jerkiness is non-smooth motion, and it has a number of causes. A change
of frame rate from 24 or 25 frames per second to 30, or vice versa, causes
non-smooth motion that is noticeable on the background during horizontal
panning. Other causes are problems with the motion-estimation vectors,
particularly with accelerating objects.
Luminance noise
Luminance noise appears as specks on the screen. It is caused by a noisy
input source. If possible, remove noise using analog techniques prior to
digitizing.
Chrominance noise
Chrominance noise appears as color spots with no obvious brightness
change. If possible, remove the source of the noise.
Contouring
Contouring (Figure 4-6) is an effect appearing as lines indicating
quantization level changes. It is seen on areas where there is a smooth
intensity gradient. It is a known problem in digitizing video and is normally
dealt with by adding a random signal equal to one half of a quantization
level, known as dither. Dither moves the quantization level slightly,
randomizing the level changes so that they do not appear as a line.
Audio/video synchronization
Audio/video synchronization needs to be ±50 ms or better. Much compressed
broadcast material does not achieve this synchronization, and the lack of
lip synchronization is noticeable. The most common cause is poor attention
to the delay introduced in the compression process for the audio and video
channels.
Chapter 5
Codecs for Voice and Other Real-Time
Applications
Leigh Thorpe
Peter Chapman
[Figure 5-1 shows the codec viewed from a real-time application perspective: audio, video, and voice carried over RTP (plus RTCP and RTSP), session and gateway control via SIP, H.323, H.248/MGCP, and NCS, and transport over QoS, MPLS, packet, cable (DOCSIS), ATM, FR, Ethernet, and SONET/TDM layers.]
Figure 5-1: Transport path diagram
Concepts Covered
Digitization of analog signals
General characteristics of codecs such as sampling rate, bit rate,
and compression ratio
Coding impairments such as baseline quality, encoding delay,
performance with missing packets, and transcoding
Speech codecs for telephony including G.711, G.726, G.729/
G.729A, G.723.1, GSM-EFR, and GSM-AMR
Introduction
Much of the content of human communications consists of analog signals.
To ride across digital networks, analog signals must be converted into
digital form. Codecs play a key role in formatting, compressing, and
translating digital data from analog signals. The characteristics of the codec
contribute significantly to the efficiency and the final quality of the
transmitted signal. In this chapter, we discuss codecs used in real-time
telecommunications applications and services, primarily speech codecs,
and more briefly, audio and video codecs. The discussion will introduce the
parameters that underlie the performance of a codec and will explore
effective use of compression codecs in real-time communications systems.
The field of digitization, encoding, and compression is a broad one, and
this chapter addresses only a very small portion of that field. We will not
cover the mechanics of compression or encoding for the purposes of
encryption. Readers unfamiliar with the digitization process should begin
with the following Sidebar, which describes basic analog-to-digital
conversion.
[Figure 5-2: An original analog signal and its digitized version, showing the quantization steps. Where the sampling rate is too low, the digitized signal cannot respond to high-frequency content.]
There are eleven quantization steps in Figure 5-2, and the step spacing is
linear, meaning that the amplitude represented by Step Five is five times
the amplitude represented by Step One. The dynamic range of sound
amplitude is very broad, and a large number of linear steps are needed
for high quality reproduction. Sixteen-bit linear PCM (used on CD and
other high quality audio) has 2^16 (65,536) quantization steps covering 96
dB of dynamic range. The dynamic range of a coding system refers to
the difference between highest (loudest) and lowest (quietest) signals
that can be represented.
If the appropriate filtering is used in the digitization and reconstruction
processes, the deviation between the actual amplitude and the integer
value assigned shows up as noise in the signal, called quantization noise
or sometimes quantization distortion. The number of quantization steps
and their spacing determine how much quantization noise is added to the
output. The more steps, the closer the quantized value will be to the
actual amplitude, and the lower the quantization noise. The more
amplitude steps used, the more bits needed, and the higher the bit-rate
needed to represent the signal in digital form.
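The relationship between step count, bit depth, and quantization noise can be illustrated with a toy uniform quantizer. This is a sketch of the arithmetic only, not a codec implementation; the 6 dB-per-bit figure is the standard rule of thumb for linear PCM.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Dynamic range of linear PCM with 2**bits quantization steps:
    20*log10(2**bits), or about 6.02 dB per bit."""
    return 20.0 * math.log10(2 ** bits)

def quantize(sample: float, bits: int) -> float:
    """Round a sample in [-1, 1) to the nearest of 2**bits uniform steps."""
    half_steps = 2 ** (bits - 1)
    return round(sample * half_steps) / half_steps
```

dynamic_range_db(16) is about 96.3 dB, matching the 96 dB quoted above for 16-bit linear PCM, and the quantization error at 16 bits is far smaller than at 4 bits.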
Types of Codecs
Codecs are designed to address particular signal domains (for example,
audio and video signals). Within each domain, codecs can be designed to
handle a broad range of signal types, or they can be tailored to optimize
performance on a particular type of signal. In the audio domain, there are
general audio codecs, others specifically for speech signals, and others
specifically for music. Codecs can also be classified by the format of the
output data. Basic audio codecs that directly represent the amplitude of the
analog signal (such as PCM codecs) are called waveform codecs. G.711
and G.726, two telephony codecs, are both waveform codecs. The bit
stream (the digital representation) of a waveform codec contains explicit
information about the amplitude and frequency of the sound signal. These
are sometimes called sample-based codecs. Other codecs work on a
principle that groups information together in bundles called frames. Frame
size for speech and audio codecs is usually given in terms of the duration of
signal contained in the frame, say, 10 or 20 ms. Before the codec can
encode the frame, it has to wait for the entire frame of signal to collect in a
buffer.
Video codec frames are based on the natural succession of images making
up the original signal, each frame corresponding to one image. A video
frame thus refers to either the data corresponding to the frame, or the image
itself. The frame rate used by a video codec is a key determinant of the
smoothness of motion depicted in the playback.
Some frame-based codecs also look at the signal following the current
frame, which means they wait a little more before they begin processing the
frame. This little extra bit of signal is called (not surprisingly) the look-
ahead. The look-ahead for speech codecs is around 5–10 ms. Look-ahead
in video codecs is usually the entire following frame; this will be 40 ms for
full-motion frame rate.
Codecs can also be classified by the type of algorithm they use. Common
types in this classification include PCM, ADPCM, (see Sidebar for a
discussion of both of these approaches), sub-band coding, in which the
signal is separated into multiple narrow frequency bands, each of which is
encoded separately, and Code Excited Linear Prediction (CELP). CELP
codecs are very common in voice telephony, and are described in greater
detail in the following section.
The sampling rate describes how often a digital value is defined from the
instantaneous value of the analog signal. The data rate or bit rate of a
codec refers to the number of bits per second that are needed to transfer the
signal from the encoder to the decoder. These terms are described in more
detail in the Sidebar above. Codecs used to transcode the signal from one
digital format to another generally inherit the sampling rate of the
original analog-to-digital (A/D) conversion, although downsampling
(changing to a lower sampling rate) is possible.
Some codecs operate at one data rate; others have different rates available.
Constant bit-rate (CBR) codecs carry on at the same rate no matter what
the signal (or even when there is no signal). Variable bit-rate (VBR) codecs
can adjust their rate as coding proceeds. A VBR codec can select a bit-rate
based on the content of the signal being encoded (voiced vs. unvoiced
speech, speech vs. silence, full video scene change vs. a static image), or
based on other factors such as the channel quality or capacity.
All codecs take a finite time to encode and decode a signal. This time is the
encoding delay. Waveform codecs are very fast (microseconds), while
frame-based codecs have a significant built-in delay that cannot be
reduced. This minimum delay is the algorithmic delay, which consists of
the frame length plus the look-ahead, if the codec uses one. The encoding
delay consists of the algorithmic delay plus additional time needed by a
finite speed processor to complete the processing. For practical reasons,
this delay is estimated as twice the frame size plus the look-ahead.
Compression
Linear encoding generates a lot of data, and transferring or storing it
uses a lot of capacity. To reduce the volume of data, and hence the capacity
needed to handle it, compression techniques are employed. Compression
can be lossless or lossy; lossless compression is called for where it is
necessary to restore the exact signal content. For voice and video
telecommunications, lossless compression does not sufficiently reduce the
data rate, so lossy methods are used. Some simple compression techniques,
such as logarithmic coding, restriction of dynamic range, and differential
coding, are described in the sidebar.
More complex techniques can compress the signal more efficiently. Many
coding algorithms include a modeling process that makes some
information or assumptions about the characteristics of the signal, the
channel, or human perception. Speech codecs model the acoustics of the
human speech production apparatus. Assuming a signal is speech limits the
number of different kinds of sounds that the codec needs to reproduce.
Speech codecs are designed to process speech signals, and they do this
well; on the other hand, they usually perform poorly with nonspeech
signals such as noise and music. Audio codecs are aimed at a broader range
of signals, and are less likely to put different types of sounds at a
disadvantage.
One of the most common compression methods is Code Excited Linear
Prediction (CELP). CELP codecs are frame-based. A CELP coding
algorithm deconstructs the signal into two parts: a spectral model
component and a residual component. The spectral model component is a
digital filter. The residual is the leftover part of the signal not accounted for
by the filter model. When the residual component of the signal is passed
through the filter, the segment of speech contained in the frame is
reproduced. The filter parameters are quantized, and the quantization
indices (step numbers) for each parameter are obtained. A table called the
codebook contains numbered entries corresponding to choices for the
residual. The encoder determines the codebook entry that best matches the
residual. These quantization and codebook indices make up the encoded
data that are sent to the decoder. Since the encoder and decoder use the
same codebook, the decoder can easily look up the codebook entry
corresponding to the residual. The quantization indices are used to
reconstruct the spectral model filter. The decoder then filters the residual
codebook entry to reproduce the signal segment for that frame.
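The codebook search at the heart of this scheme can be sketched as a nearest-match lookup. This toy version omits the spectral filter, gain terms, and perceptual weighting that a real CELP encoder applies.

```python
def best_codebook_index(residual, codebook):
    """Return the index of the codebook entry with the smallest
    squared error against the residual segment."""
    def sq_err(entry):
        return sum((r - e) ** 2 for r, e in zip(residual, entry))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))
```

Only the winning index is transmitted: the decoder holds the same codebook and simply looks the entry up.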
Coding distortion occurs because the codebook entry is not an exact match
for the actual residual. The subjective quality of a CELP codec depends on
both the size of the codebook, which determines the number of signal
segments that could be used to represent the residual, and how well the
coding distortion exploits the "blind spots" in human perception of
distortion (some types of distortion are less apparent or less irritating). The
size of the codebook affects the bit rate needed to operate the codec.
The efficiency of compression is quantified in the compression ratio, which
is the ratio of the data rate of the compression codec compared to either the
uncompressed signal or some standard digital process. The standard digital
process for telephony voice is G.711 at 64 kb/s, which is the rate used for
an individual channel (DS0) in a TDM network. Audio and video
compression is usually compared to the rate of the linearly encoded signal.
For audio, this is 16-bit linear PCM (used in CDs). For video there is no de
facto standard; the data rate of a linear signal will depend on the frame rate,
display size, and other characteristics of the original analog signal.
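Computed as described, with G.711's 64 kb/s as the telephony reference:

```python
def compression_ratio(codec_rate_kbps: float, reference_kbps: float = 64.0) -> float:
    """Compression ratio relative to a reference rate; for telephony
    voice the reference is G.711 at 64 kb/s (one DS0)."""
    return reference_kbps / codec_rate_kbps
```

G.726 at 32 kb/s gives a 2:1 ratio, and G.729 at 8 kb/s gives 8:1.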
Coding impairments
With the exception of delay, impairments associated with the digitization
and compression of audio and video signals are similar for real-time and
non–real-time operation. The performance of individual codecs is
determined by the amount of distortion they add to the target signal, as well
as how they behave with any unwanted signal components (such as noise),
how much the signal degrades when it is passed through the codec multiple
times, and how disruptive data loss is to the output signal.
Encoding distortion
As discussed above, digitization and compression always reduce the
information content of the signal. This can manifest itself as distortion
(change in the shape of the waveform), as increase in the noise floor
("hiss"), or as the addition of so-called coding artifacts. For PCM
codecs, the degradation depends on the sampling rate, quantization step
size, and whether any
compression techniques (such as logarithmic companding or differential
encoding) are applied. Frame-based codecs that aim for higher
compression will add more distortion. It is often assumed that the lower a
codec's bit rate, the more encoding distortion it will add. While bit rate has
some direct relationship with distortion (see description of the CELP
coding in the previous section), advances in coding technology have made
successive generations of low bit-rate codecs much better than previous
generations. Comparisons across different technologies (differential PCM
vs. CELP, say) do not follow such a simple relationship, either. (The reader
can inspect data rate vs. coding impairment for various codecs in
Table 5-2).
Coding distortion in waveform codecs does not depend on signal type.
CELP codecs, however, because they are generally tuned to a particular
type of input signal, distort different signals in different ways. CELP-based
speech codecs perform poorly with nonspeech signals, including
background noise, DTMF1 tones, and music. This means that CELP codecs
often require a DTMF work-around to detect, transfer, and reconstruct the
tone without putting the signal through the CELP encoder. Music on hold
will be significantly degraded by CELP compression. CELP codecs also
vary in their performance with different voices. Because of the way they
work, CELP codecs work best with lower-frequency voices, so they
generally reproduce men’s voices better than women’s and children’s
voices.
Given this dependence on input signal, evaluating encoding distortion of
CELP and other compression codecs can be a complex process. The most
reliable method is formal subjective testing. This is usually done with
listening tests, so that the codec’s performance on many different input
signals can be examined. Listeners in these tests rate the quality on a scale
of one (bad) to five (excellent), and the resulting average is known as Mean
1. Dual-tone multifrequency (DTMF) tones were invented to pass some signaling such as number dialled
over analog equipment. They are also used by network and proprietary features such as access to
voice mail, credit card number entry, and so on. Network transparency to DTMF is required for these
features to work.
Opinion Score, or MOS.
At least three different approaches to estimating encoding distortion are
used: (1) using a MOS from one test case (or a weighted average from
several) from formal subjective evaluation,2 (2) estimating MOS using an
objective quality estimation technique such as P.862 (PESQ*) (currently
available for speech codecs only), or (3) assigning a value indicating the
general extent of impairment, derived from a prescribed subjective test.
The first is an older strategy and works well if the codecs of interest were
directly compared in the same subjective study. The main limitation is that
no bench testing can be done using this method. The second approach can
make measurements on arbitrary systems or components. However, these
methods depend on complex algorithms that are used to obtain the
distortion estimates, and they are not guaranteed to map onto the values
that would have been obtained in a subjective test, since the models are
incomplete. The final approach is used in the ITU E-Model3, where
subjective test results are used to generate an Equipment Impairment (Ie)
value for each codec. The Ie value is then used in the modeling. Ie is cited
in Table 5-2 below as an indicator of the baseline coding quality of each
codec described. All these methods are discussed in more detail in
Chapter 3.
Encoding delay
While encoding by a waveform codec is virtually instantaneous, encoding
by a frame-based codec may introduce a significant delay. Before a frame
can be processed, a frame's worth of speech must collect in the buffer.
Where a look-ahead is used, the encoding window stays a fixed time ahead
of the current frame, and this time is added to the delay.
The encoding delay can be calculated as (frame size + look-ahead +
queuing delay + processing delay). The queuing delay is the time between
a complete frame of speech becoming available and when that frame is
submitted to the CPU for processing. Processing delay is the time taken for the
processor to execute the algorithm for that frame. Queuing and processing
delay depend on a particular implementation and processor. A conventional
formula for estimating encoding delay in the absence of a specific
implementation is (2 x frame size) + (look-ahead). This formula assumes a
2. The tendency of authors to cite “the” MOS associated with a particular codec is misguided. No specific
MOS can be assigned to a codec. Different input signals will return different scores, and a change to
the test cases or reference cases used can shift the scores. It is the pattern of results for input signal
types with one codec and for different codecs that is key to understanding a codec’s performance.
This is discussed further in “Chapter 3 Voice Quality”.
3. Additional details on the definition and operation of the ITU E-Model are covered in
“Chapter 3 Voice Quality”.
worst-case scenario that the sum of queuing delay and processing delay is
equal to the framesize, which is the maximum tolerable delay for real-time
operation. It is based on considerations of efficiency: a powerful processor
could encode each frame very quickly, resulting in a delay only a few
microseconds longer than the framing delay. However, the processor would
then sit idle until the next frame is ready. This would require a relatively
expensive DSP. Instead, systems are often designed around a processor that
is powerful enough to finish the processing of one frame just as the next
one is ready. Using optimal scheduling and some other techniques, system
delay can be reduced to (frame size + look-ahead + processing delay).
Decoding is typically a small fraction of encoding time for CELP codecs.
Packet Loss Concealment (PLC) may add a few milliseconds.
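The worst-case estimate above is simple arithmetic and can be sketched in a few lines. The figures in the comments are the standard frame sizes and look-aheads for G.729 and G.723.1; everything else is just the formula from the text.

```python
def encoding_delay_ms(frame_ms, lookahead_ms):
    """Conventional worst-case estimate: queuing plus processing
    delay is assumed to consume one full frame time, giving
    (2 x frame size) + look-ahead."""
    return 2 * frame_ms + lookahead_ms

# G.729: 10 ms frame, 5 ms look-ahead -> 25 ms
print(encoding_delay_ms(10, 5))
# G.723.1: 30 ms frame, 7.5 ms look-ahead -> 67.5 ms
print(encoding_delay_ms(30, 7.5))
```

With optimal scheduling, as noted above, the first term drops to a single frame time plus the actual processing delay.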
Because of the integration of functions in the DSP chip, it is not always
possible to assess the delay associated with individual steps in the
processing. Integration allows more parallel processing and thus provides
the opportunity to reduce the end-to-end delay; on the other hand, it makes
it more difficult to partial out the contributions of the different functions to
the end-to-end delay.
Silence Suppression
Silence suppression is a technique that conserves network capacity by
identifying the portions of a signal that contain active speech and
sending only those portions, discarding or suppressing the portions that
do not. If you're familiar with silence suppression, you've probably been
thinking of it as a VoIP feature, so you may be surprised to find it lumped
in with impairments.
Silence Suppression capitalizes on conversational turn-taking: partners
alternately talk, then listen. On average, each voice path in a two-party call
is active 40-60% of the time. This means that about half the time, a channel
carries no speech signal. To reduce the amount of data to be sent across the
network, only data that encodes actual speech is sent. Data associated with
silent intervals between utterances is discarded.
Silence Suppression employs a Voice Activity Detector (VAD; also called
Speech Activity Detector or SAD) to determine whether there is speech on
the channel, and a noise estimator that samples the background noise
and sends coefficients describing the noise to the decoder end of the call.
The detector is actually configured to detect the absence of speech rather
than its presence. Inverting the logic in this way provides a
kind of fail-safe: if the signal is ambiguous in some way, or the detector
fails, the decision outcome will be that speech is present. Therefore, speech
content will not be suppressed accidentally. Two VAD parameters are
important to Silence Suppression performance: (1) the detection threshold
and (2) the hang time (which determines the minimum time that data will
be sent once the algorithm has determined that a signal is present). In
addition, the network operator must decide the peak capacity of each link,
which will determine the probability that the channel becomes overfilled by
active speech data.
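The interaction of the detection threshold and the hang time can be sketched as a per-frame transmit decision. This is an illustrative toy, assuming a simple per-frame energy measure and expressing both parameters in frames; production VADs use much richer features.

```python
def suppress(frame_energies, threshold, hang_frames):
    """Per-frame transmit decisions: send a frame when its energy
    exceeds the detection threshold, and keep sending for
    `hang_frames` frames after the last detection (the hang time)."""
    decisions, hang = [], 0
    for energy in frame_energies:
        if energy >= threshold:
            hang = hang_frames        # speech detected: restart hang timer
        decisions.append(hang > 0)    # transmit while the timer runs
        hang = max(0, hang - 1)
    return decisions

# One loud frame followed by silence: the hang time keeps the
# channel open briefly so word endings are not clipped.
print(suppress([9, 1, 1, 1, 1], threshold=5, hang_frames=3))
```

Raising the threshold or shortening the hang time detects more silence but risks the clipping impairments described below.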
At the decoder end, the speech bursts are played as sent, but the silences
between them are filled with a synthesized background noise called
comfort noise. The level and spectrum of the comfort noise are determined
from the coefficients received from the encoder end. This works best for
stationary and quasi-stationary noise, like car interior noise or crowd
babble. Dynamic noise, such as street noise, is more difficult to match.
Differences in level between the actual noise that arrives mixed with the
speech and the comfort noise generated at the decoder create an audible
contrast. This makes it obvious that something is interfering with the signal
or is being turned on and off.
4. Some vendors of enhanced methods claim to be able to repair speech with up to 30% packet loss.
These techniques invariably compare a PLC algorithm combined with an adaptive jitter buffer to
the performance of a system using an ordinary (or no) PLC and a fixed, moderately sized jitter
buffer. This algorithm does not repair 30% packet loss. Instead, it prevents loss associated with
late packets arriving at the jitter buffer, only to be discarded because they are too late to play out.
While this greatly improves the sound reproduction, it does so by adding delay, at least temporarily.
Should the network suffer high packet loss from drops or discards in the core, repair by these
algorithms will not be much better than standard PLCs.
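The comfort-noise fill described above can be sketched as white noise scaled to the level reported by the encoder's noise estimator. This is deliberately minimal: a real decoder also shapes the spectrum from the received coefficients, which this stand-in omits.

```python
import random

def comfort_noise(n_samples, rms_level):
    """Synthesize fill noise at the level reported by the encoder's
    noise estimator. White Gaussian noise is a stand-in here; real
    systems also match the spectrum, not just the level."""
    return [random.gauss(0.0, rms_level) for _ in range(n_samples)]

fill = comfort_noise(160, 0.01)   # one 20 ms frame at 8 kHz sampling
```

A level mismatch between `rms_level` and the true background produces exactly the audible contrast the text describes.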
The use of silence suppression can cause several impairments:
• front-end clipping, where the beginnings of utterances are removed
• background noise contrast, where noise used to fill the silent periods is noticeably different from the background noise audible during speech
• noise pumping, where peaks in the background noise trip the detector and background noise is transmitted momentarily (for the duration of the algorithm's hang time)
• data loss, caused when the total volume of active speech exceeds the capacity of the link it is carried over and some data must be discarded
The first three impairments result from detection errors, while the last
results from a provisioning trade-off between statistical fluctuation in
speech activity and the number of channels carried over the link. Design
parameters (threshold and time constants) determine the amount of silence
detected as well as the accuracy of the decisions. In general, aggressive
settings detect more silence, catching both long and short silent intervals,
even brief pauses within one talker's speech. Conservative settings, on the
other hand, remove only long periods of silence. At the same time, the
aggressive settings create more opportunity for errors. In quiet, speech is
easily differentiated from non-speech, so the parameter settings are not
critical for this case. Elevated background noise, however, can exceed the
threshold, preventing the detection of silence. Tuning the threshold and time
constants can prevent such errors, but this can lead to the inverse, where
lower-level speech is mistaken for silence. These errors can cause audible
artifacts in the output: the front ends of words may be clipped off or quieter
sections chopped out.
The total speech load to be carried across the network will depend on the
number of channels with active speech at any one time. Within aggregated
flows, silence suppression reduces the peak bandwidth needed to carry
voice traffic. Where the number of talkers is high (as in the network core),
the distribution of peak talker data rate will be based on the statistics of
large numbers, and the actual peak data rate will rarely exceed the capacity
of the link.
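The statistics-of-large-numbers argument can be made concrete with a binomial model: each of N one-way channels is independently active with some probability (about 0.4 per direction, per the activity figures above), and overload occurs when more channels are active than the link is provisioned to carry. This is an illustrative sketch, not an engineering rule.

```python
from math import comb

def p_overload(n_channels, capacity, activity=0.4):
    """Probability that more than `capacity` of `n_channels`
    independent one-way voice paths are active simultaneously
    (binomial model with the given per-channel activity)."""
    return sum(comb(n_channels, k) * activity**k * (1 - activity)**(n_channels - k)
               for k in range(capacity + 1, n_channels + 1))

# Provisioning 60% of full capacity is far riskier for a few
# channels than for a large aggregate:
print(p_overload(10, 6))    # small group
print(p_overload(100, 60))  # core-like aggregation
```

The overload probability falls sharply as the aggregate grows, which is why silence suppression pays off most in the network core.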
Transcoding
Transcoding refers to the successive encoding of a digital signal by
different codecs. Transcoding can be problematic for voice quality because
of cumulative degradation to the final output speech. Transcoding may
increase both signal distortion and delay. It is important to understand
when transcoding adds impairment and where it does not.
Generally, transcoding occurs because some networks use compression
coding to save bandwidth, and the compression codecs they choose are
different. For example, many VoIP networks use G.729, an 8 kb/s
telephony standard codec, to conserve bandwidth; digital wireless networks
use low bit-rate codecs over their radio channels. In addition, some network
features such as conferencing and voice mail may also add transcoding.
A special case of transcoding, called tandeming, occurs when a signal is
encoded, decoded, and reencoded by the same codec. The impairment is
similar to that for transcoding. TIA TSB-116 offers a good discussion of
the effects of transcoding.
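In E-Model terms (see Chapter 3), the distortion added by successive encodings is treated as approximately additive: each codec in the chain contributes its Ie value, and the sum is subtracted from the base rating. A rough sketch follows; the Ie figures in the comments are illustrative, not normative, and per footnote 5 the additive estimate ignores ordering effects.

```python
def r_after_transcoding(ie_values, r_base=93.2):
    """Approximate E-Model rating after a chain of encodings:
    equipment impairment (Ie) values simply add, and the sum is
    subtracted from the base rating (a rough, additive estimate)."""
    return r_base - sum(ie_values)

# A single low bit-rate encoding vs. the same codec tandemed
# through, e.g., a conference bridge (Ie = 10 is illustrative):
print(r_after_transcoding([10]))      # 83.2
print(r_after_transcoding([10, 10]))  # 73.2
```

The estimate becomes less accurate as more transcodings are combined, but it captures why each additional low bit-rate encoding is costly.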
Transcoder-free operation6
To maintain voice quality performance, we want to limit the number of
encodings by CELP and other low-bit-rate codecs to one. Packet networks
offer a unique opportunity to do this; speech that is already compressed can
be transported over IP without transcoding to an intermediate form.
Transcoder-free operation is a feature that ensures that speech signals can
be carried through the network without encountering more than one low-
bit-rate encoding.
5. Since the E-Model is an additive model, this estimate is approximate. It does not account for differences
in order of transcoding and becomes less accurate as more transcodings are combined.
6. Here, the term transcoder-free operation (TrFO) is used generically. In wireless technology standards,
TrFO refers to a specific signaling system to set up a clear TDM channel between the endpoints that
allows the encoded speech to be sent as data.
7. Where the bridge sits in a legacy network, and only some lines are calling from a network using packet
technology and/or speech compression, the situation is more complex. If there is only one caller from
such a network, there will be no additional impairments over the TDM, since that signal must be
unpacketized/uncompressed at the network interface in any case. Where there is more than one line
with packet or compressed speech, the analysis applies to signals they hear from the other talkers on
such networks, regardless of whether the conversions take place at the bridge or at an intermediate
interface.
designers are introducing hybrid algorithms that mix at some times but not
at others, depending on whether there is significant activity on more than
one line. The success of these new strategies is not yet determined, but it is
clear that the traditional conference bridge will not deliver the quality users
expect when combined with VoIP technology.
Impairment to conference calls over VoIP can be minimized by using
G.711 or G.726-32 as the VoIP codec. Delay will increase, but the increase
will be less than for compression codecs, and there will be no transcoding
impairment.
G.711
• The workhorse of the PSTN for digital trunking and switching
• The best quality conventional-band codec
• Handles nonspeech as well as speech
• Two coding laws are defined for G.711: A-law and µ-law. The voice quality produced by these two coding laws is very similar; transcoding from one to the other adds distortion equivalent to somewhat less than one unit of R or Ie.
• The two coding laws do not interwork (a signal encoded by one cannot be decoded by the other), but the data can be translated using a simple lookup table.
• A-law coding is used in most of the world and on international connections; µ-law is used in North America.
• Requires external packet loss concealment and silence suppression, which are easily added.
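As an aside on the companding itself, µ-law encoding of a 16-bit linear sample can be sketched with the familiar segment-and-mantissa construction. This is a software sketch in the style of the classic reference code, not the normative G.711 tables.

```python
BIAS, CLIP = 0x84, 32635

def mulaw_encode(sample):
    """Encode one 16-bit linear PCM sample as an 8-bit mu-law byte:
    clip, add bias, find the segment (exponent), extract a 4-bit
    mantissa, then invert all bits as mu-law requires."""
    sign = 0x80 if sample < 0 else 0x00
    sample = min(abs(sample), CLIP) + BIAS
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (sample & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (sample >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF

print(hex(mulaw_encode(0)))       # 0xff (silence)
print(hex(mulaw_encode(32767)))   # 0x80 (positive full scale)
```

A-law uses a slightly different segment layout and bit inversion, which is why the two laws do not interwork and must be translated by table.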
G.726
• The 32 kb/s rate is commonly used for compression in TDM networks, private networks, undersea cables, and satellite links.
• Specified for low-power (in-building) digital wireless systems, such as CT2 and DECT
• Common in ATM and FR environments, but not found in many VoIP systems yet
• Sounds slightly raspier on active speech than G.711. The noise floor is slightly higher, but is acceptable for both public and private networks.
• Quality of the lower rates (24 and 16 kb/s) is not generally acceptable for commercial telecommunications, although these rates are sometimes used in private networks.
• The Synchronous Coding Adjustment (SCA) is used to avoid cumulative quantization distortion from multiple conversions between G.711 and G.726.
• More sensitive to data loss than G.711, because the decoder can lose its adaptive reference and takes a finite time to reconverge.
• Requires external packet loss concealment and silence suppression, which are easily added.
G.729, G.729A8:
• G.729A (that is, Annex A) is a reduced-complexity version of G.729
8. When reference is made to G.729, it is almost always the 8 kb/s rate that is intended. However, other
rates are defined in G.729 Annexes. G.729A is a reduced complexity version of the 8 kb/s codec
defined in the main body of the standard. In this book, references to G.729 without qualification may
be taken to mean the 8 kb/s algorithms, either G.729, G.729A, or both.
G.723.1:
• Developed for use in video teleconferencing
• Early de facto standard for “shrink-wrap” VoIP applications
• Slightly poorer baseline quality than G.729, plus relatively long delay
• Built-in packet loss concealment
• Runs at two rates, 6.3 kb/s and 5.3 kb/s
GSM-EFR:
• The GSM EFR (Enhanced Full-Rate) wireless speech coding standard
• Baseline quality with speech signals is essentially equivalent to G.711
• Tandeming degradation is less than for the earlier compression codecs
• Built-in silence suppression feature, called DTX (discontinuous transmission)
AMR:
• Adaptive Multi-Rate (AMR) codec for GSM wireless
• Developed to optimize quality over wireless channels: the coding rate adapts to current channel conditions. The bit rate of the speech codec is reduced in the face of data loss, and the bits freed up are transferred to the error protection function.
• Has eight rate modes available (half-rate operation uses the lower four modes).
• The top rate mode is identical to the GSM-EFR codec; the half-rate mode is equivalent to IS-641.
iLBC, BV16:
• Selected as low bit-rate codecs for the CableLabs PacketCable standard.
Voice performance
For maximizing voice performance with a codec offering low distortion
and low delay, G.711 is the natural choice. G.711 will provide the best
intranetwork voice quality, and will optimize interworking with other
networks. Conferencing and voice mail performance will be similar to that
with TDM. Running G.711 with 10 ms packets will offer the best end-to-
end delay, but where capacity considerations prevent that, G.711 with
20-ms packets is a good alternative. Using G.711 with careful network
provisioning, it is possible to shift from TDM to VoIP without users being
aware of any change in the infrastructure.
Note that voice mail can suffer more from coding distortion than live
conversation, because the listener does not get a chance to ask for repetition
of any unintelligible parts. This can be a significant problem if the voice
mail system has its own compression codec, in which case the signal may
be transcoded during storage and again during playback.
Sometimes G.711 cannot be used, for instance, with low-speed links from
remote sites, when teleworkers or road warriors (salesmen or other users
who are often out of the office) dial in, or where the LAN has insufficient
margin for G.711 operation. G.726-32 is the next best codec, but many
VoIP gateways have not yet implemented G.726.
Capacity
Where bandwidth is the top priority, there is a strong push to adopt the
lowest bit-rate codec. Where a codec is chosen based on bandwidth, make
sure that the quality is acceptable to your users on all their common calling
scenarios. Often when justifying a codec choice, the codec quality
considerations are limited to two-party calls over the immediate network.
Encoding delay, the kinds and quality impact of any transcoding, and
the quality of long-distance calls or calls to other networks, such as
wireless, are rarely considered. Only after the network is up and running do user
complaints focus attention on performance shortcomings.
When selecting a low bit-rate codec, remember that the bandwidth
efficiency obtained will be less than the compression ratio as determined by
the bit rate alone. VoIP packets are small, meaning that the header accounts
for a significant portion of the bits. For example, G.729 has a compression
ratio of 8:1 (compared to G.711), but the bandwidth efficiency of G.729
packets (with one frame per packet) is approximately 4:1. For networks
running with low proportions of voice traffic compared to data traffic, the
savings in terms of percentage of overall capacity may not be worth the
cost in terms of voice performance. This trade-off must be examined
independently for each network.
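The header effect can be checked with back-of-envelope arithmetic. The sketch below assumes uncompressed 40-byte IP/UDP/RTP headers and ignores layer-2 framing; both assumptions matter, since header compression and link overhead shift the ratios.

```python
HEADER_BYTES = 40  # IP + UDP + RTP, uncompressed (assumption)

def voip_stream_kbps(codec_kbps, frame_ms, frames_per_packet=1):
    """Total one-way IP bandwidth for a voice stream, counting the
    packet header on every packet."""
    payload_bytes = codec_kbps * frame_ms * frames_per_packet / 8
    packets_per_sec = 1000 / (frame_ms * frames_per_packet)
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_sec / 1000

print(voip_stream_kbps(8, 10))    # G.729, one 10 ms frame per packet
print(voip_stream_kbps(64, 20))   # G.711 with 20 ms packets
```

Under these assumptions the G.729 stream costs 40 kb/s against 80 kb/s for G.711 with 20-ms packets, so the 8:1 codec ratio shrinks to roughly 2:1 at the IP layer; packing more frames per packet recovers efficiency at the cost of delay.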
Where a low bit-rate codec is used, it may be advisable to specify a higher
rate for certain call scenarios that are particularly vulnerable to transcoding
degradation. Equipment features and the network architecture will
determine whether it is possible to implement contingent selection of
codec. Three-way and n-way conferencing and calls to or from cellular/
wireless networks are two situations that show unavoidable degradation
with additional low bit-rate coding.
Bandwidth calculators are useful in understanding the capacity
implications of various network provisioning choices.
Delay
Networks that will carry calls from cellular/wireless access, international
calls, or private networks with global reach must pay close attention to
delay. Your choice of codec and packetization can increase delay across the
network. Increasing the speech payload of the packets will improve
any clue that something is wrong. It may be possible to repair data losses
too well!
Restrictions on transcoding
For equivalent-to-TDM wireline access, there can be no transcoding to
frame-based compression codecs. For equivalent-to-2G wireless mobile-to-
land (2G being current digital cellular operating with TDM backhaul), only
one frame-based encoding can be tolerated.
Audio Codecs
The term audio codec can refer to all codecs intended to digitize sound, but
it often refers specifically to codecs intended to handle all signal types, or
especially nonspeech signals such as music. In general, the operation of an
audio codec is similar to that of a speech codec. Audio codecs are likely to
model the human auditory system, whereas speech codecs often model the
human vocal tract.
Audio codecs are frequently used in IP applications and computer-based
audio applications. Audio streaming and exchange of compressed music
files are two common ones. Such applications are less likely than speech to
be real time, although some components of multimedia applications such
as games may include audio signals. Commonly used audio codecs are
summarized below.
Some general purpose audio codecs combine two different algorithms, one
for speech and one for nonspeech. This arrangement is essentially two
codecs, each operating on the portions of the signal for which it is best
equipped. A detection scheme is used to determine which algorithm is
appropriate for any particular signal. This technique allows the codec to
obtain better quality for a given compression (or higher compression for a
given quality) than might be achieved with a single coding algorithm, since
the use of specialized algorithms allows the codec to make simplifying
assumptions about the characteristics of the content.
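The two-algorithm arrangement can be sketched as a simple dispatcher. The classifier and per-type encoders here are hypothetical placeholders standing in for the detection scheme and the specialized coding algorithms.

```python
def make_hybrid_encoder(classify, encoders):
    """Build an encoder that routes each signal segment to the
    specialized algorithm chosen by the detection scheme."""
    def encode(segment):
        kind = classify(segment)          # e.g. "speech" or "music"
        return encoders[kind](segment)
    return encode

# Toy stand-ins: the segment carries its own class, and each
# "encoder" just tags the output with the algorithm used.
encode = make_hybrid_encoder(
    classify=lambda seg: seg["kind"],
    encoders={"speech": lambda s: ("celp", s["data"]),
              "music":  lambda s: ("transform", s["data"])},
)
print(encode({"kind": "music", "data": [0.1, 0.2]}))
```

The real engineering difficulty lies in the classifier: a misclassified segment is compressed with the wrong set of simplifying assumptions.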
Audio codecs offer multiple data rates. For some codecs, the higher rates
offer lossless compression. Recall that lossy compression is generally used
for real-time applications such as telephony. Lossless compression allows
complete recovery of the original; thus, there is no coding distortion.
However, there are limits to the amount of compression that can be
achieved with lossless techniques, and further compression must be lossy.
In the realm of streaming (and stored) audio, it is important to differentiate
between codecs, file formats, and players. Codecs convert the audio signal
from one form to another. The file format is a defined structure that
provides information needed by a player to parse the incoming data. (An
example of a format is .wav.) The format includes such information as the
codec and data rate used for compression. A player is a software device
used to play back the audio signal, and contains one or more decoders
associated with different codecs. The player is equipped to read the format
information and select the right decoder and any settings to restore the
analog output.
Some commonly used audio codecs are royalty-free standards. Others offer
a licence-free decoder, but licence the encoder to content providers. Some
audio codecs commonly used for streaming and music file exchange are
described below.
MP3
The MP3 codec is the de facto standard for music files stored and played in
the computer environment. It is closely associated with the Internet because
of music file sharing and the associated copyright disputes. It was
developed as an audio codec for digital video, which is hidden in its name:
MP3 stands for Motion Picture Experts Group (MPEG) 1, Layer 3.
The MP3 codec is used to compress audio files to reduce the space needed
to store them. MP3 compresses to several different data rates, which trade
off fidelity for file size. Because MP3 is used for listening only, encoding
delay is not an important factor.
MPEG-4 AAC
This new audio codec is part of the latest MPEG video coding standard. It
has been predicted that MPEG-4 AAC will displace MP3 as the de facto
music file standard because of its improved quality and features.
Ogg Vorbis
Ogg Vorbis is a creation of the Xiph.Org Foundation, a nonprofit developer
of tools for the Internet. It consists of two separate tools: the Vorbis codec,
which is a free-form variable bit-rate codec, and the Ogg transport
mechanism, which supplies free-form framing, sync, positioning and error
correction. Both the Vorbis codec and the Ogg transport mechanism are
available royalty-free. The Vorbis codec can be used with RTP, rather than
Ogg, as the transport protocol.
RealAudio
RealAudio* is a proprietary coding system that offers free use of its
decoder, in the form of RealPlayer*. RealPlayer is used extensively for
audio streaming and audio clips offered over the Internet. RealAudio (and
the associated RealPlayer) includes a number of decoders ranging from
very low to very high bit rates. RealAudio 10, the release current at this
writing, uses data rates ranging from 12 to 800 kb/s (recall that monaural
linear PCM requires 705 kb/s). The codecs sport various rates, frequency
bands, and options that optimize for specific audio content and application
(for example, speech/music, mono/stereo, Dolby*). RealAudio 10 includes
the MPEG-4 AAC codec as one of its operating modes.
RealAudio includes a packet loss concealment feature. In addition, the
Surestream feature can dynamically adapt to changes in the available bit
rate to maintain a continuous playout even where the access channel may
be intermittently shared.
Wave
Wave is an audio data file format, not a codec. The Wave format is a
version of an Interchange File Format (IFF, a standard established for all
kinds of data, from sound files to pictures to musical scores). Wave files are
designated as .wav. They include information about the type of data in the
file, how the data is encoded, the length of the file, and so on. The format
also specifies how the data is structured (chunked) inside the file, so that
players that read the file know what setup to use and how to parse the data.
Video Codecs
Analogous to speech and audio codecs, video codecs convert standard
analog video signals into digital Pulse Code Modulated (PCM) signals or
compress them. Because of the much greater information content of a
video signal, compression is even more important for video than for audio.
The nature of the analog video signal, the way video information is used
and transmitted, and the characteristics of human visual perception all
place special requirements on video codecs. Some background on the
analog video signal will assist in understanding the general digitization
process. The sidebar below describes how analog video signals are
generated and the formats used to transfer the signals to local and remote
receivers.
is lossy, but with careful design of the compression technique, these losses
will not be detrimental to the perceived image.
Processing is done to identify redundancies between adjacent frames. For
example, a static background need not be updated until there is a change.
A moving object either remains approximately static on the screen while
being tracked by the camera, in which case the background will move, or
the object moves across the display and the background remains static.
Compression algorithms also look for a number of other common types of
movement to facilitate compression. Such analyses identify redundant
information that can be removed, and allow a high proportion of the
remaining information to be coded in differential terms.
Video codecs use either eight or ten-bit encoding. Composite video is
normally eight-bit encoded. Higher definition signals are sometimes ten-bit
encoded, in which case the most significant eight bits are regarded as the
integer part and the least significant two bits are regarded as fractional.
This allows decoding equipment designed to handle eight-bit streams to
handle the ten-bit words by simply truncating them to eight bits.
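The truncation rule amounts to a two-bit right shift, keeping the integer part and discarding the fraction:

```python
def ten_bit_to_eight_bit(word10):
    """Drop the two fractional (least significant) bits of a
    ten-bit video sample, keeping the eight-bit integer part."""
    return word10 >> 2

print(ten_bit_to_eight_bit(0b1011001110))  # -> 0b10110011 (179)
```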
Synchronization of sound and video applies to both analog and digital
signals. However, it is perhaps more of a problem in digitized video
because of differences in encoding delay for speech/audio codecs and
video codecs. Audio and video signals may be separated, which is
beneficial where the identity of the far end receiver is not known, because it
may be possible to capture and play the audio component even where there
is no video receiver and display. However, separate streams for audio and
video must be synchronized for playback within certain tolerances;
otherwise, the quality of the playback is reduced. For full motion video, the
audio signal should not lead the video image by more than 20 ms, nor trail
it by more than 40 ms.
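Expressed as a check on audio-video skew (negative values meaning audio leads, per the tolerances just quoted):

```python
def lip_sync_ok(audio_minus_video_ms):
    """True if the skew is within tolerance for full-motion video:
    audio may lead by at most 20 ms or trail by at most 40 ms."""
    return -20 <= audio_minus_video_ms <= 40

print(lip_sync_ok(-15), lip_sync_ok(30), lip_sync_ok(-25))
```

Note the asymmetry: viewers tolerate audio that arrives late (as it does naturally over distance) better than audio that arrives early.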
file which decoder to use. There are three players in common use today:
Apple’s* QuickTime*, Windows Media* Player, and RealPlayer*. These
are described below.
Players take streamed information (information being presented in real
time) or information in a recorded file and convert it back to a form suitable
for presentation to a driver and hence to a display device. These players can
accept information in many common file and stream formats. File and
stream formats contain not only the information to be presented, but also
information telling the player how to decode the information in the stream.
Among other things, the information indicates which decoder and data rate
to use. Typically, files consist of frames (file frames, not to be confused
with video frames) with a file header containing the control information
followed by the video data. Most players can begin decoding at the start of
any frame and can ignore any partial frame data in the bit stream prior to
the next start-of-frame.
Even when they only provide the decoding function, video decoders are
usually referred to as codecs. The need for a decoder assumes the prior use
of an encoder. Decoders are sometimes referred to by the coding standards
to which they refer. For example, H.261 and H.263 are video codecs, as is
MPEG. Strictly, MPEG is both a file format and a codec. The MPEG
standard includes enough detail to define a file or stream that an MPEG
decoder can play. H.261 has been largely superseded by H.263. MPEG is
included here in the codec section, but is more strictly a standard defining
the data structure that a codec will act upon.
Video Codecs
Sorenson. Sorenson is a proprietary codec from Sorenson Media*. Apple
uses it in their QuickTime player. It has a number of encoding features,
including bidirectional (B-frame) encoding. In addition, it can drop
B frames when short-term restrictions on data rate require it, which
makes it very tolerant of variations in data
rate. It also automatically determines video frames where new scenes
begin, based on how much video changes between adjacent frames, and
flags these as key frames, which are then used to begin a sequence of
decoding.
Cinepak*. Cinepak was an early codec that was rapidly established as a
standard. Cinepak is an asymmetric codec: compression is slow and cannot
accept video input in real time, but decoding runs in real time. This
ensures a smooth playout of streamed material.
Cinepak is included with QuickTime. Like Sorenson, it identifies key
frames based on the difference between adjacent video frames, and flags
these to indicate the beginning of a decoding sequence.
H.263. H.263 was designed for and is primarily used for video
conferencing and is a symmetrical real-time codec. It is limited to displays
Players
QuickTime. QuickTime is a comprehensive system from Apple and is
among the most popular coding and decoding systems available today. It is
a comprehensive system that handles video, still images, music, and speech
References
General References:
ITU-T Recommendation P.862, Perceptual evaluation of speech quality
(PESQ): An objective method for end-to-end speech quality assessment of
narrow-band telephone networks and speech codecs, Geneva: International
Telecommunication Union Telecommunication Standardization Sector
(ITU-T), 2001.
ITU-T Recommendation G.107, The E-Model, a computational model for
use in transmission planning, Geneva: ITU-T, 1998.
Codec Standards:
733-A, ANSI/TIA-733-A-2004, High Rate Speech Service Option 17 for
Wideband Spread Spectrum Communications Systems,
Telecommunications Industry Association, 2004.
BV16, J. Chen et al., BroadVoice™16 Speech Codec Specification, Version
1.2. October, 2003. (For further information, contact PacketCable, Cable
Television Laboratories, Inc.)
EVRC, ANSI/TIA-127-A-2004, Enhanced Variable Rate Codec Speech
Option 3 for Wideband Spread Spectrum Digital Systems,
Telecommunications Industry Association, 2004.
ITU-T Recommendation G.711, Pulse code modulation (PCM) of voice
frequencies, Geneva: ITU-T, 1988.
ITU-T Recommendation G.722, 7 kHz Audio-Coding within 64 kbit/s,
Geneva: ITU-T, 1989.
ITU-T Recommendation G.722.1, 7kHz Audio - Coding at 24 and 32 kb/s
for hands-free operation in systems with low frame loss, Geneva: ITU-T,
1999.
ITU-T Recommendation G.723.1, Dual-rate speech coder for multimedia
communications, (includes Annex A: Silence Suppression, and Annex C:
Channel Coding Scheme for use in wireless applications.) Geneva: ITU-T,
1996.
ITU-T Recommendation G.726, 40, 32, 24, 16 kbit/s Adaptive Differential
Pulse Code Modulation (ADPCM), (includes Annex A: Extensions of
Recommendation G.726 for Use with Uniform-Quantized Input and
Output-General Aspects of Digital Transmission Systems.), Geneva: ITU-
T, 1990.
ITU-T Recommendation G.728, Coding of Speech at 16 kbit/s Using Low-
Delay Code Excited Linear Prediction, Geneva: ITU-T, 1992.
Section II:
Legacy Networks
Legacy networks rely largely on Time Division Multiplexing (TDM) and
Synchronous Optical NETwork (SONET) technologies. Time Division
Multiplexing networks were designed and built specifically for one real-
time application, namely Voice. TDM networks now carry data as well as
voice, but make no distinction between real time and non–real time
applications because all data are treated as real-time. TDM is exceptionally
good at delivering real-time service, and Chapter 6 reviews how it achieves
such high performance.
TDM uses time slots to combine individual calls together on faster links in
the network core. Optical networking provides very fast links, and the
SONET standard was developed to facilitate interworking between optical
networks. Synchronous Digital Hierarchy (SDH) is the international
equivalent of SONET.
SONET is not just for Telcos—virtually all optical Layer 1 uses SONET.
Large Enterprises and even small Enterprises use SONET for long (> 15
km) distances. SONET is agnostic about the form of the data riding on it,
and continues to be used as Layer 1 for IP links.
Chapter 6
TDM Circuit-Switched Networking
Stephen Dudley
Figure 6-1: Transport path diagram (signaling: SS7, T1, ISDN; voice over
SONET/TDM)
For the TDM network, our transport path diagram has components that
don't exist in the packet-switching network, as well as components that we
would recognize in the other diagrams in the book. Shown in the diagram
are the following components:
The voice trunk signaling mechanisms most commonly used in the
network (MF, ISDN, SS7).
The voice path, which always uses a G.711 codec when the Public
Switched Telephone Network (PSTN) is employed.
Other speech codecs besides G.711 can be used, but they have to be set up
on a dedicated path without PSTN switching; one example, using an ATM
transport, is shown in Figure 6-1. When this kind of dedicated
connection is used, a wide variety of codecs is available.
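Since the PSTN voice path always uses G.711, it is worth seeing how little computation the codec needs. The sketch below (not from this book; the constants follow the common mu-law companding implementation) maps 16-bit linear PCM samples to the 8-bit G.711 mu-law codewords that fill a 64 kb/s channel (8000 samples/s x 8 bits):

```python
# Sketch of G.711 mu-law companding (the North American PSTN variant).
BIAS = 0x84   # 132, added so the segment search always finds a set bit
CLIP = 32635  # clamp so adding the bias cannot overflow 16 bits

def linear_to_ulaw(sample: int) -> int:
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, 7..14
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF  # G.711 inverts the bits

print(hex(linear_to_ulaw(0)))   # 0xff: silence encodes to all-ones
```

Eight such lookups per millisecond per direction is trivial work, which is one reason G.711 remains the universal PSTN codec.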
Concepts covered
Why the TDM Network is built around a 64 kb/s channel.
How a telephone call proceeds through the network.
How the digital switch uses a dedicated connection to end users
(line) and a nondedicated connection between switches (trunk).
How the digital switching network uses a switch hierarchy with a
trunking overlay of high usage to minimize the information
maintained in call routing tables.
TDM principles
The public telephone network was being converted to digital in the 1970s.
Even at that time, data rates for digital signals were fast enough that a
single transmission facility could send far more data than was needed to
support a single voice conversation. The technique of multiplexing multiple
digital streams onto a single facility by assigning each stream to a
particular block of time in round robin fashion, called Time Division
Multiplexing (TDM), became the basis for the Public Switched Telephone
Network (PSTN).
Figure 6-2: Time Division Multiplexing (a framing sequence followed by
one time slot per DS0 in each frame of the data stream)
The diagram illustrates how multiple signal streams (called DS0 here) can
be put together with a Framing sequence to create a complete bit stream.
Each data stream is assigned a time slot to transmit. The framing sequence
contains a recognizable pattern that can be easily detected in the data
stream and that has a definite start and endpoint. By knowing the starting
point of the framing sequence, it is possible to know which bits belong to
which data streams.
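The framing-plus-time-slot idea can be sketched in a few lines. The illustration below is a simplification (real T1 framing uses a single framing bit per frame, not a framing octet, and the pattern here is invented): one octet per DS0 is interleaved into each frame, and the receiver recovers each stream from its slot position relative to the framing pattern.

```python
# Simplified byte-interleaved TDM: each DS0 contributes one octet per frame.
FRAMING = b"\x9c"  # illustrative framing pattern, not a real T1 sequence

def tdm_mux(streams, frames):
    out = bytearray()
    for i in range(frames):
        out += FRAMING
        for s in streams:               # fixed time-slot order
            out.append(s[i])
    return bytes(out)

def tdm_demux(bitstream, n_streams):
    frame_len = 1 + n_streams
    streams = [bytearray() for _ in range(n_streams)]
    for off in range(0, len(bitstream), frame_len):
        assert bitstream[off:off + 1] == FRAMING   # locate the frame start
        for k in range(n_streams):
            streams[k].append(bitstream[off + 1 + k])
    return [bytes(s) for s in streams]
```

Because slot positions are fixed, no per-packet addressing is needed: knowing where the frame starts is enough to know which bits belong to which call.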
Multiplexing
A channel capable of carrying one call is called a DS0. The faster the
transmission rate, the larger the number of calls that can be combined onto
it. Combining channels together is called multiplexing, and the output of one
multiplexing stage can itself be combined with others at the next stage of
the hierarchy.
Figure: A digital switch, with dedicated lines to telephones on the line side
and trunks between switches.
Sidebar: Erlangs
The number of trunks needed to support a given calling volume was
first studied by the Danish mathematician A. K. Erlang. Because of the
statistical nature of call arrivals, it is not possible to add up the total
number of minutes or seconds that people want to talk on the phone and,
with a little arithmetic, calculate the number of trunks needed. However,
since it is based on statistics, the relationship between the number of
trunks needed to support a given calling volume 99% of the time, or
99.9% of the time, is always the same; so, traditionally, these values have
been captured in tables. As you might expect, they are often called
Erlang tables or Poisson tables because of the basic distributions
involved.
The one piece of direct arithmetic that does go into an Erlang table is the
calling volume. It is measured either in units of call seconds or in units
of (surprise!) Erlangs. One Erlang is one trunk continuously used for one
hour, which is directly equivalent to 3600 call seconds (60
seconds x 60 minutes = 3600 call seconds).
Traffic fluctuates over the course of the day, and the hour with the most
traffic is called the busy hour. Traffic also fluctuates from day to day in both
a statistical and a nonstatistical way. In North America, the busiest hour of
the whole year is likely to be sometime on Mother's Day. Almost all
telephone networks are engineered to deliver 99% or more of all traffic in
that busiest hour of the year.
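The arithmetic behind an Erlang table can be sketched directly. The book only describes the tables; the sketch below uses the standard numerically stable recurrence for the Erlang B blocking probability, so the choice of formula is ours:

```python
# Erlang B blocking probability via the recurrence
# B(0) = 1; B(n) = A*B(n-1) / (n + A*B(n-1)), with A = offered Erlangs.
def erlang_b(traffic_erlangs, trunks):
    b = 1.0
    for n in range(1, trunks + 1):
        b = traffic_erlangs * b / (n + traffic_erlangs * b)
    return b

def trunks_needed(traffic_erlangs, grade_of_service=0.01):
    # Smallest trunk group whose blocking is at or below the target
    n = 1
    while erlang_b(traffic_erlangs, n) > grade_of_service:
        n += 1
    return n

busy_hour_call_seconds = 36000           # measured calling volume
traffic = busy_hour_call_seconds / 3600  # 3600 call seconds = 1 Erlang
print(trunks_needed(traffic))            # 18 trunks carry 10 Erlangs at 1%
```

Note the statistical penalty the sidebar describes: 10 Erlangs of offered load needs 18 trunks, not 10, to keep blocking at 1%.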
Signaling
Two types of signaling systems must exist within the switch, one for lines,
and one for trunks. Line side signaling communicates with telephone sets.
There are dozens of signaling protocols that might be used, each specific to
a particular country, type of telephone switch, or telephone set. Trunk side
signaling, in contrast, operates between switches.
ISDN signaling
Robbing bits from every channel to convey signaling information had
limitations for data communications and setting up an end-to-end
connection was slow. Another mechanism to convey signaling information
is to rob a complete channel from a DS1 or E1 for signaling purposes to
allow the other channels to be delivered at full rate (that is, 64 kb/s). This
scheme is implemented in the Primary Rate Interface1 (PRI) of ISDN
(Integrated Services Digital Network). A PRI is essentially one DS1 or one
E1. For a DS1, channel 23 (the last one in the 0-23 sequence) is used for
signaling. For an E1, timeslot 16 is used (timeslot 0 carries framing). These
channels are called D (data) channels and the other 23 (or 30) are called B
(bearer) channels.
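The channel layout just described can be written down directly. The sketch below encodes the standard assignments (DS1: 23B+D with the D channel in slot 23 of 0-23; E1: 30B+D with framing in timeslot 0 and the D channel in timeslot 16); the function name is ours:

```python
# Map a PRI facility type to its bearer (B) and signaling (D) timeslots.
def pri_channels(facility: str):
    if facility == "DS1":
        d = {23}                                          # 23B+D
        bearer = [ts for ts in range(24) if ts not in d]
    elif facility == "E1":
        d = {16}                                          # 30B+D
        bearer = [ts for ts in range(1, 32) if ts not in d]  # slot 0 = framing
    else:
        raise ValueError(facility)
    return bearer, sorted(d)

b, d = pri_channels("E1")
print(len(b), d)   # 30 B channels, D channel in timeslot 16
```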
Q.931
The signaling protocol used by ISDN facilities is Q.931. The Q.931
protocol defines the messages sent over the D-Channel, both in terms of the
message format and the message sequencing. Figure 6-9 illustrates a
typical call connection and tear-down sequence between two ISDN phones
connected to a Private Automatic Branch Exchange (PABX).
1. Note that if someone has an ISDN phone, it would not have PRI signaling. Instead, it would have BRI
signaling. A Basic Rate Interface (BRI) is two 64 kbps bearer channels and one 16 kbps D channel.
There are D Channel handlers for line side peripherals as well that permit signaling information to be
relayed to the call control functions of the switch.
Figure 6-9: Q.931 call setup and tear-down through originating, tandem,
and terminating switches: Setup, Call Proceeding, Alerting, Connect, and
Connect Acknowledge on setup; Disconnect, Release, and Release
Complete on tear-down.
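The message order in Figure 6-9 can be captured as a small table of allowed transitions. The validator below is purely illustrative (Q.931 defines many more messages and states than this), but it encodes the setup and tear-down order just described:

```python
# Allowed next messages for a simplified Q.931 call, keyed by the last
# message seen. Call Proceeding is optional, per the sequence in Figure 6-9.
NEXT = {
    None: {"SETUP"},
    "SETUP": {"CALL PROCEEDING", "ALERTING"},
    "CALL PROCEEDING": {"ALERTING"},
    "ALERTING": {"CONNECT"},
    "CONNECT": {"CONNECT ACKNOWLEDGE"},
    "CONNECT ACKNOWLEDGE": {"DISCONNECT"},
    "DISCONNECT": {"RELEASE"},
    "RELEASE": {"RELEASE COMPLETE"},
}

def valid_sequence(msgs):
    state = None
    for m in msgs:
        if m not in NEXT.get(state, set()):
            return False
        state = m
    return state == "RELEASE COMPLETE"   # call must be fully torn down
```

A complete call runs the full ordered list and ends with Release Complete; anything out of order, or a call abandoned mid-sequence, is rejected.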
SS7 signaling
The last way that signaling information can be provided is through an
entirely outboard communication system. This method is the way that
CCS7 (Common Channel Signaling System number 7) is implemented.
Figure 6-10 shows the interconnection of switches by way of the CCS7
Network.
Figure 6-10: Interconnection of switches through the CCS7 network.
Signaling links connect each switch's call control to Signal Transfer Points
(STPs) in the SS7 network, while trunks carry the voice path between
switches. (Trk = trunk interface; STP = Signal Transfer Point; ISUP =
ISDN User Part.)
Signaling links
SS7 messages are exchanged between network elements over 56 or 64 kb/s
bidirectional channels called signaling links. From the perspective of
setting up a connection to a VoIP gateway, two or more of these links will
need to be set up. The company that owns the SS7 network will require that
the equipment being connected goes through a rigorous certification
process. The company depends heavily on its network and needs to be
sure that any attached equipment, and the messages it generates, behave
in an expected manner, and that unwanted messages do not flood the
network.
Figure: ISUP call flow between originating and terminating switches: a
Continuity Message, an Address Complete Message (ACM) that causes
ringback to be heard, an Answer Message (ANM) when the called party
goes off hook, and a Release Complete at tear-down.
Chapter 7
SONET/SDH
Anthony Lugo
Figure: Transport path diagram, with the SONET/TDM layer highlighted
beneath the packet transport (ATM, FR, Ethernet, DOCSIS) and real-time
application layers.
Concepts covered
The evolution of SONET, which is driven by the next generation
applications and services of the telecommunications market.
The SONET protocol functionality within the network layers.
SONET Network element types and attributes.
A variety of flexible solutions that ensure network survivability,
redundancy and exceptional reliability.
The significant role synchronization plays within the SONET world.
Introduction
Today we deliver information at the speed of light, and the fiber optic
medium has proven to be a viable solution as a delivery agent.
Synchronous Optical Network (SONET) is an optical standard that permits
multiple vendors to deliver data, voice and multimedia applications to
residential and business customers. The purpose of this chapter is to define
and illustrate the concept of Synchronous Optical Network (SONET),
allowing the reader to understand the associated SONET structure and how
it can deliver value added applications to the telecommunications industry,
benefiting the consumer with new services.
Figure 7-2 illustrates the progressive evolution of SONET, beginning with
the foundation of traditional private line services growing into the optical
private line service and entering the new market demand for the SONET
next generation data applications and services.
Figure 7-2: The evolution of SONET service flexibility and development:
traditional private line service (DS0, DS1, DS3); optical private line
service (OC-3c/STM-1, OC-12c/STM-4, OC-48c/STM-16, OC-192c/
STM-64); and SONET next-generation data services (Fast Ethernet, ATM,
ESCON, RPR, SAN, FDDI, Fibre Channel, HDTV video, GbE LAN/
WAN, 10 Gigabit Ethernet).
Overview
The SONET network has evolved over the years due to the demands of
new applications and services. The pressure on the telecommunications
industry to meet the demands of the consumer market has migrated the
once-traditional SONET legacy time division multiplexing networks into
the next generation of SONET. SONET enters a new stage of growth with
the telecommunications market striving for reduced capital and operating
expenditures. The growing demand for high-speed internet access, network
security, and new application delivery, together with the full drive of a
competitive market, demands utilization of the SONET network
infrastructure, which has been critical for successful business operation.
What once was LAN, MAN or WAN has been blended into the SONET
fiber cloud, creating flexible, secure applications, such as ATM, Storage
Area Networks (SAN) and high bandwidth applications.
As applications and services have evolved, so has the optical network
backbone infrastructure. In the early development of the optical legacy
network, the majority of configurations were simple ring or linear
applications.
Today's optical network involves more complexity, as the network element
has evolved into a hub or meshed network, creating more flexibility and
density per Network Element (NE). A single NE that used to add and drop
traffic for a single BLSR ring can now terminate multiple different types
of network configurations, such as BLSR, UPSR, linear 1+1, and 0:1
unprotected.
This advancement in the Optical Networks arena allows for the
interconnecting or meshing of networks, to a scale no one could have
foreseen within a multivendor environment. What used to be a separation
of networks in terms of LAN, MAN, and WAN, is now seamless, due to the
SONET optical network backbone.
GbE application
Ethernet service at Layer 2, using a SONET/SDH circuit
connecting multiple sites, can offer point-to-point Ethernet connectivity for
data centers, remote data backup sites, and servers without the addition of
dedicated data equipment. For the carrier, it broadens the service offering
and increases revenue potential. GbE service also benefits from the Layer 1
SONET/SDH protection schemes.
The addition of GbE support provides for more efficient bandwidth usage
of the 10 G signal, by being able to have different size payloads to carry the
GbE traffic and by being able to mix different types of services within the
same optical link.
RPR application
Resilient Packet Ring is a distributed switch (IEEE 802.1D bridge
functionality) application that is connectionless and packet-based and
allows a shared-bandwidth networking solution for Ethernet traffic over a
SONET/SDH backbone. RPR offers 10/100/1000Base-X interfaces,
providing an efficient carrier-grade method of connecting to routers and
LANs using a Layer 2 add/drop/pass-through technology.
SONET terminology
This section describes SONET terminology including SONET level rates
and SONET layers and architecture.
An STS-N has exactly N times the rate of 51.84 Mbit/s (for
example, an STS-12 is exactly 12 x 51.84 = 622.080 Mbit/s).
SONET            SDH       Rate
OC-1/STS-1       STM-0     51.84 Mb/s
OC-3/STS-3       STM-1     155.52 Mb/s
OC-12/STS-12     STM-4     622.08 Mb/s
OC-48/STS-48     STM-16    2488.32 Mb/s
OC-192/STS-192   STM-64    9953.28 Mb/s
OC-768/STS-768   STM-256   39813.12 Mb/s
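The rate column follows directly from the N x 51.84 Mbit/s rule quoted above:

```python
# Every SONET rate is an exact multiple of the STS-1 rate.
STS1_RATE_MBPS = 51.84

def sts_rate(n: int) -> float:
    return round(n * STS1_RATE_MBPS, 2)

for n in (1, 3, 12, 48, 192, 768):
    print(f"STS-{n}: {sts_rate(n)} Mb/s")
# STS-12 -> 622.08 Mb/s, STS-192 -> 9953.28 Mb/s, and so on
```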
The frame formats are illustrated in Figure 7-4, which depicts the SONET
frame format with STS payloads, and Figure 7-5, which depicts the
SONET frame format with VT payloads.
Photonic layer
The optically transmitted SONET signal is referred to as an OC-N. This
layer is primarily responsible for the electrical-to-optical conversion. The
OC-N is essentially the optical equivalent of the STS-N; however, the
STS-N terminology is used when referring to the SONET format.
Section layer
The section layer transports the STS-N frames and the section overhead
across the photonic layer. This layer performs functions such as
performance monitoring, local orderwire, and Section Data
Communication Channels (SDCC). The SDCC provides a
communications path between a centralized Operations System (OS) and
the various network elements.
Line layer
The line layer is responsible for the transportation of the SPE (customer
Payload) and the line overhead. Some attributes of this layer consist of Line
Data Communication Channels (LDCC), express orderwire, performance
monitoring, protection switching signaling and line alarms.
Path layer
The path layer transports the customer traffic at the DS-1, DS-3, DS-1VT,
DS-3VT, and Video level for the Path Terminating equipment. The Path
layer transports the customer payload and the path overhead to the
terminating SONET/SDH equipment.
Customer payload can be mapped into the path layer as an STS-level
payload or an STS/VT-level payload, as illustrated in Figure 7-4 and
Figure 7-5.
In addition to the STS-1 Path level base format, SONET also defines
synchronous formats at sub-STS-1 levels that are defined as Virtual
tributaries. VT's are synchronous signals used to transport low-speed
signals.
Virtual Tributaries (VTs) are designed to carry (asynchronous)
payloads, such as the DS1, that require considerably less than 50 Mb/s of
bandwidth. The DS1 is such an important payload that the entire SONET
format can be traced back to the need for DS1 transport. There are seven
VT groups within an STS-1 SPE, and different groups may contain
different VT sizes within the same STS-1 SPE. The structured STS-1
signal has VT payloads and VT path overhead that together constitute the
VT SPE, similar to the STS SPE.
Summary
All SONET network elements have section and photonic layer
functionality. However, not all have the higher layers. A network element is
classified by the highest layer supported on the interface. Thus, a network
element with path layer functionality is referred to as Path Terminating
Equipment, either VT PTE or STS PTE.
Terminal
A SONET Line Terminating Equipment takes in a number of electrical
signals (tributaries) and transmits a single electrical or optical signal.
Network configurations
A network of SONET network elements can be arranged in several
configurations. A customer's need for particular types of payloads may
dictate the choice of network configuration.
The SONET NE can be part of the three traditional configurations, linear,
Bidirectional Line Switched Ring (BLSR), and/or Unidirectional Path
Switched Ring (UPSR) configurations.
As technology evolves, so does the network topology. The SONET next-
generation NEs can utilize not only the three types of traditional
topologies, but all topologies in a single box, allowing the capability of the
Optical Hub and Meshed configurations.
Linear configuration
A linear configuration can comprise either a 1+1 type or a 1:N type.
Protection switching for a single failure at the STS/OC-N level completes
within 50 ms.
1+1 Protected
The most basic protection system is a linear 1+1 system (see Figure 7-7).
The term linear differentiates it from ring systems, and the 1+1 indicates
that there is one working fiber and one standby fiber; the traffic in both
directions is permanently bridged onto both the working and the standby
fiber.
Figure 7-7: Linear 1+1 protection, with one working fiber and one
protection fiber between two terminals.
1: N Protected
In a 1:N configuration, there is one protection facility for several working
facilities (from one to fourteen). Figure 7-8 illustrates the 1:N protection
architecture. If one of the working lines detects a signal failure or line
degradation, the working traffic is switched to the protection line.
Figure 7-8: 1:N protection architecture, with working facilities 1 through
N (N <= 14) sharing a single protection facility between two terminals.
Ring configurations
The telecommunications market demands the utmost quality from a
SONET infrastructure, and network performance and network survivability
are important. Survivable rings and route diversity are two characteristics
that have almost become necessities for a SONET infrastructure, and
Unidirectional Path Switched Rings (UPSR) and Bidirectional Line
Switched Rings (BLSR) are solutions that meet these demands.
The two primary types of ring configurations are as follows:
UPSR
Dedicated protection bandwidth
Bellcore GR-1400-CORE
BLSR
Shared protection bandwidth
Bellcore GR-1230-CORE
A comparison of UPSR and BLSR is shown in Figure 7-9.
Figure 7-9: UPSR and BLSR comparison across four network elements
(NE 1 through NE 4).
BLSR:
Bi-directional flow enables timeslots to be reused around the ring.
Total BLSR network capacity is always greater than or equal to the
capacity of a UPSR.
Total 2F-BLSR network capacity = (OC-N rate / 2) x number of nodes;
4F-BLSR capacity = (OC-N rate) x number of nodes.
UPSR:
Uni-directional flow requires dedicated timeslots around the entire ring.
Total UPSR network capacity cannot exceed the optical line rate, so total
UPSR bandwidth = the OC-N rate of the main optical line.
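The capacity rules in the comparison above reduce to a few lines of arithmetic (the function names are ours):

```python
# Ring capacity from the line rate. UPSR dedicates timeslots around the
# whole ring, so its capacity equals the line rate; BLSR reuses timeslots
# span by span, so capacity scales with the node count.
def upsr_capacity(oc_n_rate, n_nodes):
    return oc_n_rate                      # cannot exceed the line rate

def blsr_capacity(oc_n_rate, n_nodes, four_fiber=False):
    per_node = oc_n_rate if four_fiber else oc_n_rate / 2
    return per_node * n_nodes

oc48 = 2488.32  # Mb/s
print(upsr_capacity(oc48, 4))            # 2488.32
print(blsr_capacity(oc48, 4))            # 4976.64  (2F-BLSR)
print(blsr_capacity(oc48, 4, True))      # 9953.28  (4F-BLSR)
```

Even on a small four-node OC-48 ring, a 2F-BLSR carries twice the total traffic of a UPSR, which is why BLSR is favored where traffic is distributed among the nodes rather than homed to one hub.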
Figure: Hub pattern and uniform mesh pattern.
A meshed pattern does not merely connect one network element to the
next; it offers an interconnection to each and every node within the meshed
topology, providing network survivability, redundancy, and exceptional
reliability.
The gain in flexibility of today's optical network infrastructures provides
tremendous opportunities in terms of services and applications. Consider
the network shown in Figure 7-11, which can provide essentially every
type of SONET topology within a customer's network infrastructure, thus
enabling an endless array of applications and services to meet every
customer's needs.
Figure 7-11: A single network providing multiple SONET topologies
(linear 1+1 and UPSR segments).
Synchronization
Understanding synchronization
SONET-based equipment derives many of its basic attributes from
synchronous operation. Synchronization is required in networks that
contain:
Add/Drop Multiplexers
Terminals
Synchronous tributaries
These configurations require synchronization among the network elements,
to avoid the effects of the SONET synchronous transport signal pointer
repositioning within the frame. When a network element is synchronized,
all synchronous tributaries and high-speed signals generated by that
network element are synchronized to its timing source. Normally, one
network element in a UPSR is externally timed. To protect the network
timing against complete nodal failure, two network elements in a UPSR
can be externally timed.
Internal timing
A SONET-compliant free-running clock produced within the network
element provides internal timing. Network elements with certain circuit
packs can provide timing signals of Stratum 3 (ST3) quality.
External timing
An external timing signal is obtained from a building-integrated timing
supply (BITS) clock of ST3 or better. ST1 reference quality would be the
preferred level of timing for SONET network elements.
Line timing
Line timing is derived from an incoming SONET frame (OC-3, OC-12,
OC-48, and OC-192), DS1 facility or EC-1 facility.
Stratum clocks
Stratum clocks are stable timing reference signals that are graded
according to their accuracy. American National Standards Institute (ANSI)
standards have been developed to define four levels of stratum clocks.
The accuracy requirements of these stratum levels are shown in
Figure 7-13.
Timing loops
A timing loop is created when a clock synchronizes to itself, either
directly or through intermediate equipment. A timing loop causes excessive
jitter and can result in traffic loss. Timing loops can be caused by a
hierarchy violation, or by having clocks of the same stratum level
synchronize each other. In a digital network, timing loops can be caused
during the failure of a primary reference source, if the secondary reference
source is configured to receive timing from a derived transport signal
within the network.
A timing loop can also be caused by incorrectly provisioned
Synchronization Status Messages (SSMs) for some of the facilities in a linear
or ring system. Under normal conditions, if there is a problem in the system
(for example, pulled fiber), the SSM functionality will heal the timing in
the system. However, if the SSM is incorrectly provisioned, the system
might not be able to heal itself and might segment part of itself in a timing
loop.
Synchronization-status messages
Synchronization-Status Messages (SSM) indicate the quality of the timing
signals currently available to a network element. The timing sources that
can be provisioned in a network element include external timing from a
BITS clock, timing derived from SONET interfaces, and the internal clock
of the network element. A network element can select the better of the two
timing signals provided by the primary and secondary timing sources
provisioned by the user. The selection is based on the quality values carried
in the SSMs.
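The better-of-two selection can be sketched as follows. The quality ranking here is an illustrative ordering of common ANSI stratum/SSM labels, not the actual S1-byte code points, and the fallback behavior is an assumption:

```python
# Sketch of SSM-based reference selection (lower rank = better quality).
QUALITY_RANK = {"PRS": 1, "ST2": 2, "TNC": 3, "ST3E": 4, "ST3": 5,
                "SMC": 6, "ST4": 7, "DUS": 99}  # DUS = don't use for sync

def select_reference(primary_ssm, secondary_ssm):
    candidates = [("primary", primary_ssm), ("secondary", secondary_ssm)]
    usable = [(name, ssm) for name, ssm in candidates
              if QUALITY_RANK.get(ssm, 99) < QUALITY_RANK["DUS"]]
    if not usable:
        return "internal"        # fall back to the free-running internal clock
    # Prefer the better quality; ties go to the primary (listed first)
    return min(usable, key=lambda c: QUALITY_RANK[c[1]])[0]

print(select_reference("ST3", "PRS"))    # secondary
print(select_reference("DUS", "DUS"))    # internal
```

The DUS message is what prevents the timing loops described earlier: a node tells its upstream neighbor "don't use this signal for synchronization" so the neighbor never times itself from a clock derived from its own output.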
Figure 7-14 provides an example of a network showing the synchronization
flow, head-end network element, synchronization boundary, and
synchronization status messaging.
Section III:
Protocols for Real-Time Applications
Protocols are basic building blocks of packet networks. Section 3
introduces protocols essential to real-time operation: protocols for media
transport and transport control, protocols for call and session setup, and
protocols and mechanisms that help real-time and non–real-time services
successfully share network resources. The section begins with a discussion
of the Real-Time Transport Protocol (RTP) and its associated protocols,
the Real-Time Control Protocol (RTCP) and the Real-Time Streaming
Protocol (RTSP). The characteristics and use of these protocols are
described, and similarities and differences with TCP and HTTP are
highlighted. Chapter 9 talks primarily about SIP and H.323, the signaling
protocols that establish sessions, or call setup messaging, required in
real-time communications. The different setup protocols provide the same
function but accomplish it in different ways. SIP works between end-points,
while H.323 places the call setup intelligence in a centrally located device.
The final chapter in this section covers the strategies and mechanisms
available in IP networks to provide Quality of Service (QoS). The chapter
explains what the different components, such as shapers and schedulers,
contribute to help differentiate flows and prioritize forwarding. The various
components are intended to ensure that real-time traffic is transported
across the network within performance limits, thereby maintaining the
expected QoE (Quality of Experience). Later sections will address the
practical implementation of these techniques.
Chapter 8
Real-Time Protocols: RTP, RTCP,
RTSP
Hung-Ming (Fred) Chen
Figure 8-1: Transport path diagram
Concepts covered
The purpose and operation of RTP & RTCP
RTP relays and how they operate: RTP mixer, RTP translator
Comparison of RTP and TCP
The purpose and operation of RTSP
Real-Time aspects of streaming applications
How RTP/RTCP and RTSP combine to form a complete package for
management of media streaming
Comparison of RTSP and HTTP
Introduction
Today, real-time and near–real-time applications are becoming very
common on IP networks and are being used for many different purposes
spanning both conversational (at least two party) personal communication
and streaming (typically one-way) applications. VoIP and IP Telephony are
becoming popular. Radio stations and TV channels now offer streaming of
live and archived programming over the Internet. Corporations use
streamed messages to promote new products and to provide education and
product documentation to customers. There is great promise for new
multimedia real-time services from converged networks. A set of protocols
has been developed to address the requirements of transporting the content
of these real-time/near-real-time services and to offer basic control for
streaming services. These are the Real-Time Transport Protocol (RTP), the
Real-Time Control Protocol (RTCP), and the Real-Time Streaming
Protocol (RTSP).
Successful Real-Time streaming applications must coordinate several
protocols: HTTP, RTP, RTCP, and RTSP. HTTP is used to retrieve the
presentation description. RTSP uses the description to set up and tear down
the sessions. RTP transports the contents to the end device, and RTCP is
used to report transmission statistics back to the server so that RTP can
adapt to network conditions.
In this chapter, we discuss features and operations of these protocols, and
the relationships between them. In addition, we compare RTP with TCP,
and RTSP with Hypertext Transfer Protocol (HTTP) to identify operational
similarities and differences.
Other applications can use RTP as well, for such functions as storage of
continuous data, interactive distributed simulation, active badge
tracking systems, and control and measurement.
RTP was defined by the IETF in RFC 1889 and revised in RFC 3550. The
International Telecommunication Union (ITU) has adopted RTP as one of
its standards for multimedia data streaming (H.225.0). Many standard
protocols use RTP to transport real-time content, including H.323 and SIP
for IP telephony applications and point-to-point and video conferencing,
RTSP for streaming applications, and Standard Announcement Protocol/
Session Description protocol (SAP/SDP) for pure multicast applications.
Its data format provides important information for operation and control of
real-time applications.
Features
RTP was designed with several goals in mind. First, it is intended to be a
lightweight and efficient protocol to define Application Level Framing
(ALF) and integrated layer processing. Second, one flexible mechanism
was provided rather than several dedicated algorithms. Third, RTP is
protocol-neutral, allowing it to integrate with various lower-layer
protocols, such as UDP/IP, IPX, and ATM-AALx. Fourth, the elasticity of
the Contributing Source Identifier (CSRC) field provides a scalability
mechanism for communicating with a large number of sources. Meanwhile,
partitioning of the control and transport functions into separate protocols
simplifies the operation. Finally, RTP also provides secure transport
through support of encryption and authentication protocols.
The RTP packet header provides information on packet sequence
numbering, type of payload, timestamping, and delivery monitoring. The
sequence number can be used to identify missing packets and to reorder out
of sequence packets. The payload type is indicated using a profile that
identifies all the types of data that may be used by the application during
the session. The timestamp corresponds to the time of packetization of the
first data sample in the packet. Timestamping permits inter- and intra-
media synchronization, such as time-alignment of audio and video signals
in a film (lip synch), since the audio and video components may be
transported as separate RTP streams. RTCP handles the delivery
monitoring function, sending reports to inform the RTP layer about the
status of network, such as reporting lost packets, interarrival jitter, and
other statistics. The feedback allows coding and transmission settings to be
adjusted to optimize the quality of the application or service.
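The header fields just described occupy a fixed 12-byte layout defined in RFC 3550. A sketch of building that header with Python's struct module (the field values passed in are illustrative):

```python
import struct

# RTP fixed header: V(2) P(1) X(1) CC(4) | M(1) PT(7) | sequence number
# (16) | timestamp (32) | SSRC (32) -- twelve bytes in network byte order.
def rtp_header(payload_type, seq, timestamp, ssrc, marker=0, cc=0):
    byte0 = (2 << 6) | cc                 # version 2, no padding/extension
    byte1 = (marker << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq, timestamp, ssrc)

hdr = rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=42)  # PT 0 = PCMU
print(len(hdr))        # 12 bytes of fixed header
```

A receiver reverses the pack to recover the sequence number (loss/reordering detection), timestamp (playout timing), and SSRC (stream identity) on every packet.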
Timestamps of packets
The timestamp captures the sampling time of the first octet (sample) of the
packet. The timestamp increases monotonically according to the clock of
the source and the size of payload. The receivers use the information to
check for gaps in the data and for out-of-order packets. Timestamps can be
used to synchronize flows from different sources, such as different end
systems or different sessions (audio and video) within a multimedia
application. However, RTP itself does not provide the mechanisms to do
this; the applications must contain the appropriate functionality to use this
information (and global time information within RTCP messages) to
synchronize streams.
For audio, the packet size (the number of bits in the packet, based on the
interval of packetization and the sampling rate) determines the increment
of the timestamps. For instance, an audio stream using a sampling rate of
32 kHz and a 20 ms packet will have a timestamp increment of 640 for two
consecutive packets. If silence suppression is used and no packet is sent,
the timestamp increments nevertheless, so the timestamp on the next
speech packet that is sent will include any silent interval.
For video, timestamps vary for different conditions. In general, timestamps
increase with the nominal frame rate. For instance, timestamps increase by
3,000 for each frame with a 30-frame/sec video, while timestamps step by
3,600 for a 25-frame/sec video. When a video frame is segmented across
several RTP packets, all the packets are marked with the same timestamp.
Where an atypical system is used, such as a special codec, or the
application cannot determine the frame number, the timestamps might need
to be computed from the system clock.
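The increments quoted above follow directly from the sampling rate (audio) or from the frame rate against the 90 kHz RTP video clock implied by the 3,000 and 3,600 figures:

```python
# Timestamp increment per packet (audio) or per frame (video).
def audio_increment(sample_rate_hz, packet_ms):
    return int(sample_rate_hz * packet_ms / 1000)   # samples per packet

def video_increment(clock_hz, frame_rate):
    return int(clock_hz / frame_rate)               # ticks per frame

print(audio_increment(32000, 20))   # 640, the example from the text
print(video_increment(90000, 30))   # 3000 per frame at 30 frame/s
print(video_increment(90000, 25))   # 3600 per frame at 25 frame/s
```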
Timestamps and sequence numbers allow the application to play out the
audio and video packets with the correct timing, even where silence
suppression is used (timestamp), and to detect and compensate for missing
packets (sequence number). When RTP packets have the same timestamp
(that is, a video frame segments into several RTP packets), the sequence
numbers are used to determine the appropriate order for decoding and play-
out.
RTP relay
RTP Relay agents are frequently used to translate payload formats for
flows where the two end systems cannot exchange packets directly. There
are two classes of RTP Relay agents: RTP translator and RTP mixer.
RTP mixer
Several RTP flows can be merged into a single flow with the help of RTP
mixers. For example, where the original sessions require more bandwidth
than the network can provide, two audio streams can be combined into a
single, more efficient flow. Synchronization is recalculated from the
contributing flows according to the content, and the RTP mixer assigns a
new source identifier (SSRC) to the combined stream. An RTP mixer can
therefore greatly reduce bandwidth consumption, which is especially
helpful on low-speed dial-up access networks. Figure 8-2 shows the mixer
combining two individual sessions (SSRC = 7 and SSRC = 36) into one
combined session with SSRC = 42.
RTP translator
An RTP translator performs a similar function, but maintains the individual
RTP flows instead of combining the sessions into a single one. The source
identifiers are preserved so the flows can be separated again downstream.
An RTP translator can be used to convert media encodings, duplicate
multicast streams into unicast streams, and filter RTP flows at the
application level, for example, to provide firewall services. Figure 8-3 shows two translators placed
on each end of a tunnel for secured media distribution. In this diagram, the
two end systems use separate sessions (SSRC=7 and SSRC=36) end-to-
end, including the encrypted portion of the channel.
RTP Limitations
RTP is often criticized for its heavy overhead. For each media session, the
combination of the IP, UDP, and RTP control information adds up to forty
bytes (twenty bytes IP, eight bytes UDP, and twelve bytes RTP).
RTCP reports carry statistics such as the number of packets sent, the
number of packets dropped in the network, and packet jitter. The recipient
of these quality reports uses the information to
adjust the system or to diagnose problems. For instance, a sender can use
the statistics to modify its transmission settings. Receivers can determine
whether problems are local or remote. Network and service providers can
use the RTCP information to monitor network performance, and in
particular, performance with multicast connections.
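The fixed forty-byte header cost mentioned above can be put in perspective with a quick calculation; the payload sizes below (160 bytes for G.711 and 20 bytes for G.729 at 20 ms packetization) follow from the codec bit rates and are not taken from the text:

```python
IP_HDR, UDP_HDR, RTP_HDR = 20, 8, 12  # bytes

def header_overhead_ratio(payload_bytes):
    """Fraction of each packet consumed by IP/UDP/RTP headers."""
    headers = IP_HDR + UDP_HDR + RTP_HDR  # 40 bytes in total
    return headers / (headers + payload_bytes)

# G.711 (64 kbit/s) at 20 ms -> 160 payload bytes: headers are 20%.
# G.729 (8 kbit/s) at 20 ms  ->  20 payload bytes: headers are ~67%.
print(round(header_overhead_ratio(160), 2))  # 0.2
print(round(header_overhead_ratio(20), 2))   # 0.67
```

The ratio is worst for low-rate codecs, which is why header compression schemes matter most on narrowband links.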
Ensuring scalability
For a conferencing or multicast session, there is an inevitable tradeoff
between admitting more and more participants and keeping control traffic
from overwhelming the available network bandwidth. This suggests that
RTCP control messages be limited to a small fraction, say five percent, of
the overall session traffic. During video conferencing, each endpoint sends
control packets to the other endpoints; thus, every endpoint can keep track
of the number of participants. From the number of participants and the
proportion of traffic allowed for RTCP control messages, each member can
calculate the frequency with which to send RTCP packets. In addition, it is
suggested that at least 25% of the RTCP bandwidth be reserved for sender
reports, to permit new receivers timely recognition of the senders'
canonical names.
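The interval calculation can be sketched as follows. This is a deliberate simplification of the RFC 3550 rules (it ignores the sender/receiver split and the five-second minimum interval), and the average RTCP packet size is an illustrative assumption:

```python
def rtcp_interval_seconds(session_bw_kbps, members,
                          rtcp_fraction=0.05,
                          avg_rtcp_size_bytes=120):
    """Roughly how often one member may send an RTCP packet.

    The RTCP bandwidth is a fixed fraction (typically 5%) of the
    session bandwidth, shared by all members.
    """
    rtcp_bytes_per_sec = session_bw_kbps * 1000 / 8 * rtcp_fraction
    per_member = rtcp_bytes_per_sec / members
    return avg_rtcp_size_bytes / per_member

# A 1 Mbit/s session with 50 members: under this simplified model,
# each member reports just under once per second.
interval = rtcp_interval_seconds(1000, 50)
```

As membership grows, the per-member share shrinks and reports space out automatically, which is what keeps RTCP scalable.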
Table 8-1 compares the services offered by TCP and RTP: session protocol,
reliable connection, flow/congestion control, error recovery, multicast,
and timing synchronization. TCP provides a reliable connection,
flow/congestion control, and error recovery, but supports neither multicast
nor timing synchronization; RTP supports multicast and timing
synchronization, but leaves reliability, flow control, and error recovery
to other mechanisms.
Table 8-1: Service comparison of TCP and RTP
Functions
RTSP is more a framework than a protocol. Its control mechanisms include
session establishment, session termination, and authentication. RTSP is
designed to carry “VCR-style” commands and coordinates with RTP to
control and deliver media data. RTSP can therefore take advantage of RTP
features, such as the selection of different delivery channels, including
UDP, TCP and IP multicast, and it can control multiple delivery sessions
simultaneously.
RTSP methods
An RTSP request takes the same form as an HTTP request. However, while
HTTP requests are essentially limited to retrieving or submitting content,
RTSP can request any of several actions, referred to as methods, which are
specified in the header. The four main commands for real-time services are:
SETUP: Requests that the server establish a session with the
requesting client. The SETUP request for a URI specifies the
transport mechanism to be used for the streamed media. The
available transport parameters of the client are given in the transport
header. Upon receiving the options, the server responds with the
selected transport parameters.
PLAY: Requests that the server begin transmitting streaming data
over the specified transport channel.
PAUSE: Requests that the server suspend transmission, but keep
the session open and wait for another command.
TEARDOWN: Requests that the server stop sending data and
terminate the session. All resources used by the media stream are
released.
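As a sketch of how these methods appear on the wire, the helper below serializes minimal RTSP/1.0 requests (the path, ports, and session identifier are made-up examples):

```python
def rtsp_request(method, uri, cseq, headers=None):
    """Serialize a minimal RTSP/1.0 request line plus headers."""
    lines = [f"{method} {uri} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"

# SETUP carries the client's transport parameters in the Transport header.
setup = rtsp_request(
    "SETUP", "rtsp://stream.server.net/foo/audio", 1,
    {"Transport": "RTP/AVP;unicast;client_port=5200-5201"})

# PLAY refers to the session identifier returned by the server.
play = rtsp_request(
    "PLAY", "rtsp://stream.server.net/foo", 2, {"Session": "12345678"})
```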
Operation
RTSP, the so-called “Internet VCR remote control protocol,” provides a
means of emulating VCR commands. It does not handle the transport of
streaming data between server and client, but relies on a transport protocol
to deliver it. RTSP works with other transport protocols, but is generally
combined with RTP and RTCP. Coordination of RTSP with RTP/RTCP
provides complete functionality for transport and control of streaming
media. Figure 8-4 shows a typical example in which RTSP and RTP are
used to stream stored content from a media server. The relationships
among HTTP, RTSP and RTP are shown as well.
[Figure 8-4: Streaming media retrieval with RTSP. The web browser fetches
a meta file (rtsp://stream.server.net/foo/presentation.abc) from the web
server over HTTP and hands it to the media player; the media player
exchanges RTSP streaming commands with the content server, which delivers
the audio/video content over RTP.]
To play a media presentation, the client (web browser) sends a request
URL with the required configuration parameters to the web server. The web
server responds with a presentation description containing information
about the media server and other required parameters. Meanwhile, the web
browser starts the media player on the client machine. Upon receipt of the
presentation description, the player sends a SETUP request to the media
server with its available transport parameters. The media server chooses
the transport settings and responds to the request. The player then sends
the PLAY command to request the start of media stream delivery. During
playout, reports about the reception of the streaming data may be sent
back periodically to the media server. The PAUSE or TEARDOWN
commands can be sent to the media server during playout or at the end of
the presentation, to temporarily interrupt or to terminate the
presentation, respectively. A walkthrough of the operation for a multimedia
presentation is illustrated in Figure 8-5. Note that the Web Browser and the
Media Player are both on the client endpoint.
[Figure 8-5: RTSP operation walkthrough. The web browser issues an HTTP
GET and receives the presentation description, which it passes to the
media player through internal communication. The media player then sends
SETUP and PLAY to the media server; RTP audio/video and RTCP flow during
playout; PAUSE and TEARDOWN suspend and terminate the session.]
Performance
As discussed in Chapter 2, streaming media to a client is not considered a
real-time process. Because the media path is one-way, the QoE of the
streaming session is not limited by response time, and the delay can be as
high as several seconds without affecting user performance. The addition
of the remote commands of RTSP changes all this. The use of interactive
commands demands a certain level of responsiveness from the system, and
so the session control becomes a real-time application.
Since RTSP control requests can be transported over UDP, reliable delivery
of the requests is not guaranteed. Where a session has been paused and a
subsequent PLAY command is lost, the media server may appear to be
frozen. For UDP operation, it is often left to the user to notice the loss
of a control request and repeat the command. This can affect the QoE of
the session.
RTSP is not robust to hardware or software failures. For example, consider
an unplanned reboot of a home PC on which a user is running a streaming
session. The session state is lost on the PC side, and the media player
cannot request that the server terminate the stream. The media server
continues to send streaming data to the client, occupying bandwidth in
the downstream direction. Even after the PC reboots, no mechanism is
specified that allows the media player to recapture or terminate the
session; RTSP does not specify how to recover a lost session state from
the session identifier.
[Table: properties compared — text base, MIME headers, status code,
security, URL format, content negotiation, state maintenance.]
References
RFC 1889, “RTP: A transport protocol for Real-Time applications,” IETF,
1996 (replaced by RFC 3550).
RFC 3550, “RTP: A transport protocol for Real-Time applications,” IETF,
2003.
RFC 1890, “RTP profile for audio and video conferences with minimal
control,” IETF, 1996 (replaced by RFC 3551).
RFC 3551, “RTP profile for audio and video conferences with minimal
control,” IETF, 2003.
RFC 2326, “Real-Time streaming protocol (RTSP),” IETF, 1998.
RFC 3711, “The Secure Real-time Transport Protocol (SRTP),” IETF, 2004.
RFC 2250, “RTP payload format for MPEG1/MPEG2 video,” IETF, 1998.
RFC 2586, “The Audio/L16 MIME content type,” IETF, 1999.
Chapter 9
Call Setup Protocols: SIP, H.323,
H.248
François Audet
[Figure: The view from a real-time application perspective — media-related
protocols (audio, video, and voice codecs, RTP, RTCP, RTSP), session
control (H.323, SIP), and gateway control (H.248/MGCP, NCS) shown above
the underlying transport and network technologies (QoS, MPLS, packet
resiliency, AAL1/2, AAL5, ATM, FR, Ethernet, Cable/DOCSIS, SONET/TDM).]
Concepts covered
H.323
SIP
H.248/MEGACO, MGCP, NCS/J.162
Introduction
Communications over the global IP network use three different types of
real-time and control protocols: peer-to-peer session control protocols,
master/slave gateway control protocols, and real-time media transport
protocols.
Session control protocols establish, modify, and tear down multimedia
calls between endpoints. Both H.323 (an ITU-T protocol) and SIP (an IETF
protocol) were created to address those requirements. There are
consequently a lot of similarities between the protocols. Furthermore,
H.323 and SIP rely on other protocols
defined by the IETF and the ITU-T. For example, both protocols use
transport protocols such as RTP/RTCP, UDP, TCP and IP (IETF protocols),
and both protocols use voice codecs such as G.711 and G.729 (ITU-T
protocols). There are of course differences between the two, as they
evolved from two very different backgrounds.
A third type of protocol consists of gateway control protocols. These are
master/slave protocols that a gateway controller uses to control slave
gateway devices. The main gateway control protocols used today are
MEGACO/H.248, MGCP and NCS/J.162.
The main difference between peer-to-peer and master/slave protocols is in
the way intelligence is distributed between the network edge devices and
network-based servers.
The master/slave approach, exemplified by Megaco/H.248, MGCP, NCS/
J.162, allows network gateway functions to be distributed or decomposed
into intelligent (master) and nonintelligent (slave) parts. Application
intelligence, such as call control, is contained in the functional control
servers (master), which also implement the peer-level protocols to interact
with other functional elements in the system and manage all feature
interactions. These control servers then drive a large number of dumb slave
devices that are optimized for their specific interface function and devoid
of application complexity; hence, they are lower in cost and not subject to
change as new services and features are introduced at the control servers.
A communication network can be comprised of both peer-to-peer and
master/slave elements, along with the real-time transport protocol.
Figure 9-2 illustrates the three types of protocols.
[Figure 9-2: The three protocol types — peer-to-peer protocols (SIP,
H.323) run between peer entities; master/slave protocols (H.248/Megaco,
MGCP, etc.) run between a controlling entity and a slave entity; media
flows over RTP/RTCP.]
H.323
Architecture
H.323 has a well-defined architecture, with well-defined components,
functions and protocols.
H.323 defines the following physical components:
Gatekeeper The gatekeeper provides routing and call control
services to H.323 endpoints. It cannot generate or receive calls.
Endpoint An endpoint can make or receive calls. Endpoints are of
one of the following three subtypes:
Terminal A terminal can be an IP phone, a PC, a PDA, a set-top
box providing video-conferencing, a voice-mail system or any other
device offering H.323 services to the end user.
Gateway Gateways provide an interface to non-H.323 networks,
such as the GSTN, or a SIP network.
Multipoint Control Unit (MCU) MCUs support multipoint
conferences and must contain an MC. They can also contain an MP.
Physical components can be collocated. For example, it is very common to
have devices that are a Gatekeeper, a Multipoint Control Unit, and a
Gateway simultaneously. All Endpoints behave the same way at the
protocol level. It is not terribly important whether a particular device is
classified as a terminal, a gateway or an MCU (or a combination of these);
however, it is important that they are “Endpoints” and not Gatekeepers.
Sometimes the line can be a little blurry. For example, an IP PBX or TDM switch could
[Figure 9-3: H.323 components. The Gatekeeper performs address translation
(IP, telephone) and admission control, and cannot generate or terminate
calls. Endpoints can make or receive calls; they include Terminals (IP
phones, PCs, PDAs, set-top boxes), Gateways (interworking with other
multimedia terminals and the GSTN), and Multipoint Control Units (support
for multipoint conferences).]
In addition, H.323 defines the following logical components:
Multipoint Controller (MC): MCs control multipoint conference
connections
Multipoint Processor (MP): MPs process and mix multiple audio/
video channels
MCs and MPs are logical components and not stand-alone entities. They
must reside within a physical component, such as a Terminal, a Gateway, a
Multipoint Control Unit or a Gatekeeper.
H.323 introduces the concept of a “Zone.” A Zone consists of one (and
only one) Gatekeeper, along with all the endpoints that are registered to
that Gatekeeper (see Figure 9-4). A Zone is independent of geography; it
can span multiple LAN segments and can include Endpoints anywhere on
the Internet.
[Figure 9-4: A Zone consists of exactly one Gatekeeper and all the
endpoints registered to it.]
Scope of H.323
[Figure: The scope of H.323 — video codecs (H.261, H.263) with video I/O
equipment, audio codecs (G.711, G.723.1, G.729, etc.) with audio I/O
equipment, and user data applications (T.120, etc.), together with H.225.0
call signaling and H.225.0 RAS control, all above the transport layer
(RTP/RTCP over UDP, TCP, or H.323 Annex E over UDP, on IP or other
networks such as IPX and ATM), the network interface, and the physical
layer.]
H.225.0 (RAS)
H.225.0 (RAS) is the Registration, Admission and Status protocol.
RAS is used only in environments containing a Gatekeeper, but since most
environments these days have a Gatekeeper, it is almost always present.
RAS is the protocol used between an Endpoint and its Gatekeeper, and
between Gatekeepers, to perform the tasks that are not, strictly speaking,
part of call establishment.
RAS includes messages that are independent of individual calls.
Gatekeeper Discovery messages are normally sent to a well-known
multicast address, allowing endpoints to discover their Gatekeeper
automatically. This is useful in LAN environments for plug-and-play
operation. Registration allows an Endpoint to make its presence known to
its Gatekeeper, allowing others to route calls to that particular endpoint.
Other messages are call related. An Admission Request (ARQ) is used prior
to a call, to get permission from the Gatekeeper to make the call; the
Gatekeeper confirms the ARQ by telling the endpoint where to place the
call.
There are other, less-used messages, such as bandwidth request messages
for environments where the gatekeeper manages bandwidth utilization in
the network, as well as messages for status requests.
H.225.0 (Q.931)
H.225.0 (Q.931) is used for Call Control. It allows for the establishment of
a call between endpoints.
Once an endpoint has obtained permission to make a call from its
gatekeeper, as per the RAS Admission procedures, it will attempt to
establish a call using H.225.0 (Q.931) procedures.
H.225.0 (Q.931) is transported over TCP. The TCP port to be used is
exchanged as part of the RAS procedures. Typically, the well-known TCP
Port 1720 is used, although it is possible to use a different port.
[Figure: H.225.0 (Q.931) call establishment — the calling endpoint sends
SETUP; the called endpoint responds with CALL PROCEEDING, ALERTING and
CONNECT; an H.245 session follows; RELEASE COMPLETE ends the call.]
conflicting with the UUIE information, in which case arcane rules are
defined to make sense out of it.
[Figure: Gatekeeper signaling models — in the direct-routed model, the
endpoints exchange H.225.0 (Q.931) directly with each other and use
H.225.0 (RAS) with their Gatekeepers; in the gatekeeper-routed model, the
H.225.0 (Q.931) signaling also passes through the Gatekeepers.]
H.245
After the H.225.0 (Q.931) call has been set up, it is possible to establish the
H.245 control channel in order to establish media sessions and control
sessions.
An H.245 IP address and port number are exchanged as part of the H.225.0
(Q.931) protocol for carrying H.245 over TCP. Unlike H.225.0 (RAS) and
H.225.0 (Q.931), a dynamic port is used; there is no default H.245 port.
[Figure: H.245 procedures — the endpoints exchange TerminalCapabilitySet
messages (e.g. G.711, G.729) with acknowledgements and perform
MasterSlaveDetermination, after which one endpoint is Master and the
other Slave.]
Using OpenLogicalChannel, each endpoint tells the other endpoint what IP
address and port will be used for sending media on a dynamic RTP/UDP/IP
port.
[Figure: OpenLogicalChannel exchange — one endpoint opens a G.711 channel
(RTP/RTCP=192.168.0.1:5200/5201), the other acknowledges and opens its
own G.711 channel (RTP/RTCP=10.10.0.2:4312/4313); RTP/RTCP media (G.711)
then flows in both directions.]
[Figure: Third-party rerouting with H.245 — B sends an empty
TerminalCapabilitySet and closes the logical channels to end the audio
connection with A, then makes a new call to C: the endpoints exchange
TerminalCapabilitySet messages, perform master/slave negotiation, and
open logical channels. B later transfers the call back to A by closing
the audio connection with C and repeating the capability exchange,
master/slave determination, and channel-opening procedures with A.]
[Figure: Fast Connect — the called endpoint responds with CALL
PROCEEDING, ALERTING and CONNECT, the CONNECT carrying fastStart(G.711,
RTP/RTCP=10.10.0.2:4312/4313); RTP/RTCP media (G.711) then flows in both
directions.]
Call walk-through
Figure 9-16 illustrates a complete call walk-through including H.225.0
(RAS), H.225.0 (Q.931) and H.245, with the normal (“slow start”)
procedures with the gatekeepers operating in direct routed mode.
[Figure 9-16: Slow-start call walk-through, direct routed mode. Endpoint A
(192.168.0.1) sends ARQ (+1 408 555 1212) to Gatekeeper A, which locates
the called party through an LRQ/LCF exchange with Gatekeeper B and
answers with ACF (Q.931=10.10.0.2:1720). A sends SETUP (+1 408 555 1212)
to Endpoint B (10.10.0.2); B performs its own ARQ/ACF exchange with
Gatekeeper B and responds with CALL PROCEEDING, ALERTING and CONNECT
(H.245=10.10.0.2:8878). The endpoints then exchange TerminalCapabilitySet
messages (G.711, G.729), perform MasterSlaveDetermination, and open
logical channels (RTP/RTCP=192.168.0.1:5200/5201 and
RTP/RTCP=10.10.0.2:4312/4313), after which RTP/RTCP media (G.711) flows
in both directions.]
[Figure: Fast Connect call walk-through with H.245 tunneling. After the
same ARQ/LRQ/LCF/ACF admission exchange, Endpoint A's SETUP (+1 408 555
1212) carries h245Tunnelling and fastStart(G.711, G.729,
RTP/RTCP=192.168.0.1:5200/5201). Endpoint B responds with CALL
PROCEEDING, ALERTING and CONNECT carrying fastStart(G.711,
RTP/RTCP=10.10.0.2:4312/4313), and RTP/RTCP media (G.711) flows
immediately. H.245 messages (TerminalCapabilitySet,
MasterSlaveDetermination and their acknowledgements) are then tunneled
inside FACILITY messages.]
SIP
Architecture
The purpose of SIP is to initiate multimedia sessions. SIP includes user
location, user availability and capability negotiation, session establishment,
and session modification.
SIP allows a user to invite others to a session. For example, Alice would
invite Bob to an IP voice call by sending an INVITE message describing
the voice codec to be used and the IP address and port where the media
stream should be sent. The INVITE message is routed through the SIP
network (through proxies, redirect servers and other network elements)
using a location service, and is presented to Bob. Bob accepts the
invitation and provides his own IP address and port for the media stream.
Figure 9-18 illustrates a SIP session through two SIP Proxy servers.
[Figure 9-18: SIP session through two Proxy servers. Alice's INVITE
sip:bob@biloxi.com is forwarded through Proxy Server A and Proxy Server B
to Bob; 100 Trying, 180 Ringing and 200 OK travel back along the same
path; Alice confirms with ACK, and RTP/RTCP media (G.711) flows directly
between Alice and Bob. Either party ends the session with a BYE, answered
by 200 OK.]
[Figure: Canceling a pending INVITE — after receiving 180 Ringing, the
caller sends CANCEL; the CANCEL is answered with 200 OK, the INVITE
transaction ends with 487 Request Terminated, and the caller acknowledges
with ACK. An intermediary that times out can generate the CANCEL itself;
and if the CANCEL loses the race with a 200 OK answer, the session is
established anyway.]
to define their own. This allows for much greater flexibility in the number
of features that can be deployed.
The core SIP specification is RFC 3261. Many other RFCs are necessary
for a functional commercial SIP implementation. For example, the Session
Description Protocol (SDP, RFC 2327) is used for media session
description, and the Real-time Transport Protocol (RTP, RFC 3550) with
its Audio and Video profile (RFC 3551) is used for transporting media.
SIP is based on an HTTP-like request/response transaction model. The
originator of a request is a Client and the response is provided by a
Server. Because of the peer-to-peer nature of communication, each party
in a communication may issue requests. This means that the SIP protocol
entity acting on behalf of a user (called a User Agent) operates both as
a Client and as a Server, depending on which side initiates the request.
The protocol itself frequently uses the terms User Agent Client (UAC) and
User Agent Server (UAS). These terms are only useful from a protocol
point of view, as physical devices always include both and are simply
called User Agents (UA).
Requests from a SIP UAC invoke a particular Method. A Method will
generate at least one response. Methods and Responses are called SIP
Messages.
SIP Messages are encoded in text format, with a grammar specified in
augmented Backus-Naur Form (BNF). SIP is defined as a layered protocol,
although most people don't typically view it that way. The BNF-defined
syntax and encoding form the first layer. The second layer is the
transport layer. SIP typically uses TCP or UDP as the transport for SIP
messages, but other transports such as TLS and SCTP are also allowed.
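A minimal sketch of the text encoding, reusing the biloxi.com example from the figures (the Via address, branch, tag, and Call-ID values here are made up, and many mandatory details of RFC 3261 are omitted):

```python
def sip_invite(from_uri, to_uri, call_id, branch, body=""):
    """Assemble a minimal, illustrative SIP INVITE message."""
    lines = [
        f"INVITE {to_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP 192.168.0.1;branch={branch}",
        f"From: <{from_uri}>;tag=1928301774",
        f"To: <{to_uri}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Content-Length: {len(body)}",
    ]
    return "\r\n".join(lines) + "\r\n\r\n" + body

msg = sip_invite("sip:alice@atlanta.com", "sip:bob@biloxi.com",
                 "a84b4c76e66710", "z9hG4bK776asdhds")
```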
Registrar
A Registrar allows a UA to make its presence known to the network. It
associates an address-of-record URI with one or more contact addresses
(normally IP addresses). This binding can be established manually or
through a dynamic mechanism called “Registration.” SIP itself does not
specify how network elements, such as proxies, use a location service to
locate users or services based on their URIs. There is an implication that
the registrar somehow stores the binding of address-of-record URI and
contact addresses in a “location server” and that the proxy somehow uses
that location service. In practice, the registrar and proxy are very often
collocated, or have access to the same database.
Redirect server
A Redirect server is a very simple server that responds to a session
invitation with an indication of the current location of the requested
address-of-record. It is up to the UA to then reestablish the session
directly with the new URI. The Redirect server is not involved in that
second session, as it does not stay in the signaling path. A redirect
server is transaction stateful, but only for that particular simple
transaction. Figure 9-21 illustrates a SIP session establishment through
a SIP Redirect server.
[Figure 9-21: Session establishment through a Redirect server. The INVITE
sip:bob@biloxi.com is answered with 302 Moved Temporarily (Contact:
sip:bob@10.10.0.2) and acknowledged with ACK; the UA then sends a new
INVITE directly to sip:bob@10.10.0.2, which answers with 100 Trying, 180
Ringing and 200 OK (Contact: sip:bob@10.10.0.2), completed with ACK.]
Proxy server
SIP proxies are elements that route SIP requests to user agent servers and
SIP responses to user agent clients. A proxy routes SIP messages and
therefore stays in the signaling path. Proxy servers have a lot of
flexibility in the amount of “state” information they keep, depending on
their function. One key point about proxy servers is that they are only
allowed to modify very specific parts of SIP messages, mainly related to
routing. SIP forbids proxies from modifying most of the message content
(for example, the SDP is not allowed to be modified); SIP was very much
written with end-to-end transparency in mind. Figure 9-22 illustrates a
SIP session establishment through a SIP Proxy server.
[Figure 9-22: Session establishment through a Proxy server. The proxy
forwards the INVITE sip:bob@biloxi.com to the callee and relays 100
Trying, 180 Ringing, 200 OK and the ACK, remaining in the signaling
path.]
[Figure: Session establishment through a Back-to-Back User Agent (B2BUA).
The B2BUA terminates the INVITE from UA A as a user agent server and
originates its own INVITE to UA B as a user agent client, relaying 100
Trying, 180 Ringing, 200 OK and ACK between the two call legs.]
SIP messages
SIP Messages are defined using a syntax inspired by HTTP/1.1; however,
SIP is not an extension of HTTP.
Requests contain a method name and a Request-URI (Uniform Resource
Identifier). The Request-URI indicates the user or service to which the
request is being addressed (see Table 9-1).
SIP Responses
1xx: Provisional. Most common are:
100 “Trying”
180 “Ringing”
183 “Call Progress”
2xx: Success. Most common is:
200 “OK”
3xx: Redirection. Most common is:
302 “Moved Temporarily”
4xx: Client Error. Most common are:
404 “User not found”
486 “User Busy”
5xx: Server Error
6xx: Global Failure
Since SIP does not mandate a reliable transport, reliability is handled at
the SIP level itself. INVITE requests are retransmitted using an
exponential back-off until there is a response, or until there is a
timeout. Final responses to INVITE are also repeated until acknowledged
by an ACK, or until there is a timeout.
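The exponential back-off can be sketched as follows; the 500 ms T1 default and the 64*T1 cutoff come from RFC 3261, not from the text above:

```python
def invite_retransmit_offsets(t1=0.5, cutoff_factor=64):
    """Seconds after the first transmission at which an unanswered
    INVITE is retransmitted over UDP: the interval starts at T1 and
    doubles until the transaction times out at cutoff_factor * T1."""
    deadline = cutoff_factor * t1
    offsets, wait, elapsed = [], t1, 0.0
    while elapsed + wait < deadline:
        elapsed += wait
        offsets.append(elapsed)
        wait *= 2
    return offsets

print(invite_retransmit_offsets())  # [0.5, 1.5, 3.5, 7.5, 15.5, 31.5]
```

The doubling interval keeps retransmissions responsive at first while avoiding a flood of repeats on a badly congested path.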
[Figure: Offer in the INVITE. Alice (192.168.0.1) sends INVITE
sip:bob@biloxi.com carrying Offer=SDP(A) (G.711, G.729;
RTP/RTCP=192.168.0.1:5200/5201); Bob (10.10.0.2) responds with 100
Trying, 180 Ringing and 200 OK carrying Answer=SDP(B) (G.711;
RTP/RTCP=10.10.0.2:4312/4313); after the ACK, RTP/RTCP media (G.711)
flows in both directions.]
[Figure: Offer in the 200 OK. When the INVITE carries no SDP, the offer
travels in the 200 OK and the answer in the ACK; here the offer lists
G.711 and G.729 and the answer selects G.711, after which RTP/RTCP media
(G.711) flows.]
Answer. However, this can only be done after the initial offer has been
answered.
The MMUSIC Working Group is currently defining a next-generation protocol
for session description and capability negotiation called SDPng. Unlike
the current SDP, it makes a clear distinction between capabilities and
session parameters, and instead of defining its own syntax, SDPng uses XML.
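The classic SDP bodies in the offer/answer exchanges above can be sketched as follows (payload type 0 is G.711 mu-law and 18 is G.729 in the RTP/AVP profile; the origin-line values are illustrative):

```python
def sdp_audio_body(ip, rtp_port, payload_types):
    """Build a minimal SDP audio description (illustrative only)."""
    fmt = " ".join(str(pt) for pt in payload_types)
    return "\r\n".join([
        "v=0",
        f"o=user 2890844526 2890844526 IN IP4 {ip}",
        "s=-",
        f"c=IN IP4 {ip}",         # where media should be sent
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP {fmt}",
    ]) + "\r\n"

offer = sdp_audio_body("192.168.0.1", 5200, [0, 18])  # G.711 and G.729
answer = sdp_audio_body("10.10.0.2", 4312, [0])       # G.711 only
```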
[Figure: Reliable provisional responses and early media. Alice's INVITE
carries Offer=SDP(A) (G.711, G.729; RTP/RTCP=192.168.0.1:5200/5201) and
Supported: 100rel; Bob responds with 183 Session Progress (Require:
100rel) carrying Answer=SDP(B) (G.711; RTP/RTCP=10.10.0.2:4312/4313).
Alice acknowledges with PRACK (answered by 200 OK), early media (G.711)
flows, and the call completes with 200 OK and ACK.]
Unless in-band early media is used, the callee may very well ensure that
the 180 Ringing with no SDP is delivered reliably, in order to ensure
that proper alerting treatment is provided to the caller (that is, to let
the caller know the “phone is ringing”). Another usage for PRACK is to
allow other preconditions to be met before setting up the media session;
for example, resources may need to be reserved (through RSVP, or a
circuit-based transport like ATM).
Another twist on early media is that it is sometimes necessary to modify
an Offer before the call is answered (for example, when a preanswer
announcement is provided). Since modifying an INVITE (through a
re-INVITE) is not allowed by SIP before the first INVITE is accepted with
a 200 OK, the UPDATE method was introduced. It allows a client to update
the parameters of a session (such as the Offer or Answer) without
affecting the state of the dialogue. In that sense it is like a
re-INVITE, but unlike a re-INVITE it can be sent before the initial
INVITE has been completed. For backward-compatibility reasons, the IETF
preferred introducing this new method over modifying the INVITE method
to allow this behavior.
REFER
The REFER method allows a UA to “refer” another UA to the resource
provided in the REFER request, and to be informed of the result of the
referral. It is a very generic and powerful primitive with a very large
number of usages.
For example, REFER can be used to implement a call transfer feature. If
Alice is in a call with Bob and decides Bob needs to talk to Carol, Alice
can tell her SIP UA to send a REFER to Bob's UA with Carol's contact
information. Bob's UA will attempt to call Carol, and will then report to
Alice's UA with a notification of whether it succeeded in reaching the
contact. Figure 9-27 details this scenario.
[Figure 9-27: Call transfer with REFER. Alice calls Bob (INVITE
sip:bob@example.com, 100 Trying, 180 Ringing, 200 OK, ACK), then places
the call on hold with a re-INVITE. Alice sends REFER with Refer-To:
sip:carol@example.com; Bob's UA responds 202 Accepted and calls Carol
(INVITE, 100 Trying, 180 Ringing, 200 OK, ACK). Bob's UA reports the
outcome with a NOTIFY, and Alice ends her call with Bob with a BYE.]
[Figure: Presence with SUBSCRIBE/NOTIFY. Alice subscribes to Bob's
presence status (SUBSCRIBE, answered by 200 OK). When Bob sets his status
to "unavailable", a NOTIFY with status=unavailable is sent to Alice, who
answers with 200 OK.]
SIP-T
SIP for Telephones (SIP-T) is a set of practices describing the use of SIP
for interoperability with SS7 PSTN gateways. The main idea behind SIP-T
is that SS7 PSTN gateways can use SIP to set up “calls” while maintaining
total PSTN transparency, by allowing ISUP to be encapsulated in SIP
messages as a MIME body. SIP-T is thus SIP and ISUP at the same time; it
is an “Inter Call Server” protocol. Tunneling ISUP inside SIP messages
has an advantage over backhauling ISUP independently, as it intrinsically
maintains the association between the SIP session and the PSTN call.
ISUP messages are tunneled in corresponding SIP messages when possible
(for example, an IAM in an INVITE), and in the INFO method, defined for
this purpose, when no other SIP message is appropriate. SIP-T is well
suited to carrier equipment migrating from a TDM SS7 architecture to a
SIP IP architecture. As such, SIP-T is largely limited to carriers, as
ISUP is not widely deployed in enterprises.
Figure 9-29 illustrates the concept of tunneling ISUP in SIP.
[Figure 9-29: Tunneling ISUP in SIP. Two PSTN gateways, each containing
an ISUP termination and a SIP UA, exchange SIP across the IP network,
while ISUP runs between each gateway and the PSTN on either side.]
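The encapsulation can be sketched as a multipart MIME body carrying both SDP and the ISUP octets (the boundary and contents below are placeholders; a real application/ISUP part also requires a version parameter, per RFC 3204):

```python
def sip_t_body(sdp, isup_octets_hex, boundary="unique-boundary-1"):
    """Sketch of a multipart SIP-T body: SDP plus encapsulated ISUP."""
    parts = [("application/sdp", sdp),
             ("application/ISUP", isup_octets_hex)]
    body = ""
    for ctype, content in parts:
        # Each part gets its own boundary line and Content-Type header.
        body += f"--{boundary}\r\nContent-Type: {ctype}\r\n\r\n{content}\r\n"
    return body + f"--{boundary}--\r\n"

body = sip_t_body("v=0\r\n", "01 00 49 ...")  # ISUP IAM octets elided
```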
H.323                                  SIP
H.225.0                                SIP
H.245                                  SDP
Call                                   Session
SETUP                                  INVITE
CONNECT                                200 OK
H.323 Annex M (QSIG & ISUP tunneling)  SIP-T (ISUP & QSIG tunneling)
Table 9-3: Parallels between the protocols
H.323 is a well-defined protocol suite with roots in both video conferencing
(H.320) and telecommunications (Q.931). These roots make it palatable to
software writers who are familiar with one or the other sector.
SIP is a simpler and more flexible protocol that has its roots in Internet
protocols and, as such, is much more palatable to software writers who
write for the Internet. SIP also makes integration with other software
applications easier than H.323 does. SIP is a better protocol for simple
applications, which can conceivably use a smaller subset of SIP than they
could of H.323.
H.323 includes well-defined procedures for video conferencing, while SIP
is only starting to include the protocol machinery necessary for
commercial video conferencing. H.245 is a much more feature-rich protocol
than SDP.
H.323 has a larger installed base today. SIP does not have a large
installed base yet, but most of the development dollars are being spent
on SIP, not H.323.
Today, the overwhelming majority of protocol development taking place at
the Standards level is on SIP and SIP-related protocols. H.323 is essentially
in a maintenance mode, and only minor additions are actively worked on.
The amount of work devoted to SIP is blossoming. The original IETF
MMUSIC Working Group was so swamped with SIP work that it spun off the
SIP Working Group, which in turn spun off the SIPPING Working Group
(SIP focuses on the core SIP protocol, while SIPPING focuses on noncore
aspects). There is now a SIMPLE Working Group for SIP Presence and
Instant Messaging, and an XCON Working Group for Centralized
Conferencing. Certain groups are also defining extensions or writing
protocols with SIP in mind, and not H.323. Some examples are firewall
and NAT traversal solutions in the MIDCOM, MMUSIC and NSIS Working
Groups, or telephone-number-related work in the ENUM and IPTEL Working
Groups. Other features, such as Presence and Instant Messaging as
defined by the SIMPLE Working Group, can have "equivalents" in H.323,
but are much more appropriate in a SIP environment. IPv6 transition
mechanisms are also more usable by SIP than by H.323.
At the development level, the situation is similar. R&D investments are
pouring into SIP, while H.323 development is more in the established
phase. SIP will certainly become the most pervasive protocol for most
people and most applications, but there will still be vast numbers of
H.323 systems out there, and H.323 may well remain dominant in certain
markets for the foreseeable future.
Megaco/H.248 overview
As shown in Figure 9-31, the architecture of Megaco/H.248 is based on the
media gateway control (MGC) layer, the media gateway (MG) layer, and
the Megaco/H.248 Protocol itself.
Service Data Point information needed to establish the call and will signal
the Media gateway that audible ringing has been established. When the far
end answers, the MGC will be notified and will pass this on to the Media
Gateway to establish the two way talk path.
Figure 9-32: Megaco/H.248 interaction between the Media Gateway and the Media Gateway Controller (lift handset: Notify "off hook"; dial digits: Notify "digits", then Create Connection; hang up: Notify "on hook", then Delete Connection)
Terminations
Terminations identify media flows or resources, implement signals,
generate events, have properties and maintain statistics. They can be
permanent (provisioned) or transient (ephemeral). All signals, events,
properties, and statistics are defined in packages that are associated with
the individual terminations.
Contexts
As shown in Figure 9-33, context (C) refers to associations between
collections of terminations (T), defines the communication between the
terminations, and acts as a mixing bridge. A context can contain more than
one termination and can be layered to support multimedia.
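As an illustration only, contexts and terminations can be modeled as simple data structures. This Python sketch uses hypothetical termination IDs and is not tied to any actual H.248 protocol stack:

```python
from dataclasses import dataclass, field

@dataclass
class Termination:
    """Simplified model of a Megaco/H.248 termination; IDs are hypothetical."""
    term_id: str
    permanent: bool                      # provisioned (TDM trunk) vs ephemeral (RTP flow)
    properties: dict = field(default_factory=dict)
    statistics: dict = field(default_factory=dict)

@dataclass
class Context:
    """A context associates terminations and acts as the mixing bridge."""
    context_id: int
    terminations: list = field(default_factory=list)

    def add(self, term: Termination) -> None:
        self.terminations.append(term)

# A two-party call: a provisioned TDM termination bridged to an ephemeral RTP one
ctx = Context(context_id=1)
ctx.add(Termination("ds0/1/1", permanent=True))
ctx.add(Termination("rtp/0001", permanent=False))
```

Bridging a PSTN leg to an IP leg thus amounts to placing one permanent and one transient termination in the same context.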
Sidebar: Terminations
All signals and events are assumed to occur at a specific termination and
they provide a mechanism for interacting with the remote entity
represented by that termination. Specific signals and events are defined
in packages. Examples of signals include tone generation, playing of
announcements, and the display of caller identity. Examples of events
include line off-hook, DTMF digit received, and fax tone detected.
Properties are defined within the Megaco/H.248 Protocol in two ways.
The term can be assigned to any piece of information that may be placed
into a descriptor in either a request or a response. The term can also
apply to package definition where properties act as state, configuration,
or other semistatic information regarding the termination to which the
package is attached.
Statistics can be accumulated at particular terminations and returned
from the Media Gateway (MG) to the Media Gateway Controller (MGC)
to provide information relevant to monitoring of the MG, network
performance or user activity. Statistics are also defined in packages.
Examples of statistics include, number of bytes sent and received while
in a context, duration of a termination in a context, packet loss rate and
other operational measurements.
References
ITU-T Recommendation H.225.0v4, Call Signalling Protocols and Media Stream Packetization for Packet Based Multimedia Communications Systems, International Telecommunication Union Telecommunication Standardization Sector (ITU-T), 2000
ITU-T Recommendation H.245v7, Control Protocol for Multimedia Communication, ITU-T, 2000
ITU-T Recommendation H.323v4, Packet Based Multimedia Communications Systems, ITU-T, 2000
RFC 2833, "RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals," IETF, http://www.ietf.org/rfc/rfc2833.txt
RFC 3551, "RTP Profile for Audio and Video Conferences with Minimal Control," IETF, ftp://ftp.rfc-editor.org/in-notes/rfc3551.txt
RFC 3550, "RTP: A Transport Protocol for Real-Time Applications," IETF, ftp://ftp.rfc-editor.org/in-notes/rfc3550.txt
RFC 2326, "Real Time Streaming Protocol (RTSP)," IETF, ftp://ftp.rfc-editor.org/in-notes/rfc2326.txt
The tel URI for Telephone Calls, IETF, http://www.ietf.org/internet-drafts/draft-ietf-iptel-rfc2806bis-02.txt
RFC 3261, "SIP: Session Initiation Protocol," IETF, http://www.ietf.org/rfc/rfc3261.txt
RFC 3515, "The Session Initiation Protocol (SIP) Refer Method," IETF, http://www.ietf.org/rfc/rfc3515.txt
RFC 3264, "An Offer/Answer Model with the Session Description Protocol (SDP)," IETF, http://www.ietf.org/rfc/rfc3264.txt
RFC 3265, "Session Initiation Protocol (SIP) - Specific Event Notification," IETF, http://www.ietf.org/rfc/rfc3265.txt
RFC 2976, "The SIP INFO Method," IETF, http://www.ietf.org/rfc/rfc2976.txt
RFC 3262, "Reliability of Provisional Responses in the Session Initiation Protocol (SIP)," IETF, http://www.ietf.org/rfc/rfc3262.txt
RFC 3311, "The Session Initiation Protocol (SIP) UPDATE Method," IETF, http://www.ietf.org/rfc/rfc3311.txt
IETF MMUSIC Working Group, IETF, http://www.ietf.org/html.charters/mmusic-charter.html
IETF SIP Working Group, http://www.ietf.org/html.charters/sip-charter.html
Chapter 10
QoS Mechanisms
Ralph Santitoro
(Chapter roadmap figure: the real-time protocol stack viewed from the application perspective — audio/video/voice codecs, RTP/RTCP and RTSP media transport, SIP, H.323 and H.248/MGCP/NCS session and gateway control — over QoS, resiliency, and packet technologies such as MPLS, ATM, FR, Ethernet, cable/DOCSIS and SONET/TDM)
Concepts Covered
Network convergence
Comparison of voice application over TDM and IP packet networks
Convergence drives the need for QoS mechanisms in packet
networks
Implementing QoS mechanisms versus adding bandwidth
Overview of QoS mechanisms – classifier, meter, marker, dropper,
shaper, scheduler
DiffServ QoS architecture
TOS and DiffServ Field Definitions and their importance in
determining the IP QoS a packet should receive.
Introduction
Quality of Service (QoS1) is a broad term used to describe the treatment an
application's traffic receives from the network. Quality of Service involves
a broad range of technologies, architecture, and protocols. Network
operators achieve end-to-end QoS by ensuring that network elements apply
consistent treatment to traffic flows as they traverse the network.
Today, network traffic is highly diverse and each traffic type has unique
requirements in terms of bandwidth, delay, loss and availability. With the
explosive growth of the Internet, most network traffic today is IP-based.
Having a single end-to-end transport protocol is beneficial because
networking equipment becomes less complex to maintain, resulting in
lower operational costs. This benefit, however, is countered by the fact that
IP is a connectionless protocol, that is, IP packets may take different paths
as they traverse the network from source to destination. This can result in
variable and unpredictable delay in a best-effort IP network.
The IP protocol was originally designed to reliably get a packet to its
destination with less consideration to the amount of time it takes to get
there. IP networks must now support many different types of applications.
Real-time applications, such as voice and video, require low latency
(delay) and loss. Otherwise, the end-user quality may be significantly
affected or in some cases, the application simply does not function at all.
Consider a voice application. Voice applications originated on public
telephone networks using Time Division Multiplexing (TDM) technology,
which has a very deterministic behavior. On TDM networks, the voice
1. QoS typically deals with the measurement of parameters associated with a specific treatment. For a long
time, Quality of Service was also used to indicate the overall experience of the user or application.
However, because the objective assurance of meeting a specific parameter can sometimes result in
different levels of overall quality, the term Quality of Experience (QoE) is now used to indicate the
overall experience. For example, meeting a specific delay or jitter objective on a network might be
thought of as QoS, while QoE would deal with the user's perception of voice quality on a network
with that amount of delay and jitter.
With the converged network, different types of traffic are mixed, and each
application often has very different performance requirements. These
different traffic types often react unfavorably together. For example, a voice
application expects to experience essentially no packet loss and a minimal,
constant amount of packet delay. The voice application operates in a
steady-state fashion with voice channels (or packets) being transmitted in
fixed time intervals. The voice application receives this performance level
when it operates over a TDM network. Now take the voice application and
run it over a best-effort IP network as Voice over IP (VoIP). The best-effort
IP network has varying amounts of packet loss and delay caused by the
amount of network congestion at any given point in time. The best-effort IP
network provides almost exactly the opposite performance required by the
voice application. Therefore, QoS technologies play a crucial role to ensure
that diverse applications can be properly supported in a converged IP
network.
Classifier
Packets entering an interface are classified based on some filtering criteria
specified by local or network-wide QoS policy. This is done to properly
identify the application for subsequent marking with the appropriate class
of service identifier (CoS marking), after which, the packets are sent to a
rate enforcer (policer). Classifiers may filter based on OSI Layers 2-7
information, although routers most commonly support classification based
on OSI Layers 2-4 criteria.
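Classification at Layers 3-4 amounts to an ordered rule match on the packet's header fields. A minimal sketch; all rule values (addresses, ports, class names) are hypothetical:

```python
def classify(pkt: dict, rules: list) -> str:
    """Return the class of service for the first matching rule; None fields
    in a rule act as wildcards. All rule values here are illustrative."""
    fields = ("src_ip", "dst_ip", "proto", "src_port", "dst_port")
    for rule in rules:
        if all(rule[f] is None or rule[f] == pkt.get(f) for f in fields):
            return rule["cos"]
    return "best-effort"   # no match: default class

rules = [
    # Voice media from a known gateway address, UDP, to a fixed RTP port
    {"src_ip": "10.0.0.5", "dst_ip": None, "proto": "udp",
     "src_port": None, "dst_port": 5004, "cos": "EF"},
]
cos = classify({"src_ip": "10.0.0.5", "proto": "udp", "dst_port": 5004}, rules)
```

Combining several fields in one rule, as here, is exactly the multi-layer filtering the next paragraph recommends for reducing misclassification.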
The classifier is also useful for security purposes and by applying multiple
filters in combination, for example Layers 2, 3 and 4 filters, one can
improve the likelihood that the application classified is not an unauthorized
application or user attempting to get better QoS than permitted by the
network’s application or user QoS policies. Real-time applications initiated
from fixed function hosts, such as voice gateways, are the simplest to
classify because their IP addresses are static and rarely changed. Mobile or
easily movable devices, for example IP phones, require more complex
classification and authentication techniques since their addresses are
typically dynamically assigned.
Finally, many real-time applications use dynamically assigned port
numbers, so care must be taken to select the right combination of filters to
properly identify the application. It is important to properly classify real-
time packets so they are marked properly. This ensures that the routers in
the network provide them with the appropriate QoS treatment.
Marker
Once IP packets are classified, they are marked to indicate the class of
service to which they belong. This marking is done in the DiffServ/TOS
field in the IPv4 packet header and the Traffic Class Octet in the IPv6
header. This marking is important because it will indicate to routers how
the packets should be treated across the network. If a real-time packet is
marked improperly, a router may introduce higher delay or jitter, or may
drop the packet (packet loss), under different network operating conditions.
The original definition of this field was referred to as the Type of Service
(TOS) field. In 1999, the Internet Engineering Task Force (IETF) created a
new QoS architecture called IP Differentiated Services (DiffServ) and
redefined the TOS field, now called the DiffServ field. Since the TOS
field definition has changed several times over the years, there is much
confusion surrounding this field's definition, so a bit of history is warranted.
defined for local or experimental use. Figure 10-4 shows the new TOS
field.
Policer (Meter/Remarker/Dropper)
Once the incoming packets are classified, the policer uses a configured rate
and burst size to determine which packets are conformant and which are
not. Depending upon the implementation, a router may have a single rate
policer, a dual rate policer, or a combination of both. In a single rate
implementation, there is a committed information rate (CIR), whereby
CIR-conformant packets are assured delivery. In a dual rate implementation,
there is a CIR and an excess information rate (EIR) that determines the
amount of excess (CIR non-conformant) traffic allowed into the network.
The dual rate policer is used for traffic that varies in packet size or
arrival time, that is, bursty traffic. The single rate policer is often
used for applications that transmit at regular intervals, e.g. VoIP.
If incoming packets are not CIR conformant, they can be either remarked
to indicate higher drop precedence for a dual rate policer with a non-zero
EIR or dropped outright for a single rate policer or dual rate policer with
EIR set to zero. Since voice traffic is sent at a constant rate, a single rate
policer is sufficient. Video traffic can use either a single or dual rate policer,
since some video applications send packets at a constant rate while others
send packets at a variable rate.
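A single rate policer is commonly described as a token bucket. The following Python sketch illustrates the idea with EIR set to zero (non-conformant packets are dropped outright); the CIR and burst values are illustrative, and real implementations add a separate drop-precedence remarking stage:

```python
class SingleRatePolicer:
    """Token-bucket sketch of a single rate policer: packets within the CIR
    and burst allowance are conformant; all excess is dropped (EIR = 0).
    Rates and sizes are illustrative, not a vendor implementation."""

    def __init__(self, cir_bps: float, burst_bytes: int):
        self.rate = cir_bps / 8.0        # token refill rate in bytes/second
        self.depth = burst_bytes         # bucket depth = committed burst size
        self.tokens = float(burst_bytes)
        self.last = 0.0

    def police(self, size_bytes: int, now: float) -> str:
        # Refill tokens for elapsed time, capped at the bucket depth
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return "conformant"
        return "drop"

p = SingleRatePolicer(cir_bps=64000, burst_bytes=1500)
```

A dual rate policer would simply add a second bucket filled at the EIR, remarking (rather than dropping) packets that overflow the first bucket but fit in the second.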
Shaper
A shaper provides a smoothing effect on bursty traffic so it can be delivered
more efficiently over lower speed interfaces. The policer provides some
shaping (buffering), and some routers implement a secondary shaper,
especially on WAN interfaces. Shaping effectively buffers (delays)
packets before they are sent. Therefore, the delay added by shaping must be
accounted for in the end-to-end delay budget for the real-time application.
Shaping is generally not recommended for real-time applications.
Scheduler
The scheduler determines how packets are queued out an interface. There
are two classes of schedulers, namely, priority schedulers (priority
queuing) and weighted schedulers. Priority schedulers simply continue transmitting
packets until their queues are empty, resulting in the least amount of packet
delay. Weighted schedulers transmit packets based on an assigned weight.
For example, the weight could indicate a percentage of time a queue is
emptied before the next queue is serviced. There are many forms of
weighted schedulers, for example, Weighted Fair Queuing (WFQ) and
Weighted Round Robin (WRR), as well as variants of these, for example,
Deficit WRR (DWRR) and Class Based WFQ (CBWFQ).
Voice applications should use a priority scheduler. Video applications
could use a priority scheduler. However, some weighted schedulers may
also be able to support video applications. Schedulers have a direct impact
on packet delay and jitter.
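The weighted behavior can be sketched in a few lines: each round, a queue may send up to its weight in packets. Queue names and weights below are illustrative, and deficit variants such as DWRR account for packet bytes rather than packet counts:

```python
from collections import deque

def wrr(queues: dict, weights: dict, rounds: int) -> list:
    """Weighted round-robin sketch: per round, each queue may transmit up to
    its weight in packets. Queue names and weights are illustrative."""
    sent = []
    for _ in range(rounds):
        for name, q in queues.items():
            for _ in range(weights[name]):
                if q:                       # transmit only if the queue is non-empty
                    sent.append(q.popleft())
    return sent

queues = {"video": deque(["v1", "v2", "v3"]), "data": deque(["d1", "d2", "d3"])}
order = wrr(queues, {"video": 2, "data": 1}, rounds=3)
```

With weights 2:1 the video queue drains roughly twice as fast as the data queue, which is why a weighted scheduler can bound, but not minimize, the delay of a real-time class.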
Queue Management
Queue management determines how queued packets are handled as the
number of packets in a queue increases. The queue becomes fuller as more
packets from multiple sources traverse the same interface. There are two
basic forms of queue management, namely, tail drop and active queue
management (AQM). Tail drop simply drops arriving packets when the
buffer is full (or when a provisioned buffer depth is exceeded). AQM
randomly drops discard eligible packets when one or more buffer depths
are exceeded. Examples of AQM are random early discard (RED),
weighted RED (WRED) and multilevel RED (MRED).
Queue management has an effect on packet loss. AQM methods, in
general, are not recommended for real-time applications unless the
application can detect packet loss and readjust its transmission rate. For
example, some video applications can detect packet loss and switch to a
lower bit rate codec. When packet loss is no longer detected, the video
application can then switch back to the higher bit rate (higher quality)
codec. AQM must never be used with voice applications.
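The drop decision in RED, the canonical AQM method, is driven by the average queue depth; a sketch with illustrative thresholds:

```python
import random

def red_drop(avg_qlen: float, min_th: float, max_th: float, max_p: float) -> bool:
    """RED sketch: never drop below min_th, always drop at or above max_th,
    and drop with linearly increasing probability in between.
    Threshold values are illustrative."""
    if avg_qlen < min_th:
        return False
    if avg_qlen >= max_th:
        return True
    drop_p = max_p * (avg_qlen - min_th) / (max_th - min_th)
    return random.random() < drop_p
```

Weighted variants (WRED, MRED) simply run this curve with different thresholds per drop precedence, which is how they respect the markings applied by a dual rate policer.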
As you can see, there are many QoS mechanisms that can be used, and each
has a different impact on delay, jitter and loss. Each QoS mechanism must
be tailored to the real-time application. The following sections will cover
this in more detail.
  CS Code Point Name    CS Code Point Value (in binary)
  CS7                   '111000'
  CS6                   '110000'
  CS5                   '101000'
  CS4                   '100000'
  CS3                   '011000'
  CS2                   '010000'
  CS1                   '001000'
  CS0                   '000000'
Table 10-1: Class Selector PHB group DSCP values
decimal), which is quite different from the eight bit value created by the
router. While the Windows approach may be masked by the application, the
application developer must keep these differences in mind to ensure that
Windows-based real-time applications are properly marked with the correct
DSCP value.
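The numeric relationship that trips applications up is that the 6-bit DSCP occupies the upper six bits of the old TOS octet, so the DSCP value and the byte on the wire differ by a two-bit shift. A small Python sketch:

```python
def dscp_to_tos_byte(dscp: int) -> int:
    """The 6-bit DSCP occupies the upper six bits of the old TOS octet, so
    the byte seen on the wire is dscp << 2 (with the two ECN bits zero)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value")
    return dscp << 2

# Expedited Forwarding (DSCP 46, '101110') appears on the wire as 184;
# Class Selector 5 (DSCP 40, '101000') appears as 160
ef_byte = dscp_to_tos_byte(46)
cs5_byte = dscp_to_tos_byte(40)
```

An application that writes the DSCP value where the full octet is expected (or vice versa) will therefore mark its packets into the wrong class.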
VLAN ID field
The VLAN ID is used to group certain types of traffic based on common
requirements. Packets marked with a particular VLAN can be classified
and the appropriate QoS mechanisms applied, for example, all packets in
VLAN 10 are IP telephony packets and are given DiffServ Expedited
Forwarding treatment.
Three bit priority field (802.1p): value from 0-7 representing user priority
Port Prioritization
When a VoIP gateway or IP PBX is installed in the network, it typically is
assigned a static IP address and connects to a specific port on the Ethernet
Layer 2 switch that rarely, if ever, gets changed. In this application, the L2
switch is configured to receive and transmit all incoming traffic from this
port, ahead of all other traffic entering other switch ports. If the next hop
device is a router, then it can classify and mark the voice packets with the
appropriate DSCP value for use by other routers across the network.
If the device attached to the Ethernet Layer 2 switch port is an IP phone,
then port prioritization is not recommended since IP phones may be
unplugged and moved and another user may attach to the port and receive
inappropriate or unauthorized QoS. For example, if a PC were connected to
a port configured to use port prioritization, then all of the PC's traffic
would be given high priority treatment in the switch. See Figure 10-6 for a
port prioritization example.
IP Address Prioritization
VoIP traffic can also be prioritized by its IP address. This approach is ideal
for devices with statically assigned IP addresses that rarely, if ever, change.
IP PBXs, VoIP gateways and call servers are VoIP devices that would have
their IP addresses statically assigned. A network administrator can
configure the routers to filter (classify) and prioritize all packets originating
from or destined to these IP addresses. See Figure 10-8 for an example of
IP address prioritization.
References
RFC 3246, "An Expedited Forwarding PHB," IETF, http://www.ietf.org/rfc/rfc3246.txt
RFC 3168, "The Addition of Explicit Congestion Notification (ECN) to IP," IETF, http://www.ietf.org/rfc/rfc3168.txt
RFC 2597, "Assured Forwarding PHB Group," IETF, http://www.ietf.org/rfc/rfc2597.txt
RFC 1349, "Type of Service in the Internet Protocol Suite," IETF, http://www.ietf.org/rfc/rfc1349.txt
RFC 2475, "An Architecture for Differentiated Services," IETF, http://www.ietf.org/rfc/rfc2475.txt
RFC 2474, "Definition of Differentiated Services Field (DS Field) in IPv4 and IPv6 Headers," IETF, http://www.ietf.org/rfc/rfc2474.txt
RFC 2686, "Multi-Class Extensions to Multilink PPP," IETF, http://www.ietf.org/rfc/rfc2686.txt
RFC 1990, "PPP Multilink Protocol (MP)," IETF, http://www.ietf.org/rfc/rfc1990.txt
ATM Forum Traffic Management Specification v4.1, ftp://ftp.atmforum.com/pub/approved-specs/af-tm-0121.000.pdf
IEEE 802.1Q, Virtual Bridged Local Area Networks, http://standards.ieee.org/getieee802/download/802.1Q-2003.pdf
Introduction to Quality of Service (QoS), http://www.nortelnetworks.com/products/02/bstk/switches/bps/collateral/56058.25_022403.pdf
RFC 3260, "New Terminology and Clarifications for DiffServ," IETF, http://www.ietf.org/rfc/rfc3260.txt
Section IV:
Packet Network Technologies
Earlier sections dealt with the requirements of real-time applications and
the ways that TDM and SONET handle these demands. You also learned
about the protocols that handle the transport, call setup, and flow
priority management needed for real-time packet networking. Section IV
covers various transport and access technologies that can be used to
provide differentiated service to your network traffic. It is important
in a converged network environment that the network operator understands
the media and protocols that underlie network operation and that help
transfer and maintain Quality of Service (QoS). The selection of
technologies and protocols should provide a seamless fabric from one end
of the network to the other.
The section begins by describing the incumbent technologies
Asynchronous Transfer Mode (ATM) and Frame Relay. These technologies
have a large installed base and are thus very important in providing real-
time services over converged networks. ATM was designed to provide a
high degree of QoS and inherently ensures packets arrive within their QoS
bounds. Frame relay, which came into popularity just prior to ATM, has
continued to extend its functionality through the MPLS/Frame Relay
Alliance (formerly known as the Frame Relay Forum) to aid in providing
real-time differentiated services. While the specifics of these technologies
fill volumes, we will offer a brief description of the basic characteristics of
ATM and Frame Relay, and will then focus our attention on their real-time
capabilities.
While ATM and FR technologies continue to be important, today's
networks are migrating to IP/Multiprotocol Label Switching (MPLS)
cores. Chapter 12 addresses MPLS, providing the basic concepts and
functions that help MPLS provide real-time service. The chapter introduces
MPLS concepts associated with Label Switch Path (LSP) creation, how these
paths are set up using MPLS signaling protocols, label stacking, and the
integration of DiffServ with the MPLS EXP (experimental) bits.
Chapter 13 is about Optical Ethernet (OE), a relatively new technology that
is growing in prominence. OE combines a Layer 2 protocol (Ethernet) with
Layer 1 protocols like SONET and Dense Wave Division Multiplexing
(DWDM). OE has the ability to ride over long-haul fiber, as well as a
number of added extensions that allow it to emulate ordinary Ethernet
LANs. It also incorporates some of the redundancy functionality associated
with optical networks, such as Resilient Packet Ring.
Chapter 11
ATM and Frame Relay
Sinchai Kamolphiwong
Shardul Joshi
Timothy Mendonca
Concepts Covered
Layered protocol
ATM interfaces
ATM architecture
ATM adaptation layer
QoS and services in ATM networks
Introduction
The ITU (International Telecommunication Union) has chosen ATM
(Asynchronous Transfer Mode) as the switching and multiplexing
technology for carrying all signals in a high speed network. To support
such goals, network architecture was moved away from circuit switching to
a range of packet switching systems. ATM uses fixed-length packets, called
cells, and is a virtual connection-oriented system. The cell length of 53
bytes (a five byte header plus a 48 byte information field) is an engineering
compromise to accommodate the conflicting requirements of a whole range of
traffic types, be it computer data or real-time traffic such as voice or video.
An ATM network consists of a set of ATM switches to multiplex/
demultiplex traffic streams. Each ATM switch is connected by point-to-
point ATM links. Each ATM link can accommodate several VPs (virtual
paths) and each VP may comprise a number of VCs (virtual channels). This
allows the aggregation of dissimilar types of traffic streams to be
accomplished in one ATM link.
Frame relay is a connection-oriented protocol that is a precursor to ATM.
The sections in this chapter that address frame relay will not go into the
basics of the technology, but will examine the evolution of the frame relay
protocol to ensure its viability in tomorrow's networks. More specifically,
the chapter will evaluate the use of the MPLS/Frame Relay Alliance
(formerly known as the Frame Relay Forum) Implementation Agreements:
FRF.11.1, the Voice over Frame Relay Agreement, and FRF.12, the Frame
Relay Fragmentation Agreement. The chapter evaluates how the two
specifications work in isolation and how efficiently they work together.
The main advantages of ATM are as follows:
ATM is a connection-oriented network, with each connection setup
associated with its QoS (quality of service) requirements, for
example, delay, loss and cell delay variation. With a high
guaranteed QoS offered by ATM, real-time communications, for
example, voice and video traffic, are suitable for carriage over ATM
networks, as shown in Figure 11-2.
With ATM, the incoming traffic channels are aggregated using
statistical multiplexing into one communication link. High system
utilization is easily obtained.
ATM provides multipriority services for multiple traffic types.
ATM offers the opportunity for all traffic sources to use resources
fairly, regardless of distance and the number of connections.
Figure 11-2: Dissimilar traffic streams (voice, data, video and telephone channels) multiplexed onto one link through an ATM switch
Layered protocol
To compare ATM to other protocols, OSI (Open Systems Interconnection)
is an appropriate model to use as a reference. The ATM layer is above the
physical layer (Layer 1), and provides transport functions required for the
switching and flow control of ATM cells. In this context, “transport” refers
to the use of ATM switching and multiplexing techniques at the data link
layer (Layer 2 of the OSI model), as shown in Figure 11-3, to
convey end-user traffic from source to destination within a network.
Figure 11-3: ATM relative to the seven-layer OSI reference model (application, presentation, session, transport, network, data link and physical layers)
popular operating systems. Recently, ATM has often been used as a link-
layer technology for both local and international regions of the Internet. A
special AAL (ATM adaptation layer) type, called AAL-5, has been
developed to allow TCP/IP to interface with ATM, as shown in Figure 11-
5. Fundamentally, the network layer sees ATM as a data link protocol. At
the IP-ATM interface, AAL-5 prepares ATM transport for IP datagrams.
Figure 11-5: TCP/IP over ATM (application layer protocols such as HTTP and FTP, over TCP or UDP, with AAL-5 adapting IP to the ATM and physical layers)
ATM interfaces
The ATM standard defines two main types of interface in ATM networks:
User-to-Network Interface (UNI)
Network-to-Network Interface (NNI)
User-to-network interface
ATM can be used within both a private network and a public network,
referred to as private User-to-network interface (UNI) and public UNI,
respectively, as shown in Figure 11-6. The public UNI is used to
interconnect an ATM user (or ATM terminal) with an ATM switch
deployed in a public service provider's network, while the private UNI is
used to interconnect an ATM user with an ATM switch that is managed by
a private organization, for example, a computer center. Both UNIs share an
ATM layer specification, but may utilize different physical media. The
primary distinction between these two UNIs is physical reach. There are
also some functional differences between the public and private UNIs, due
to the requirements associated with each interface. For example, the
administrative function of private UNI for all domains in an organization,
may follow a local management scheme, which may not seriously consider
interconnection issues.
Network-to-network interface
The Network-to-network (NNI) is the interface between ATM switches.
There are two types of NNI, public NNI and private NNI. The public NNI,
also known as the Broadband Intercarrier Interface (BICI), defines an
interconnect interface between public ATM switches. The private NNI
Figure 11-6: ATM over private and public UNI and NNI
ATM architecture
The B-ISDN protocol reference model has been defined in ITU-T
Recommendation I.121, as shown in Figure 11-7. The model contains both
horizontal and vertical structures. The horizontal layer consists of four
main layers:
Higher layer that specifies functions for applications.
ATM Adaptation Layer (AAL). The AAL is concerned with a number of
processes necessary to transform the user data stream into a format
suitable for ATM, such as the segmentation/reassembly of higher
layer Protocol Data Units (PDUs) into ATM cells. The AAL is
divided into two sublayers, the convergence sublayer (CS) and the
segmentation and reassembly sublayer (SAR).
ATM layer specifies the ATM structure and functions at the cell
level (more details of AAL and ATM layers will be given in the
next section).
Physical layer specifies media technology dependent issues.
Figure 11-7: B-ISDN protocol reference model (higher layer, AAL, ATM layer, and a physical layer comprising a transmission convergence sublayer and a physical media-dependent sublayer)
ATM layer
There are two different formats of ATM cell header, one for use at the
User-to-Network Interface (UNI), and the other one for use at the Network-
to-Network Interface (NNI), as shown in Figure 11-8.
Figure 11-8: ATM cell headers: (a) UNI (User-Network Interface) and (b) NNI (Network-Network Interface). GFC = generic flow control; VPI = virtual path identifier; VCI = virtual circuit identifier; PT = payload type; CLP = cell loss priority; HEC = header error control.
At the UNI, the header contains a four bit generic flow control (GFC) field,
a 24 bit label field containing virtual path identifier (VPI) and virtual
channel identifier (VCI) subfields (eight bits for the VPI and sixteen bits
for the VCI), a three bit payload type (PT) field, a one bit cell loss
priority (CLP) field, and an eight bit header error check (HEC) field. The
cell header for an NNI cell is identical to that for the UNI cell, except
that it lacks the GFC field; these four bits are used for an additional four
VPI bits in the NNI cell header, as shown in Figure 11-8 (b).
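These field positions can be made concrete by packing a header in code. A sketch assuming the standard 40-bit UNI layout (4-bit GFC, 8-bit VPI, 16-bit VCI, 3-bit PT, 1-bit CLP, 8-bit HEC); the field values used are illustrative:

```python
def pack_uni_header(gfc: int, vpi: int, vci: int, pt: int, clp: int) -> bytes:
    """Pack a 5-byte ATM UNI cell header. The HEC byte is left at zero here;
    a real implementation computes a CRC-8 over the first four bytes."""
    word = (gfc << 28) | (vpi << 20) | (vci << 4) | (pt << 1) | clp
    return word.to_bytes(4, "big") + b"\x00"   # 4 header bytes + HEC placeholder

hdr = pack_uni_header(gfc=0, vpi=7, vci=3, pt=0, clp=0)
```

The NNI variant would simply widen the VPI shift to 12 bits in place of the GFC.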
Figure 11-9: Virtual channels (VCs) multiplexed within virtual paths (VPs) on an ATM link
cell headers are examined by the switch to determine which output port should
be used to forward the cell. In the process, the switch translates the VPI and
VCI values of the original cell (received on the input port) into new
outgoing VPI and VCI values, which are used in turn by the next ATM
switch to send the cell toward its intended destination. The table used to
perform this translation is initialized during the establishment of the call.
An ATM switch may either be a VP switch, in which case it translates only
the VPI values contained in cell headers (as shown in Figure 11-10), or it
may be a VP/VC switch, in which case it translates the incoming VPI/VCI
pair into an outgoing VPI/VCI pair (as shown in Figure 11-10). In a VP
switch, the VCI values carried within a particular VP are left unchanged
even though the VPI value of the VP itself may change. Since VPI and VCI
values do not represent a unique end-to-end virtual connection, they can be
reused at different switches through the network. The VPI and VCI are local
labels between each switch pair for a given connection. This is important,
because the VPI and VCI fields are limited in length and would be quickly
exhausted if they were used simply as destination addresses.
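The translation described above amounts to a table lookup keyed on the incoming port and labels. A minimal Python sketch, with made-up table entries, illustrates the per-cell operation of a VP/VC switch:

```python
# Illustrative switching table, populated at call setup:
# (in_port, in_vpi, in_vci) -> (out_port, out_vpi, out_vci)
switch_table = {
    (1, 7, 3): (3, 5, 9),
    (1, 7, 4): (3, 5, 10),
}

def forward_cell(in_port: int, vpi: int, vci: int):
    """Translate the incoming labels into the outgoing port and labels."""
    entry = switch_table.get((in_port, vpi, vci))
    if entry is None:
        return None  # no connection established for these labels: drop the cell
    return entry
```

A VP switch would work the same way, but key the table on (port, VPI) only and leave the VCI untouched.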
[Figure: ATM VP switching — translation tables at each switch map VPI-in 7 to VPI-out 5, VPI-in 9 to VPI-out 7, and VPI-in 7 to VPI-out 3; the VCI values (3, 4) within each VP are carried unchanged]
[Figure: AAL structure — the convergence sublayer (CS), with service-specific (SSCS) and common part (CPCS) components, above the SAR sublayer and the ATM layer]
AAL structure
The user data (for example, user-PDU) from a higher layer is first
encapsulated in a common part convergence sublayer protocol data unit
(CS-PDU) in the convergence sublayer (CS), as shown in Figure 11-13. In
this sublayer, the CS-PDU header and trailer are added. Typically, the
CS-PDU size is much too large to fit into the payload of a single ATM cell.
[Figure 11-13: AAL processing — a user-PDU is encapsulated with a CS-PDU header and trailer in the convergence sublayer, segmented into 48-byte SAR-PDUs in the SAR sublayer, and carried as ATM cell payloads (cell header plus payload) in the ATM layer, above the physical layer]
AAL-0
AAL-0 is the null function (CS and SAR are each an empty function). Cells
from the higher layer are transferred, through the AAL-0 service interface,
directly to the ATM layer service.
AAL-1
AAL-1 has been standardized by both the ITU-T and ANSI since 1993, and
is incorporated in the ATM Forum specifications for circuit emulation
services (CES). AAL-1 supports constant bit rate (CBR) services with a
fixed timing relation between source and destination, that is, synchronous
traffic (for example, uncompressed voice). The AAL-1 service is offered by
most ATM equipment manufacturers. AAL-1 provides the following
services to the AAL user:
Transfer of service data units at a constant source bit rate and their
delivery at the same bit rate
Transfer of timing information between source and destination;
explicit timing indication is provided by inserting a timestamp in the
CS-PDU
Source clock recovery at the receiver by monitoring the buffer
filling (if needed)
Detection of lost or misinserted cells
The SAR sublayer defines a 48-byte protocol data unit (SAR-PDU). The
SAR-PDU carries 47 bytes of user data (of which one byte can be used
for a pointer), four bits for a sequence number (SN), and four bits for
sequence number protection (SNP). The SNP field carries a CRC
(cyclic redundancy check) value to detect errors in the SN field.
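A sketch of the AAL-1 SAR-PDU construction follows. Per ITU-T I.363.1, the SNP nibble is a 3-bit CRC (generator x³ + x + 1) over the SN field plus an even parity bit; this Python version is illustrative only:

```python
def crc3(nibble: int) -> int:
    """CRC-3 with generator x^3 + x + 1 (0b1011) over the 4-bit SN field."""
    reg = nibble << 3              # append three zero bits
    for bit in range(6, 2, -1):    # long division, MSB first
        if reg & (1 << bit):
            reg ^= 0b1011 << (bit - 3)
    return reg & 0b111

def aal1_sar_pdu(sn: int, payload: bytes) -> bytes:
    """Build a 48-byte AAL-1 SAR-PDU: 1-byte SN/SNP header + 47 bytes of data."""
    assert 0 <= sn < 16 and len(payload) == 47
    snp = crc3(sn) << 1
    # even parity bit over the seven preceding bits (SN + CRC)
    snp |= bin((sn << 4) | snp).count("1") & 1
    return bytes([(sn << 4) | snp]) + payload
```

The receiver recomputes the CRC and parity over the SN field to detect lost or misinserted cells via sequence-number errors.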
AAL-2
AAL-2 is designed for a service to handle variable bit rate (VBR). AAL-2
needs a different mechanism from AAL-1, since VBR traffic behavior
differs from CBR. Due to a variable bit rate of VBR traffic, cell interval
time is not a constant value. Maximum delay time must be defined in the
AAL-2 packetization mechanism.
As stated above, VBR traffic is not a constant bit rate. There is a chance
that the SAR-PDU may not be filled during a particular time interval. The
delay may be large if the SAR-PDU waits until the payload is completed.
So, the SAR-PDU contains the following information, six bits of SAR-
PDU Header, sixteen bits of SAR-PDU Trailer, and 362 bits of SAR-PDU
Payload. Based on a new service requirement in the AAL layer, the ATM
Forum and ITU -T Study Group (SG) 13 discusses a new AAL-2 to provide
efficient transport of low bit rate voice that allows a very small transfer
delay across the network.
AAL-3/4
Originally, AAL-3 and AAL-4 were separate. AAL-3 was intended to
support connection-oriented service over ATM, while AAL-4 supported
connectionless operation. However, the rest of their functions were similar,
so AAL-3 and AAL-4 were combined. As a result, AAL-3/4 supports both
connection-oriented and connectionless services. The SAR-PDU consists of
sixteen bits of header, sixteen bits of trailer, and 44 bytes of payload.
AAL-5
AAL-5 is a low-overhead AAL that is mainly used to transport IP
datagrams over ATM networks. With AAL-5, the CS-PDU has no header;
only a trailer is added. As with other AALs, the SAR function segments
the CS-PDU into blocks of 48 bytes, but no SAR overhead is added in
AAL-5.
The PAD field ensures that the CS-PDU is an integer multiple of 48 bytes.
The length field identifies the size of the actual CS-PDU payload, so that
the user data can be retrieved at the destination.
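The AAL-5 padding, length, and trailer handling can be sketched as follows. The standard library CRC-32 is used as a stand-in (the exact AAL-5 CRC-32 differs in bit-ordering details), and the one-byte UU and CPI trailer fields are set to zero:

```python
import zlib

def aal5_cs_pdu(user_pdu: bytes, uu: int = 0, cpi: int = 0) -> bytes:
    """Build an AAL-5 CS-PDU: payload + PAD + 8-byte trailer (UU, CPI, Length, CRC-32).
    The PAD makes the whole PDU an integer multiple of 48 bytes."""
    pad_len = (-(len(user_pdu) + 8)) % 48
    body = (user_pdu + bytes(pad_len) + bytes([uu, cpi])
            + len(user_pdu).to_bytes(2, "big"))
    # zlib.crc32 is a stand-in; the AAL-5 CRC-32 bit ordering differs in detail.
    return body + zlib.crc32(body).to_bytes(4, "big")

def segment(cs_pdu: bytes):
    """SAR: cut the CS-PDU into 48-byte cell payloads (no SAR overhead in AAL-5)."""
    return [cs_pdu[i:i + 48] for i in range(0, len(cs_pdu), 48)]
```

At the destination, the length field lets the receiver strip the PAD and trailer and recover exactly the original user data.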
From a service point of view, AAL-3/4 and AAL-5 offer the same layer
functionality. The main differences between these two service types are as
follows:
AAL-5 performs minimal error control compared with AAL-3/4
AAL-5 does not offer a multiplexing capability
AAL-5 is the most widely used AAL. Currently, AAL-5 offers a service for
the transport of IP traffic, and a frame relay service. AAL-5 is also being
considered for possible use to transport real-time multimedia information
over ATM.
Figure 11-14: Bandwidth usage of the four service classes in ATM networks
The VBR and CBR classes are higher-priority classes used to transport
real-time or high-quality audio and video data. The CBR and VBR services
guarantee the negotiated throughput (and therefore the necessary
bandwidth), the maximum cell delay, and the delay variance. Hence, a switch first
allocates link bandwidth to these classes. The remaining bandwidth, if any,
is given to ABR and UBR traffic. To enable the ABR service to function
effectively, a suitable closed-loop flow control mechanism must be
implemented. To that end, the ATM Forum proposed a rate-based flow
control scheme, which is a closed-loop control mechanism. With the rate-
based scheme, the network controls the transmission rate of the sources to
maximize the network performance. Thus, at times when resources are
plentiful, the network will allow a source to increase its rate of
transmission, but at other times, when the traffic is heavy, the source rate
will be throttled to a safe value. In contrast, the UBR service has no flow
control mechanism (it is open loop) and does not specify traffic-related
service guarantees, but may be subject to local policy in individual
switches and end systems. It is a “best effort” service.
The characteristics of the four ATM services are summarized in Table 11-
1.
Table 11-2: The traffic and QoS parameters for the ATM service classes
However, some QoS parameters are not negotiated during connection
setup, for example, Cell Error Ratio (CER), Severely Errored Cell Block
Ratio (SECBR), and Cell Misinsertion Rate (CMR).
The QoS parameter CDV should not be confused with the connection
traffic parameter CDVT. Even though CDV is a QoS parameter, it is not
used for negotiation. CDV is introduced by cell multiplexing when cells
from two or more connections are multiplexed (to the same output
channel). Cells of a given channel may be delayed while cells of other
channels are being inserted at the output of the multiplexer. In practice, the
upper bound of CDV is expressed by the CDVT. The value of CDVT is
chosen such that the output cell flow conforms to a bandwidth enforcement
mechanism.
A user of an ATM connection (a VCC or a VPC) is provided with one of a
number of QoS classes supported by the network. At connection
establishment, virtual channels and virtual paths are assigned. If any of
these steps cannot be completed, an alternative path selection process is
performed. If no alternative path is found, the call is rejected.
After the CAC process is complete (see Figure 11-15) and the requested
path has been created, data may be sent over this communication channel.
To enforce the user contract agreed upon during the CAC, the cell-level
process monitors and polices traffic according to the contract parameters.
A violating user may cause the established channel to be rejected. To that
end, the Generic Cell Rate Algorithm (GCRA) is used to classify each
arriving cell as either conforming or nonconforming. A conforming cell is
admitted, while a nonconforming cell may be discarded. A widely used
mechanism to shape the traffic is the 'leaky bucket,' as shown in
Figure 11-16.
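The GCRA can be stated compactly in its virtual-scheduling form: a cell arriving earlier than the theoretical arrival time minus the tolerance is nonconforming. A Python sketch (time units are arbitrary ticks):

```python
class GCRA:
    """Virtual-scheduling form of the Generic Cell Rate Algorithm.
    T is the cell emission interval (1/rate); tau is the tolerance (CDVT)."""
    def __init__(self, T: float, tau: float):
        self.T, self.tau = T, tau
        self.tat = 0.0          # theoretical arrival time of the next cell

    def conforms(self, t: float) -> bool:
        if t < self.tat - self.tau:
            return False        # cell arrived too early: nonconforming
        self.tat = max(t, self.tat) + self.T
        return True
```

With T = 10 and tau = 2, cells arriving every 10 ticks all conform, while a cell arriving more than 2 ticks ahead of schedule is flagged for discard or tagging.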
[Figure 11-15: Connection establishment flow — at the network level, a call setup request triggers path bandwidth and alternate path selection (with call rejection on failure); at the call level, the CAC (Call Admission Control) checks agreement with network load conditions and the load condition of trunks in the path, then performs the path generation, link allocation, and VC/VP assignment phases, leading to connection establishment; cell-level traffic policing follows]
[Figure 11-16: Leaky-bucket traffic shaping at the UNI — traffic sources pass through the AAL and ATM layers into a multiplexer; cells wait in a cell buffer for tokens from a token generator (continuous-state leaky bucket) before crossing the UNI to the ATM switch]
Figure 11-18: Voice and telephony multiplexing over ATM using AAL-2
FRF.11.1
As mentioned above, FRF.11.1 is the Voice over Frame Relay
Implementation Agreement. This specification deals with a number of
different concepts. This section concentrates on the primary pieces
concerning transport of voice within the frame relay payload, support of
various codecs, and the effective utilization of low-bandwidth connections.
Codecs
The supported codecs are G.729, G.728, G.723.1, G.726/G.727, and G.711.
Applications of fragmentation
There are three applications for fragmentation defined in the specification:
Locally across a frame relay UNI interface between the DTE-DCE
peers
Locally across a frame relay NNI interface between the DCE peers
End-to-end between two frame relay DTE peers
Frame relay User-to-Network Interface (UNI) and frame relay Network-to-
Network Interface (NNI) fragmentation share the same data fragment
format, so the header information for the first two types of fragmentation
matches. It consists of a two-octet header that precedes the frame relay
header information.
The header information for end-to-end fragmentation is somewhat more
complex. It contains the same information as the UNI and NNI
fragmentation schemes, but adds further information to the header. This
includes the network layer protocol ID (NLPID), which is assigned to
identify this fragmentation header format and the data content, and the
unnumbered information (UI) bits, which are associated with multiprotocol
encapsulation.
Note: Some customers may have a requirement to communicate via
frame relay to a web site that is not their own and does not support
fragmentation. In these cases the customer may not be willing to
implement fragmentation.
UNI fragmentation takes place locally between the customer edge device
and the provider edge device. Since fragmentation is local to the
interface, the network can take advantage of the higher internal trunk
speeds by transporting the complete frames, which is more efficient than
transporting a larger number of smaller fragments. UNI fragmentation is
also useful when there is a speed mismatch between the two DTEs at the
ends of a VC (virtual circuit).
[Figure 11-19 shows branch offices in Boston, St Louis, Seattle, and Houston connected over a frame relay network to a new campus site in Los Angeles; the branches use fractional T-1 access (512 Kb, 384 Kb, 256 Kb, and 128 Kb) with an SRG at each site, while Los Angeles has full T-1 (1.5 Mb) access and a Succession 1000E]
Figure 11-19: Typical frame relay network with centralized switching
Policing is a similar function performed at the ingress point of the frame
relay network. Its function is to determine whether traffic is within, or
exceeds, the specified contract. If traffic exceeds the contract, the network
has the option of either marking it Discard Eligible (DE) or discarding it.
If the customer premises equipment does not successfully shape the traffic,
it may be discarded, affecting the voice QoE.
The fourth best practice is referred to as pacing. Pacing spreads packets
evenly over the time interval. This addresses the problem where the high-
speed end of a circuit (in this case Los Angeles) has a large time interval
in which a number of data packets may be transmitted ahead of voice
packets. These data packets will then get ahead of voice packets at the
egress point of the network (for instance, at Houston), which will in turn
create jitter.
The fifth and sixth potential problems in frame relay have to do with the
number of PVCs you allocate in the network, and how they are architected.
Most customers who implement frame relay for VoIP and multimedia
applications have already implemented it for data. In data networks, it is
common to build the network with centralized switching (the fifth potential
problem), which has a single PVC (the sixth potential problem) from each
site into the main site for all applications (see Figure 11-19). This is done
to minimize the number of PVCs, which in turn reduces the cost
substantially. Note that in a data network, this has very little impact.
Figure 11-20: Frame relay network with full mesh, separate PVCs for voice
and T-1 access
When implementing real-time protocols, especially VoIP, it is
recommended that the frame relay network be built as a full mesh, as
depicted in Figure 11-20. A full mesh has two characteristics that make it a
better solution for VoIP applications. First, it provides more bandwidth in
the network, so that traffic from different locations does not have to
contend for bandwidth and priority with traffic from other sites. Second, it
eliminates the additional serialization and queuing delays of the centralized
switching approach.
The sixth potential problem in a frame relay network occurs when a single
DLCI is allocated for all applications. Since frame relay has no real QoS
mechanisms, this increases the chance of poor performance for real-time
protocols. It is generally recommended that at a minimum, VoIP be
allocated its own DLCI. Although this will cost more, it is the only way to
truly assure that voice will get its bandwidth with the quality that is
expected. (see Figure 11-20.)
The seventh potential problem has to do with the access speed of the
circuit. Customers will often get a frame relay circuit from the carrier with
a CIR of 256 Kb and assume that the CIR is the only thing that affects
performance. In a data network, this is largely true. However, in a real-time
network, the access speed also comes into play. For instance, a 256 Kb CIR
circuit will perform better on a full T-1 access link than on a fractional T-1
(FT-1) link. The quicker the egress point of the network can offload
packets, the less chance there is of jitter problems due to the additional
serialization delay introduced by a fractional T-1. In other words, on a full
T-1 the clock speed is 1.5 Mbps, whereas on an FT-1 with four DS0s
(256 K) allocated to the 256 K CIR, the clock speed is 256 K.
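The serialization-delay argument is easy to quantify. Assuming a worst-case 1500-byte data frame queued ahead of a voice packet, and a T-1 payload rate of 1.536 Mbps (24 DS0s) versus four DS0s at 256 kbps:

```python
def serialization_delay_ms(frame_bytes: int, link_bps: float) -> float:
    """Time to clock one frame onto the wire at the access link speed."""
    return frame_bytes * 8 / link_bps * 1000

# A 1500-byte data frame ahead of a voice packet in the egress queue:
full_t1 = serialization_delay_ms(1500, 1_536_000)   # full T-1 access
ft1 = serialization_delay_ms(1500, 256_000)         # four DS0s (256 K)
```

The full T-1 clocks the frame out in roughly 8 ms, while the fractional link takes roughly 47 ms, a difference that shows up directly as jitter for the voice packet behind it.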
References
K. Asatani and S. Nogami, “Standardization of Network Technologies and
Services,” IEEE Communications Magazine, pp.82-90, August 1995.
S.L. Sutherland and J. Burgin, “B-ISDN Internetworking,” IEEE
Communications Magazine, pp.60-63, August 1993.
F.A. Tobagi, “Fast Packet Switch Architectures for Broadband Integrated
Services Digital Networks,” Proceedings of the IEEE, V.78, N.1, pp.133-166,
January 1990.
ITU-T Recommendation I.150, B-ISDN Asynchronous Transfer Mode
Functional Characteristics, International Telecommunication Union
Telecommunication Standardization Sector (ITU-T), 1993.
R. Händel, M. N. Huber, and S. Schröder, ATM Networks: Concepts,
Protocols, Application, 2nd Edition, Addison-Wesley Pub. Co., Inc., 1994.
K. Sriram, “Methodologies for Bandwidth Allocation, Transmission
Scheduling, and Congestion Avoidance in Broadband ATM Networks,”
Computer Networks and ISDN Systems, V.26, pp.43-59, 1993.
K. Genda, N. Yamanaka, Y. Arai, and H. Kataoka, “A High-Speed-Retry
Banyan Switch Architecture for Giga-Bit-Rate BISDN Networks,”
Communication System, V.7, pp.223-229, 1994.
G. Gallassi, G. Rigolio, and L. Fratta, “Broadband Assignment in Prioritized
ATM Networks,” IEEE GLOBECOM, pp.852-856, 1990.
ITU-T Recommendation I.363.1, ITU-T, 1993.
ITU-T Recommendation I.363.2, B-ISDN ATM Adaptation Layer Type 2
Specification, ITU-T, 1997.
Jan Höller, “Voice and Telephony Networking over ATM,” Ericsson
Review, No.1, 1998.
M. S. Chambers, H. Kaur, T. G. Lyons, and B. P. Murphy, “Voice over
ATM,” Bell Labs Technical Journal, October-December, 1998, pp.176-190.
D.W. Petr et al., “Efficiency of AAL2 for Voice Transport: Simulation
Comparison with AAL1 and AAL5,” IEEE INFOCOM'99, pp.896-901.
ITU-T Recommendation G.729, Coding of Speech at 8 kbps using
Conjugate-Structure Algebraic-Code-Excited Linear Prediction (CS-
ACELP), ITU-T, November 1995.
H. Saito, “Performance Evaluation of AAL-2 Switch Networks,” IEICE
Trans on Communications, V.E82-B, No.9, September 1999, pp.1411-
1423.
G. Mercankosk, J. E. Siliquini, and Z. L. Budrikis, “Provision of Real-time
Services over ATM using AAL type 2,” Mobicom'99, pp.83-90.
Chapter 12
MPLS Networks
Ali Labed
[Chapter roadmap figure: position of MPLS in the real-time protocol stack — below the packet layer and alongside ATM (AAL1/2, AAL5), frame relay, Ethernet, and cable (DOCSIS) over SONET/TDM, beneath the media, real-time transport (RTP, RTCP, RTSP), and session/gateway control (SIP, H.323, H.248/MGCP/NCS) layers]
Concepts covered
MPLS architecture
Label switching
The label format
MPLS signaling protocols
LSP setup
Integrating MPLS and DiffServ
Label merging
Label stacking
Introduction
Multiprotocol Label Switching (MPLS) represents an overlay network that
operates on top of the existing IP network, but depends greatly on the
underlying IP network for its operation. In conventional IP forwarding,
each hop performs a forwarding table lookup on each packet to determine
the appropriate next hop for the destination address. A longest-prefix match
is applied to perform the lookup; hence, forwarding each packet is
expensive. By using a short, fixed-length label instead, a lookup can be
performed quickly using hardware acceleration.
An MPLS network (see Figure 12-2) is composed of two types of nodes,
Label Edge Routers (LERs) and Label Switch Routers (LSRs). Label Edge
Routers, located at the edge of the MPLS network, are responsible for
classifying traffic. A class of traffic with the same destination that needs to
be treated the same way is called a Forwarding Equivalency Class (FEC).
A label is added to the packet that identifies how the next router should
forward it. Each Label Switch Router (LSR) along the path uses the label
instead of the IP address to make forwarding decisions. Just like the ATM
Virtual Path Identifier / Virtual Channel Identifier (VPI/VCI), the label has
local significance. Each router maintains a table that maps incoming labels
to outgoing labels, to provide the packet with the correct label for the next
router. At the egress edge of the MPLS network, the LER removes the label
and forwards the packet using normal IP.
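The contrast between the two lookups can be sketched in Python: conventional forwarding must find the longest matching prefix, while an LSR does a single exact-match lookup on the label (table contents are hypothetical):

```python
import ipaddress

# Conventional IP forwarding: longest-prefix match over the routing table.
routes = {ipaddress.ip_network("10.0.0.0/8"): "if0",
          ipaddress.ip_network("10.1.0.0/16"): "if1"}

def ip_lookup(dst: str) -> str:
    addr = ipaddress.ip_address(dst)
    matches = [n for n in routes if addr in n]
    return routes[max(matches, key=lambda n: n.prefixlen)]

# MPLS forwarding: one exact-match lookup on the fixed-length label.
lfib = {100: (17, "if1"), 200: (42, "if0")}   # in-label -> (out-label, out-if)

def mpls_lookup(label: int):
    return lfib[label]
```

The exact-match table is what makes hardware acceleration straightforward: the label indexes directly into forwarding state, with no prefix comparison at all.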
[Figure 12-2: An MPLS network — label edge routers LER1 and LER2 at the edge, label switch routers LSR1 through LSR5 in the core, with label switched paths LSP1 and LSP2 crossing the network]
A label distribution protocol is used to distribute labels to MPLS nodes
along the path. Once the labels are in place at each router, the path through
the network is the same for all packets of the same Forwarding
Equivalency Class. Such a path is called a Label Switched Path (LSP).
The label
A label is an identifier used to identify a Forwarding Equivalency Class
(FEC). At the Ingress LER, an arriving packet is augmented (encoded) with
a label that identifies the FEC to which that packet belongs. Usually, the
[Figure: the MPLS control plane (routing protocols such as OSPF-TE and IS-IS-TE) and data plane (the shim header carried over Layer 2 — ATM, Frame Relay, Ethernet — above the optical layer)]
1. ATM and MPLS services can operate on the same ATM switch without interaction. This is sometimes
called Ships in the Night.
[Figure: LSP setup between LER1 and LER2 — RESV messages carry the assigned labels (Lr2, Ls2, Ls1) hop by hop back toward the ingress, with each LSR recording the label it received]
Label merging
Several LSPs can be merged at a common LSR. This takes place by
assigning the same outgoing label and outgoing port to packets arriving on
any of the merged LSPs. Beyond that LSR, the information about which of
the merged LSPs a packet belonged to is lost.
Label stacking
One of the important scalability features of MPLS is the ability to put an
LSP inside another LSP. It is implemented by allowing packets to carry
more than one label at a time. The labels form a stack, and the router
always forwards packets based on the outermost, or “top,” label. The stack
creates a set of nested tunnels. At the tunnel exit, the outermost label is
popped off the stack and the packet proceeds based on the next label,
without further lookup. The protocol permits nesting to an arbitrary depth.
This can be used to create a hierarchical network, where LSPs from lower
levels are encapsulated into an LSP for transit through the backbone at that
level. See Figure 12-6.
[Figure 12-6: Label stacking — LER3 pushes a second label (representing LSP_2) onto packets from LSP_1.1 and LSP_1.2; transit routers forward based on the outer (2nd) label, and the inner LSPs emerge toward LER4 and LER2]
References
RFC 3031, E. Rosen et al., “Multiprotocol Label Switching Architecture,”
IETF, January 2001, http://ietf.org/rfc/rfc3031.txt
RFC 3564, Le Faucheur, W. Lai, “Requirements for Support of
Differentiated Services-aware MPLS Traffic Engineering,” IETF
RFC 3032, “MPLS Label Stack Encoding,” IETF
RFC 2702, MPLS-TE, D. Awduche et al., “Requirements for Traffic
Engineering Over MPLS,” IETF, September 1999, http://www.ietf.org/rfc/
rfc2702.txt
RFC 3034, “Use of Label Switching on Frame Relay Networks
Specification,” IETF
RFC 3272, Internet-TE, D. Awduche et al., “Overview and Principles of
Internet Traffic Engineering,” IETF, May 2002, http://www.ietf.org/rfc/
rfc3272.txt
MPLS-RC, MPLS Resource Center, http://www.mplsrc.com/
Chapter 13
Optical Ethernet
Peter Kealy
[Chapter roadmap figure: position of Ethernet in the real-time protocol stack — alongside ATM, frame relay, and cable (DOCSIS) over SONET/TDM, beneath the media, real-time transport (RTP, RTCP, RTSP), and session/gateway control (SIP, H.323, H.248/MGCP/NCS) layers]
Concepts covered
Optical Ethernet basics
Resilient packet ring
Optical Ethernet services
Introduction
Ethernet is an easy-to-understand technology and is extremely cost-
effective. For these reasons, 98% of local area network (LAN) connections
are now Ethernet based.
Link-layer protection
Ethernet over Resilient Packet Ring takes advantage of SONET's link-layer
protection, providing failover of less than 50 ms in the event of a fiber cut
and maintained ring function when a node is disabled.
RPR technology
RPR packets are mapped into a synchronous payload envelope (SPE)
before being added to the SONET WAN ring. Individual packets are
delineated within the STS payload (SPE). TDM traffic, sharing the ring
with RPR traffic, is mapped into separate STS connections. Each RPR
within the OC-N SONET ring is provisionable in sizes of STS-1
(51 Mbps), STS-3c (155 Mbps), and STS-12c (622 Mbps).
Layer 1 SONET protection is disabled for the STS-Nc RPR, allowing both
directions around the ring to carry unique, independent traffic. This feature
doubles the effective bandwidth available for data traffic, allowing true line
rate such as 100 Mbps on an STS-1 RPR or 1 Gbps on an STS-12c RPR.
Layer 2 RPR protection techniques provide carrier-grade transport
protection of less than 50 ms. Spatial reuse transports data packets on the
ring in the shortest direction and discards packets at the destination node,
utilizing only the bandwidth in the segment between the source and
destination nodes. This feature increases the effective bandwidth available
on the ring, allowing service providers to oversubscribe the STS-Nc RPR
by several times.
Point-to-point, multicast, and broadcast applications are supported
efficiently through the connectionless architecture of all modules
provisioned with access to a common STS-Nc RPR. Data is transported
around the ring in a drop-and-continue technique, avoiding multiple copies
of the same packet consuming ring bandwidth. Automatic topology
discovery updates all nodes when a new node is added to the ring. Class of
Service queuing supports 802.1p prioritization, enabling service providers
to support time-sensitive data applications, such as Voice over IP and
streaming video, along with best-effort Internet or e-mail data
applications.
Transparent domains
802.1Q Q-tagged VLANs are used in customer Enterprise networks to
virtually separate LANs (4096 VLAN IDs are supported in the L2 header).
Service providers support multiple Enterprise customers who may use the
same VLAN identifiers. RPRs must be able to separate customer traffic
even if the customers utilize the same VLAN identifiers. Transparent
Domain identifiers (TDIs) provide this mapping to maintain privacy as
well as transparency. See Figure 13-4.
RPRs support two kinds of traffic management, Transparent and Mapped:
Transparent–All frames are mapped to a TDI regardless of VLAN
on ingress, and are switched transparently through the RPR
regardless of whether the received frames have 802.1Q VLAN tags
or not. End-customer VLAN tags are carried unaffected through the
network and will exit the network the same way. Because Packet
Edge performs packet encapsulation, each TD is capable of
supporting 4096 unique VLAN IDs.
Mapped–Frames are filtered based on VLAN on ingress and can be
retagged at RPR egress. When configured in “Mapped” mode, the
carrier has the flexibility to configure a mapping of end-customer
802.1Q VLAN tags. This provides options to filter on incoming
end-customer VLANs. That is, only certain packets may be mapped
to a specific VPN, depending on the incoming VLAN tag. As well,
Packet Edge provides the ability to re-tag a customer packet as it
exits the network. For example, a particular customer’s Q-tagged
traffic is received within its defined TD (for example, Q-TAG 6
within TD 451). Leaving the network, the service provider can
retag for the end-customer (for example, the packet now leaves with
Q-TAG 12 within TD 451).
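In mapped mode, the egress behavior reduces to a lookup keyed on the transparent domain and the customer Q-tag. A hypothetical sketch, using the example above (Q-TAG 6 within TD 451 retagged to Q-TAG 12):

```python
# Hypothetical mapped-mode table: (transparent domain, customer Q-tag) -> egress Q-tag
retag_map = {(451, 6): 12}

def egress_frame(td: int, qtag: int):
    """Retag a customer frame at RPR egress; unmapped tags are filtered out."""
    out = retag_map.get((td, qtag))
    return ("forward", out) if out is not None else ("filter", None)
```

Because the TD is part of the key, two customers using the same Q-tag never collide: the same tag value maps independently within each transparent domain.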
[Figure 13-4: Service provider transparent domains — Customer 1 and Customer 2 Ethernet LANs (each using VLANs 1-4096) are kept separate across RPR 1, RPR 2, and RPR 3 in the service provider network]
LAN extension
LAN extension provides seamless, end-to-end connectivity for an
Enterprise to extend their LAN between metro areas. The Enterprise's
traffic is collected in the metropolitan area network (MAN), and the
network's Optical Ethernet functionality is extended across a high-
performance long-haul network, which is completely transparent to the
Enterprise user. Even though the service now spans a much longer distance,
it still functions as a LAN for the Enterprise.
Chapter 14
Network Access: Wireless, DSL,
Cable
Rob Dalgleish
Dave Anderson
Robert Cirillo, Jr.
Peter Chapman
[Chapter roadmap figure: position of the access network in the real-time protocol stack — beneath the media, real-time transport (RTP, RTCP, RTSP), and session/gateway control (SIP, H.323, H.248/MGCP/NCS) layers, over SONET/TDM]
Concepts covered
Access mechanisms and their characteristics
An overview of cellular coding techniques
Nomadicity and mobility
Comparison of wireless access technologies
Introduction
The process of providing a high-bandwidth physical layer connection over
distance, through physical channels whose properties are time-varying
and nondeterministic, is an age-old problem.
Pushing the limits of maximum bandwidth while minimizing cost
inevitably results in compromised quality in the form of random loss of
information. Information loss can be detected and corrected by
retransmission at Layer 2 or above, but this trades one type of problem for
another. Error detection and retransmission cause random variation of
transmission delay and can degrade the performance of real-time services.
This chapter looks at the transmission characteristics of wireless and
wireline physical media and how the different ways of implementing them
can affect the end to end performance of real-time services. Table 14-1
provides a comparison of access technologies.
on one path for short periods, when the other is too weak to use. For added
reliability, some models use time varying delay characteristics.
1. Erlang B is a trunk sizing tool for voice switch to voice switch traffic.
Low bit-rate codecs also achieve their lower bit rates by using more
complex algorithms that make certain assumptions, such as those about the
media and the packet loss rate. Other codecs may not make those same
assumptions. When a user with a low bit-rate codec talks to a user with
another codec, additional distortion is introduced by each transcoding.
Having to deal with a lossy channel, such as an RF interface, impacts Real-
Time performance, both directly in terms of the packet loss rate and
indirectly in the techniques used to mitigate packet loss. Packet loss
concealment algorithms all require a buffer of packets to allow
interpolation to occur. If the RF interface is designed to accommodate re-
transmission on error, additional buffering will occur, at the cost of real-
time performance.
Because of the impact of delay on voice quality (see Chapter 3), this must
be accommodated in an overall network design to ensure a given level of
voice performance. Low bit-rate codecs require larger packet buffers and
more processing and so they introduce larger delays. Buffering an RF
channel to accommodate retransmission adds delay, and more complex
packet loss concealment algorithms mean more delay. As an example of
how significant these delays can become before accounting for the impact
of transcoding or long distance transmission, the accommodations made
for minimizing bandwidth and accommodating packet loss on a cellular
connection can easily result in a one way delay of 100 ms.
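A back-of-the-envelope delay budget shows how quickly the components add up; the individual values below are illustrative assumptions, not measurements:

```python
# Illustrative one-way delay budget for a cellular VoIP connection.
# Each figure is an assumed, rounded contribution in milliseconds.
budget_ms = {
    "codec frame + look-ahead": 25,                 # low bit-rate codec processing
    "packetization buffer": 20,
    "air-interface interleaving/retransmission": 40,
    "jitter buffer / loss concealment": 20,
}
one_way_ms = sum(budget_ms.values())  # exceeds 100 ms before any transport delay
```

Even with these modest assumptions, the budget is spent before transcoding or long-distance transmission adds anything, which is why end-to-end design must account for every stage.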
Early cellular networks were built around a central entity (the Base Station
Controller) that managed the fine details of radio resource management
and switching calls between cell sites. This approach is still effective for
voice networks with dedicated narrowband bearers, and is well suited to
the requirements of low, consistent delay and always-on connectivity
during the call. Separate frequencies are used to separate uplink and
downlink traffic and to allow continuous and simultaneous two-way
connections.
The need for ubiquitous coverage and always-on connectivity at reasonable
cost per user, requires that the coverage per cell be maximized, in order to
minimize the number of cell sites required. This encourages very
sophisticated signal processing at the base station radio and user terminal,
to stretch radio capacity and coverage as far as possible. A side effect of
this increased processing power is the increased ‘user perceived’ delay in
the system. The signal processor at each end must store each frame,
implement the encoding and error correction processes (interleaving,
convolutional coding and baseband equalization), and forward the signal.
First and second generation wireless networks have no direct connection
with a packet network. Voice communications with the Public Switched
Telephone Network (PSTN) are achieved by transcoding the wireless codec
used for the RF portion of the path to G.711, which is used in all of the
switching equipment in the PSTN. Normal PSTN switching is used to
deliver the voice data stream to the appropriate central office switch that
connects it to the wireline phone connected to that switch.
6. R1, R2 and ISUP are examples of trunk signaling protocols in use in the public switched telephone
network
7. Code Division Multiple Access
8. Wideband Code Division Multiple Access (third generation)
9. Unique code issued by the base station to separate the signal from each user.
In soft hand off, the mobile communicates with several base stations at a
time. The list of base stations changes as the mobile moves, creating a
seamless hand off. Regardless of the mechanism and technology, cellular
systems are well optimized to support hand off of real-time applications,
given their origins in circuit-switched voice.
[Figure: 2G circuit-switched architecture. The Base Station Controller
(vocoder) connects through the circuit core to the Mobile Switching
Centre, which carries voice to the PSTN/PBX; the Home Location Register
provides SS7 signaling, with a Home Agent and RADIUS AAA server for
packet data.]
transmitted to maintain very low error rates without the need for
retransmission.
Bandwidth allocation in the time domain is also a challenge for wireless
networks. For example, IS-2000 CDMA 1xRTT allocates a 9.6 kbps
fundamental channel to a user once the network determines that data will
be transmitted. The packet control function (PCF) in the base station
controller maintains a per-user buffer. Thresholds are monitored, and once
the active threshold is passed, 9.6 kbps of radio resources are assigned.
The user retains these resources and the air interface link while data is
being actively transmitted. Once the user's activity slows, PCF timers
determine that the user has gone idle and revoke the 9.6 kbps channel.
A supplemental channel (SCH) is applied when a user is active with a
fundamental channel (FCH) and the PCF buffer indicates that more
bandwidth is required. The resource manager can then assign supplemental
resources and radio link capacity to support up to 153.6 kbps. This
supplemental resource is managed on behalf of all subscribers in a sector,
and it is a shared resource. The result is that if many simultaneous users
require high bandwidth in addition to their fundamental channel, the
supplemental channel resource is shared. This means that the bandwidth
can be granted and then after some time, taken away from a particular user.
This centralized management of bandwidth resources is unique to public
cellular systems and attempts to trade off end user performance with
overall network efficiency. This mechanism works relatively well for
bursty services, such as browsing, e-mail and instant messaging, but does
not lend itself well to real-time services.
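The threshold-driven allocation described above can be sketched as a toy model. The buffer thresholds, idle timeout and trigger logic below are illustrative assumptions, not values from the IS-2000 specification; only the 9.6 and 153.6 kbps rates come from the text:

```python
# Toy model of 1xRTT channel assignment by the packet control function (PCF).
# Threshold values and the idle timeout are illustrative assumptions.

FCH_KBPS = 9.6        # fundamental channel rate
SCH_MAX_KBPS = 153.6  # maximum rate with supplemental channel resources

class PacketControlFunction:
    def __init__(self, active_threshold=512, sch_threshold=4096, idle_timeout=5.0):
        self.active_threshold = active_threshold  # bytes queued before the FCH is assigned
        self.sch_threshold = sch_threshold        # bytes queued before the SCH is requested
        self.idle_timeout = idle_timeout          # seconds of inactivity before revocation

    def assign_rate(self, buffered_bytes, idle_seconds, sch_available):
        """Return the rate (kbps) granted to a user given its buffer state."""
        if idle_seconds >= self.idle_timeout:
            return 0.0            # PCF timers decide the user has gone idle
        if buffered_bytes < self.active_threshold:
            return 0.0            # not enough queued data to justify a channel
        if buffered_bytes >= self.sch_threshold and sch_available:
            return SCH_MAX_KBPS   # burst on the shared supplemental channel
        return FCH_KBPS           # dedicated fundamental channel

pcf = PacketControlFunction()
print(pcf.assign_rate(8000, 0.0, True))    # 153.6: heavy backlog, SCH free
print(pcf.assign_rate(8000, 0.0, False))   # 9.6: SCH already shared out
print(pcf.assign_rate(8000, 6.0, True))    # 0.0: idle timer has fired
```

The third case shows the trade-off described in the text: a user can be granted the shared supplemental resource and later lose it, which suits bursty traffic but not real-time flows.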
3G networks
Full 3G cellular technologies, such as UMTS, are designed to address
these real-time application challenges. The UMTS 3GPP standard has air
interface services defined for voice and for data, but UMTS further divides
data into packet data services and circuit data services: more specifically,
PS (packet switched, unspecified delay) and CS (circuit switched, low
delay) services, each available in 64, 144 and 384 kbps allotments.
WWANs (Wireless Wide Area Networks) are evolving further by
implementing very fast downlink scheduling and time multiplexing of
high-rate data bursts, a few milliseconds long, to users when it is known
that their channel conditions are
optimal. This requires continuous monitoring and reporting of the channel
conditions seen by every user, to allow the central radio controller to
determine who should get the next burst, after also considering how much
data each user needs to send and their relative priority. Technologies such
as 1xEV-DO and 1xEV-DV19 are built on CDMA2000*, and High Speed
Downlink Packet Access (HSDPA) is a Release 5 addition to the WCDMA
standard. These techniques are capable of achieving up to ten times higher
throughput per user, and three times higher capacity per sector, under
ideal conditions in which all users require continuous throughput and the
users with the best channel conditions get the most data. If user priority
or fairness policies are enforced to divert capacity to users in suboptimal
conditions, or there are a very large number of active sessions with low
utilization, then this bearer type is less effective.
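The burst-scheduling decision described above is commonly realized as a proportional-fair scheduler. The sketch below is a simplified illustration; the exact metric, the priority weighting and the smoothing constant are assumptions, not the scheduler of any particular product:

```python
def pick_next_user(users, smoothing=0.1):
    """Choose the user for the next downlink burst by a proportional-fair
    metric: priority * feasible_rate / smoothed_average_throughput.

    `users` maps an id to a dict with 'rate' (kbps the current channel
    conditions would support), 'avg' (smoothed throughput received so far),
    'queued' (bytes waiting) and 'priority' (relative weight).
    """
    eligible = {u: s for u, s in users.items() if s['queued'] > 0}
    if not eligible:
        return None
    best = max(eligible, key=lambda u: eligible[u]['priority'] *
               eligible[u]['rate'] / max(eligible[u]['avg'], 1e-9))
    # Update the smoothed averages: the winner's rises, the others' decay,
    # so a user starved for a while sees its metric climb.
    for u, s in users.items():
        served = s['rate'] if u == best else 0.0
        s['avg'] = (1 - smoothing) * s['avg'] + smoothing * served
    return best

users = {
    'a': {'rate': 2400.0, 'avg': 300.0, 'queued': 9000, 'priority': 1.0},
    'b': {'rate':  600.0, 'avg': 300.0, 'queued': 9000, 'priority': 1.0},
}
print(pick_next_user(users))  # 'a': the better channel wins the first burst
```

Calling the scheduler repeatedly shows the behavior the text describes: the user with good channel conditions gets most bursts, but as its smoothed average rises, the fairness term eventually diverts capacity to the user in suboptimal conditions.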
OFDM (Orthogonal Frequency Division Multiplexing) has the potential to
further improve the spectral efficiency of WWANs. By avoiding the need
for tight power control, OFDM can be more suited to data traffic with long
sessions, but short activity bursts. This is proposed for 3GPP Release 6.
19. Single carrier (1x, 1.25 MHz) Radio Transmission Technology Evolution - Data Only and Data &
Voice
tuned to work with this, it is clear that mobile IP is not inherently designed
for real-time hand offs.
A perfect example of a nomadic wireless network is 802.11 wireless LAN.
Due to the power constraints imposed on license exempt spectrum, the
coverage area of a single access point is small, roughly 50 m. Given this, a
cluster of access points will cover an area, but contiguous outdoor coverage
of wireless LAN is impractical. For this reason, wireless LAN users
typically connect at each locale as needed, and when finished, pack up and
move to another and reconnect. Seamless hand off, either in idle or traffic
mode, may occur within a cluster of access points but not between clusters.
As a result, wireless LAN users are considered to be nomadic, reattaching,
authenticating, establishing a security association and potentially a billing
association, at each use. This makes real-time services
difficult for a few reasons. First, there is no consistent Layer 3 address at
which the user can be reached across all connections. Second, there is no
certainty of bandwidth or QoS at each attach point, since it is a best effort
network.
Conclusion
Traditional circuit switched cellular networks designed for voice are
optimized for real-time services, including mechanisms for hand off,
billing and user authentication. IP-centric wireless solutions, such as
wireless LAN, present the same challenges to real-time applications as
wired IP networks do, with the additional requirement to support
authentication and billing for nomadic users. 2.5G and 3G cellular
networks have designed solutions to offer the performance of 2G networks
for real-time services, while providing IP evolution in the access and core
network components, offering data services and taking advantage of IP
network economics.
xDSL technology
Name   Description         Downstream   Upstream     Max. Loop    Notes
                           Speed        Speed        Length
IDSL   ISDN DSL            144 kbps     144 kbps     18,000 ft.   Symmetric access. Two
                                                                  wire (one pair)
                                                                  operation. ISDN BRI.
HDSL   High bit rate DSL   1.544 Mbps   1.544 Mbps   15,000 ft.   Symmetric access.
                                                                  Requires four wire
                                                                  (two pair) operation.
[Figure: ADSL reference architecture. In the central office, the local
telephone switch (access to the PSTN) and the DSLAM (DSL Access
Multiplexer, access to the Internet) connect through a POTS splitter and
the Main Distribution Frame to a single copper loop; at the customer
premises, a POTS splitter separates telephony from the house wiring to
the ADSL modem.]
ADSL modem
The ADSL modem is the customer premises equipment used to terminate
the DSL connection. Physically, it features an RJ-11 socket for connection
to the local loop, and an RJ-45 socket for the 10Base-T connection to the
subscriber’s personal computer (PC). For subscriber convenience, it may
also feature a second RJ-11 socket for POTS connection to a standard
telephone set. Since ADSL uses ATM to carry traffic at Layer 2, the
modem is also responsible for encapsulating the IP packets into ATM
AAL5 cells. Depending on the type of connection, the modem can be
configured to use an RFC 1483 (multiprotocol encapsulation over AAL5)
bridged or routed connection, or the more versatile RFC 2364/RFC 2516
combination (PPP over AAL5/PPP over Ethernet) to carry traffic into the core
network. As a CPE (customer premises equipment) device, a network
compatible ADSL modem can either be purchased by the subscriber or
obtained directly from the service provider as a purchase or rental.
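This encapsulation chain has a measurable cost: the IP packet picks up framing overhead and is then padded out to whole 48-byte ATM cell payloads. A rough calculation follows; the 32-byte figure assumes PPPoE over bridged Ethernet over AAL5 with LLC/SNAP, and would differ for other encapsulation variants:

```python
import math

AAL5_TRAILER = 8   # CPCS-PDU trailer (length, CRC-32, etc.)
CELL_PAYLOAD = 48
CELL_SIZE = 53     # 5-byte ATM cell header + 48-byte payload

def atm_wire_bytes(ip_len, encap_overhead):
    """Bytes on the ATM wire for one IP packet of `ip_len` bytes, given
    `encap_overhead` bytes of framing added before the AAL5 trailer."""
    pdu = ip_len + encap_overhead + AAL5_TRAILER
    cells = math.ceil(pdu / CELL_PAYLOAD)   # padded up to whole cells
    return cells * CELL_SIZE

# Illustrative per-packet overhead for PPPoE over bridged Ethernet over AAL5:
# 2 (PPP) + 6 (PPPoE) + 14 (Ethernet) + 10 (RFC 1483 LLC/SNAP bridged) = 32.
PPPOE_OVERHEAD = 32

for ip_len in (40, 576, 1500):
    wire = atm_wire_bytes(ip_len, PPPOE_OVERHEAD)
    print(ip_len, wire, f"{100 * (wire - ip_len) / wire:.0f}% overhead")
# 40 -> 106 bytes (62% overhead), 576 -> 689 (16%), 1500 -> 1749 (14%)
```

The small-packet case matters for real-time traffic: a 40-byte packet occupies two full cells, so more than half the wire bytes are overhead.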
DSLAM
The DSLAM (DSL Access Multiplexer) provides backhaul services for
packet, cell and circuit-based applications, through concentration of the
DSL lines onto 10Base-T,
100Base-T, T1/E1, T3/E3 or ATM outputs. For ADSL, the connection is
typically ATM cell-based from the subscriber modem through the network
core. DSLAMs are found on the edge of the provider network, located in
either the central office or a remote cabinet. Physically, DSLAM sizes
range from large, multicard shelves (five to seventeen U spaces) to small
1U boxes found in remote access cabinets.
Subscriber aggregation
With ATM being a point-to-point, connection-oriented link, there exists the
requirement to terminate the subscriber endpoint and reassemble the ATM
cells into IP data packets. While it is possible for an ATM-enabled core
router to terminate these ATM VPI/VCI connections, as a practical matter,
this function is usually performed by a dedicated subscriber aggregation
platform.
[Figure: subscriber aggregation. ADSL subscribers connect through the
copper loop to a DSLAM; per-subscriber ATM VCCs are carried over
ATM/Frame links (DS-1/DS-3/OC-3) to a DSL aggregation platform, which
terminates the PPPoE/ATM sessions and routes IP traffic onward to ISPs,
the Internet, or a private network, with a RADIUS server for
authentication.]
Cable network
The cable network was originally designed for one way communication of
broadcast information, primarily television signals for the purpose of
information and entertainment. In this context, broadcast traffic means that
the same material is sent to every user, usually residences connected to the
cable.
The technology was designed to provide a large number of television
channels in one direction, from the cable operator to the residence. In
addition to carrying television signals, it usually also carries sound signals.
With the advent of the Internet, and to enable cable service providers to
offer services above and beyond broadcast mode entertainment, the
requirements on the cable technology and infrastructure significantly
changed. The requirement is now to provide all of the foreseeable two way
communication services to and from residences and in some cases,
businesses.
21. Founded in 1988 by members of the cable television industry, Cable Television Laboratories, Inc.
(CableLabs*) is a nonprofit research and development consortium dedicated to pursuing technical
advancements that serve its members' business objectives.
Rights management
In order to offer different selections of program channels to different users,
and to be able to offer users individual programs on a billable basis, known
as pay-per-view, it is necessary to protect the program material and to
provide access to it for individual users. In addition, because it is not
always possible to control physical access to the cable, it is necessary to
provide some form of control of access. This is done by applying high-pass
filters to the tapped circuits of the primary distribution circuit with analog
scrambling techniques. With the advent of digital transmission, encryption
is enabled with digital keys.
[Figure: cable plant. The head end, containing the CMTS, feeds a
bi-directional hybrid fibre/coax network; bridger amplifiers, line
extenders and taps distribute the signal to residences.]
Downstream technology
Typically, a few hundred users can share a 6 MHz downstream channel and
one or more upstream channels. The downstream digital modulation
system is the same as that used for digital television, using 64 or 256 state
QAM (Quadrature Amplitude Modulation), and it can provide up to 40
Mbps. Information for each user is separated by time division
multiplexing, usually referred to as TDM. Because the signals for all users
are generated by the CMTS (Cable Modem Termination System), they are
all synchronized in relation to each other, so separation of each user’s
traffic is reliable.
Upstream technology
DOCSIS 1.0 and 1.1. In DOCSIS versions 1.0 and 1.1, the upstream
channels can be up to 3.2 MHz wide and can deliver up to 5.12 Mbps per
channel (DOCSIS 1.0). Because a number of users share the upstream RF
channel, a media access control (MAC) layer coordinates shared access to
the upstream bandwidth.
PacketCable
The means of using the cable infrastructure to carry traditional voice traffic
has been facilitated by means of PacketCable. PacketCable 1.0, introduced
in 1999, provided baseline voice capabilities. It did not provide certain
essential services, primarily 911 service and operation during a loss of
consumer power, and was aimed at the residential second line market.
[Figure: PacketCable network. DOCSIS HFC access networks at each end
of a managed IP network containing routers and OSS servers, with a
gateway to the PSTN.]
Cable modem
The cable modem converts the digital signal from the user equipment, into
a signal suitable for sending in both directions on the hybrid fiber coax
network. This is multiplexed onto the HFC network, together with the
broadcast signals.
Access network
This is the network that connects the user to the cable service provider
(MSO). It can be regarded as consisting of three primary components:
Hybrid fiber/coax (HFC) access network–PacketCable-based
services are carried over the hybrid fiber/coax (HFC) access
Media server
The Media Gateway Controller is the logical signaling management
component used to control PSTN media gateways.
Network announcement player–This device holds all the verbal
messages that need to be generated for the telephone and other
voice related services.
PSTN gateway
PacketCable allows MTAs to interoperate with the current PSTN through
the use of PSTN gateways. In order to enable operators to minimize cost
and optimize their PSTN interconnection arrangements, the PSTN gateway
is decomposed into three functional components, a controller (media
gateway controller), one gateway for the bearer path, and a second gateway
for the signaling path.
Media gateway controller (MGC)–The MGC maintains the call
state and controls the overall behavior of the PSTN gateway. It
terminates and generates the call signaling from and to the
PacketCable side of the network.
Media gateway (MG)–The MG terminates the bearer paths and
transcodes media between the PSTN and IP network.
Signaling gateway (SG)–The SG provides a signaling
interconnection function between the PSTN SS7 signaling network
and the IP network.
QoS
Quality of Service has already been mentioned. The Internet protocol suite
specifies a number of delivery mechanisms; the most common are TCP
(Transmission Control Protocol) and UDP (User Datagram Protocol).
These mechanisms do not, in themselves, provide either QoS or guaranteed
delivery, although TCP provides for acknowledgement and retransmission
of lost packets.
PacketCable carried over DOCSIS 1.1 provides means for ensuring that
packets are delivered in such a way as to guarantee delivery and sequencing.
The CMTS has ultimate control of the QoS mechanisms. Clients make
requests to the CMTS, but it is only the CMTS, or a policy server
controlling the allocation made by the CMTS that has the authority to grant
or deny those requests.
PacketCable security
The standard defines methods for protection of information generated by
the user, protection of confidential information, such as passwords, and for
protection of copyrighted material, made available by the service provider.
The security architecture provides facilities to easily detect and identify
attempted breaches.
Best-effort
A standard contention-based resource management strategy, in which
transmit opportunities are granted in the order in which requests are
received by the CMTS, as coordinated by the CMTS scheduler. This
scheduling type may be supplemented with QoS characteristics in which,
for example, maximum rate limits are applied to a particular service flow.
Non–real-time polling
A reservation-based resource management strategy in which the cable
modem is polled at fixed time intervals. The interval is sufficiently large
to favor utilization efficiency at the cost of real-time performance. When
queued traffic is identified on a
particular service flow, a transmission opportunity, or grant, for that service
flow, is provided by the scheduler.
Real-time polling
Real-time polling is analogous to the non–real-time polling scheduling
type, except that the fixed polling interval is typically very short (<500 ms).
Polling scheduling types are most suitable for variable bit rate traffic that
has inflexible delay and throughput requirements. Video streaming is
typical of this type of traffic.
Unsolicited grant
A reservation-based resource management strategy in which a fixed-size
grant is provided to a particular service flow at approximately fixed
intervals. This scheduling type is most suitable for constant bit rate traffic
and eliminates much of the protocol overhead associated with the polling
types. This is suitable for voice traffic.
Downstream Service Flows are defined using the same set of QoS
parameters that are associated with the best-effort scheduling type on the
upstream.
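The fit between unsolicited grants and constant-bit-rate voice can be shown with a back-of-the-envelope sizing for G.711 with 20 ms packetization. The RTP/UDP/IP header sizes are standard; the Layer 2 overhead figure is an assumption, since it varies with DOCSIS configuration:

```python
G711_RATE_BPS = 64_000
PACKET_INTERVAL_S = 0.020       # 20 ms packetization
RTP_UDP_IP = 12 + 8 + 20        # per-packet RTP + UDP + IPv4 header bytes
LAYER2_OVERHEAD = 18 + 6        # Ethernet framing + DOCSIS MAC header (assumed)

payload = round(G711_RATE_BPS * PACKET_INTERVAL_S / 8)  # voice bytes per packet
grant_size = payload + RTP_UDP_IP + LAYER2_OVERHEAD     # bytes per UGS grant
grants_per_second = round(1 / PACKET_INTERVAL_S)        # one grant per voice packet
upstream_bps = grant_size * 8 * grants_per_second

print(payload, grant_size, upstream_bps)  # 160 224 89600
```

Because every packet is the same size and arrives on a fixed clock, the CMTS can issue the 224-byte grant every 20 ms without any polling or contention, which is exactly the protocol overhead the unsolicited grant type eliminates.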
References
Cable Television Laboratories, Inc., http://www.cablelabs.org
Chapter 15
The Future Internet Protocol: IPv6
Elwyn B. Davies
[Figure: chapter perspective. Real-time voice, audio and video
applications run over RTP/RTCP with session and gateway control (SIP,
H.323, RTSP, H.248/MGCP/NCS) and codecs; QoS and resiliency span a
transport stack of MPLS, AAL1/2 and AAL5 over ATM, Frame Relay,
Ethernet and Packet Cable DOCSIS, on SONET/TDM.]
Concepts covered
IPv6 Basics
QoS in IPv6
IPSec in IPv6
Routing in IPv6
Network Control in IPv6
Application Programming Interfaces in IPv6
IPv6 Transition Strategies
Tunneling
Interworking between IPv4 and IPv6
2^96 (roughly 7.9 x 10^28) times the size of the IPv4 address space (2^32)
In other words, the total number of available IPv6 addresses is
340,282,366,920,938,463,463,374,607,431,768,211,456
Pessimistic estimate:
1,564 addresses per square meter of the Earth’s surface.
Optimistic estimate:
3,911,873,538,269,506,102 addresses per square meter
Maybe enough addresses to assign one to each grain of sand
on the planet!
Even the US Department of Defense should be happy with this!
(The networked battlefield initiative needs many addresses)
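The headline figures are easy to check with integer arithmetic. (The pessimistic and optimistic per-square-meter estimates above additionally assume particular address-allocation efficiencies, so they are not reproduced by naive division.)

```python
ipv6_total = 2 ** 128
ipv4_total = 2 ** 32

print(ipv6_total)                # 340282366920938463463374607431768211456
print(ipv6_total // ipv4_total)  # 2**96, roughly 7.9e28 times the IPv4 space

EARTH_SURFACE_M2 = 5.1e14        # approximate surface area, land and sea
print(ipv6_total / EARTH_SURFACE_M2)  # ~6.7e23 addresses per square meter
```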
Routing and filtering of the 128 bit addresses of IPv6 requires significantly
greater processing than the smaller IPv4 addresses, and the full header
structure of IPv6 is more complex than that of IPv4. On the other hand, no
Layer 3 checksum calculation is required and QoS routing may be quicker
due to the flow label.
IPv6 has some improvements that should make basic management,
especially in Enterprise networks, considerably easier. But this is offset by
the need to manage networks that can transmit both IPv4 and IPv6, as well
as the transition and migration mechanisms that will be used during the
transition period.
During the transition from IPv4 to IPv6, significant amounts of traffic will
either be traveling through tunnels or will be using a protocol translator. In
both cases, the extra processing may add significantly to the network
latency and jitter experienced by packets.
The remainder of this chapter describes the basic points of IPv6 technology
and transition mechanisms sufficient to show how they can affect the
performance of a real-time network. As already mentioned, IPv6 affects a
great deal more than just the network layer. Refer to Appendix D for more
detailed information on IPv6.
[Figure: IPv4 and IPv6 header layouts compared, showing fields modified
for IPv6, fields deleted from IPv4, and extension headers preceding the
data.]
IPv4                                         IPv6

Four octet (32 bit) addresses                Sixteen octet (128 bit) addresses

QoS specification in Service Type octet:     QoS specification in Flow Label and
  Originally: Type of Service (five            Traffic Class octet:
  bits), Priority (three bits)                 Originally: Traffic Class (never used)
  Now: DiffServ Code Point (six bits),         Now (as IPv4): DiffServ Code Point
  Explicit Congestion Notification               (six bits), Explicit Congestion
  (two bits)                                     Notification (two bits)

Protocol field identifies type of            Next Header field in main header and
  Packet Data Unit (PDU)                       each extension identifies next
                                               component
No IP Layer Checksum
The checksum found in IPv4 headers has not been carried over to IPv6.
The logic behind this is that almost all the risk of corruption in packets
comes from the transmission between nodes. All Layer 2 technologies in
common use provide a checksum that can detect corruption during
transmission on each separate link; IP transport protocols provide a
checksum that can detect corruption of the payload end-to-end. The
considerable extra processing load needed to calculate the checksum
initially—verify it at each node, modify it as the hop count is decremented
and recalculate it at each waypoint when a routing header is included—is
not justified and could be considered an added risk for corruption.
To ensure that a transport layer checksum is provided in all cases, the
specification of UDP has been slightly modified to force the use of the
checksum when UDP is carried over IPv6. UDP checksums are optional
in IPv4 networks, on the grounds that there is a Layer 3 checksum.
Fragmentation
IPv6 routers are no longer expected to fragment packets if the Maximum
Transmission Unit (MTU) of the next link is too small for the size of
packet. Instead, all IPv6 nodes and links are required to handle packets up
to 1280 octets long. They may be constructed to handle bigger packets, but
must guarantee to provide this minimum value of the MTU.
Hosts may still have to fragment large packets for transmission and
reassemble them on receipt. Routers have to handle the resulting
fragments, but will drop any that exceed the available MTU and report the
error back to the source with an Internet Control Message Protocol (ICMP)
‘Packet Too Big’ error message.
A host can either decide to work with the minimum guaranteed MTU, for a
message or a session, or use Path MTU Discovery (PMTUD) to find the
minimum MTU of the set of links that a packet will traverse in reaching a
destination. Traditionally, PMTUD has been carried
out by starting transmission with large packets and reducing the size used if
‘Packet Too Big’ errors are received as described in Path MTU Discovery
for IPv6 [RFC1981]. This approach has a number of drawbacks. For
example, network resources and time are wasted transmitting the overly
large packets and the responses. Also, many systems no longer forward
ICMP messages to avoid some denial of service attacks so that the error
responses will never be received. The pmtud working group in the IETF is
currently (mid 2004) designing an improved ‘packetization layer’ system
for PMTUD that will address these concerns. The new scheme starts with
small packet lengths and increases the size used until a packet fails to make
it through the network. See Path MTU discovery draft
[I-D.ietf-pmtud-method].
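The upward-searching scheme can be sketched in a few lines. The probe callback, the search bounds and the binary-search strategy are illustrative assumptions; the actual draft specifies its own probing schedule:

```python
def plpmtud(probe_ok, low=1280, high=9000):
    """Packetization-layer PMTUD sketch: search upward from the guaranteed
    IPv6 minimum MTU, relying only on whether probe packets get through
    (`probe_ok` is a hypothetical stand-in for sending a probe and seeing
    it acknowledged), never on ICMP errors arriving.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if probe_ok(mid):
            low = mid        # probe delivered: the path MTU is at least mid
        else:
            high = mid - 1   # probe lost: the path MTU is smaller
    return low

# A path whose narrowest link has an assumed 4352-byte MTU:
print(plpmtud(lambda size: size <= 4352))  # 4352
```

Note that a lost probe here costs only the probe itself; no oversized application data is sent, and no ICMP error needs to make it back to the source.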
There is a trade-off between the time taken to determine the actual MTU
and the overhead incurred in fragmenting some or all of the packets to be
sent to a destination. A management interface is provided to control
whether PMTUD is used for particular applications and paths. The
application designer needs to consider whether it is possible or desirable to
limit messages to the minimum MTU. If only a very few messages are
likely to exceed this value, it may be worth living with a little
fragmentation; however, for long streams of messages over the minimum
MTU, PMTUD may offer significant value if the path can offer a larger
MTU. Designers also need to be aware that the path MTU may change
during a session if the path is rerouted. Network designers should consider
how the MTU of a network will affect the applications that will run over it.
Extension headers
IPv4 packets allow a fixed maximum amount of space for optional fields
and the majority of routers handle IPv4 options reluctantly, at best. Most
IPv4 packets carrying options are classed as special cases and diverted
from the hardware supported ‘fast path’ in larger routers. The processing of
packets with options in the ‘slow path’ carries a large performance penalty;
as a result, IPv4 options are little used.
By contrast, IPv6 is designed with an extensible header system that is
intended to be flexible and future proof. Several of the capabilities that are
[Figure: fragmentation. IPv6 sources add a fragment header to each part
so that the destination can reconstruct the PDU.]
Extension Header  Functions              Comments

Hop-by-Hop        (Options that need to be processed by every router on path)
                  Jumbo Packet           Allows for packets bigger than 64 KB
                  Router Alert           Flags that the packet needs inspection
                                         by all routers on the path

Routing           Path Spec              Specifies 'waypoints' that the packet
                                         must pass through

Fragment          Packet Fragment        Attached to each portion of a packet
                  Information            that has had to be fragmented, to allow
                                         reassembly. Because the fragment header
                                         is only added when it is needed,
                                         bandwidth is not wasted transmitting
                                         empty fields, as is the case for most
                                         packets in IPv4.

Destination       (Options that need only be examined at the destination)
                  Mobile IP              Information needed to maintain the link
                  Information            between a mobile station and its home
                                         location
                  Tunnel Extension       Limits on the depth of tunnel nesting
                                         for a packet

Security          (IPsec information)
                  Authentication         When a packet is authenticated but not
                                         encrypted
                  Encryption             Encryption specification

Last Header       Dummy                  Used if a packet has no protocol payload
The fields are not affected by encryption, allowing QoS and IPsec
security to work together.
No layer violations are needed (the network layer doesn’t need to
take into account which transport protocol is used or values of
transport parameters).
The basic standards for the use of the flow label have just been completed
at the time of writing [RFC3697]. Applications have not yet been adapted
to take advantage of this new capability, and classifiers using the flow label
field are not generally available at present. The RSVP specification
[RFC2205] and MIB [RFC2206] already allow the flow label to be
included in traffic specifications. Likewise, the DiffServ architecture
[RFC2475] implicitly allows use of the flow label: the MIB [RFC3289]
and PIB [RFC3317] for DiffServ already provide standardized access to
classifiers that use the flow label.
[Figure: an IPv6 packet protected by IPsec. The IPv6 header (version,
class, flow label, payload length, next header, hop limit, source and
destination addresses) is followed by the Security Parameter Index,
sequence number and encryption parameters (e.g. an initialization
vector); the transport header (ports, sequence and acknowledgement
numbers, code bits, window size, checksum, urgent pointer, options) and
padding are encrypted, with optional authentication data at the end.]
used. SAs can be set up manually at each end point; but, this is obviously
not desirable for a network that hopes to make extensive use of IPsec.
One of the main obstacles to widespread deployment of IPsec for end-to-
end communications has been the slow development of an acceptable key
exchange protocol and ubiquitous key distribution infrastructure. The
Internet Key Exchange (IKE) protocol is still under development to replace
two earlier key exchange mechanisms that have not found wide acceptance
[RFC2408], [RFC2409]. The second version of the Internet Key Exchange
protocol (IKEv2) shows considerable promise as a reasonably simple and
robust key distribution mechanism [I-D.ietf-ipsec-ikev2].
IPsec provides two modes of operation:
Tunnel Mode, which has been widely deployed to provide security
protection for Virtual Private Networks and ‘road warriors’ in IPv4-
based networks, is designed for use where hosts are generally IPsec
unaware. The secured tunnel end points are provided by specialized
security gateways or add-on ‘extranet clients’.
Transport Mode is designed for use by IPsec aware hosts and
provides end-to-end security between pairs of such hosts.
The use of Transport Mode IPsec connections to provide end-to-end
security is likely to become much more prevalent in IPv6 networks because of
the requirement that all nodes support IPsec. Apart from the problems of
key distribution, the use of Transport Mode IPsec has been inhibited by the
interactions between IPsec and firewalls, and NATs and other ‘middle
boxes’ that currently need to examine the fields of packets that may be
concealed by IPsec. The introduction of IPv6 should essentially eliminate
NATs, and work is in progress to solve the other problems.
IPsec is not a suitable solution for the security of all real-time data
exchanges. The overhead and the sensitivity to lost packets makes IPsec
less appropriate for media streams and it is likely that IPsec will be
reserved for the protection of signaling protocols. Even if IPsec becomes
commonly available in IPv6 networks, media streams may well make use
of a stream security protocol, such as Secure RTP [RFC3711] and its
associated keying mechanisms, so that lost packets do not result in
termination or loss of synchronization between end points, and encryption
overhead is reduced.
Operating systems typically provide mechanisms to establish an overall
security policy, set up the database of security associations, and to allow
applications, through an Application Programming Interface (API), to
control the use of IPsec within the limits of the overall policy. See
Appendix D for more
details on this subject.
Transport Protocols
IPv6 is primarily a replacement for the IP Network Layer. All the existing
IP transport protocols (including TCP, UDP, SCTP) can be carried as
payloads in an IPv6 datagram just as in IPv4. As mentioned in “Extension
headers” , the protocol number for the transport in use is placed in the Next
Header field of the last IPv6 extension header (or in the basic header if
there are no extensions).
The only modifications that are needed have to do with transport layer
checksums [RFC2460]:
The use of IPv6 affects the pseudo-header, which is used to
calculate the checksum carried in the transport payload and
includes some of the IP header fields.
Because there is no IP layer checksum in IPv6, the use of a UDP
checksum has been made mandatory for IPv6 networks (it was
optional for IPv4).
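These rules can be made concrete with a short sketch of the UDP checksum computed over the IPv6 pseudo-header of [RFC2460]; the example addresses and ports are arbitrary:

```python
import struct

def internet_checksum(data):
    """RFC 1071 one's-complement sum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def udp6_checksum(src, dst, udp_segment):
    """Checksum a UDP segment with the IPv6 pseudo-header prepended:
    source and destination addresses (16 bytes each), the upper-layer
    packet length (32 bits), three zero bytes and next header 17 (UDP)."""
    pseudo = src + dst + struct.pack("!I3xB", len(udp_segment), 17)
    cksum = internet_checksum(pseudo + udp_segment)
    return cksum or 0xFFFF   # a computed zero is transmitted as 0xFFFF

# Arbitrary example: all-zeros source, ::1 destination, empty UDP payload.
src, dst = bytes(16), bytes(15) + b"\x01"
segment = struct.pack("!4H", 1, 2, 8, 0)   # ports 1 and 2, length 8, checksum 0
cksum = udp6_checksum(src, dst, segment)
```

The receiver verifies by summing the pseudo-header and segment including the filled-in checksum field; a valid packet folds to zero, which is why the all-zeros value has to be reserved and sent as 0xFFFF.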
supported by IPv6 assume that most of the link layer networks that will be
used are Ethernet-like, and will work most easily with multiple access link
layers that support broadcast transmission at the link layer. Point-to-point
links are no problem; but, using a multiple access network—such as a
multipoint ATM network or X.25 network, which does not support
broadcast—requires extra work.
ICMPv6 now has a multitude of functions:
Returns error messages to the source if a packet could not be
delivered. Four different error messages are specified in
[RFC2463].
Monitors connectivity through echo requests and responses used by
the ping and traceroute utilities. The Echo Request and Echo
Response messages are specified in [RFC2463].
Finds neighbors (both routers and hosts) connected to the same link
and determines their IP and link layer addresses. These messages
are also used to check the uniqueness of any addresses that an
interface proposes to use through Duplicate Address Detection
(DAD)—DAD can be turned off if the network administrator
believes that the configuration method used is bound to generate
unique addresses. Four messages—Neighbor Solicitation (NS),
Neighbor Advertisement (NA), Router Solicitation (RS) and Router
Advertisement (RA)—are specified in [RFC2461].
Ensures that neighbors remain reachable using the same IP and link
layer address by applying Neighbor Unreachability Discovery
(NUD) and notifies neighbors of changes to link layer addresses.
This function uses NS and NA messages as specified in [RFC2461].
Finds routers and determines how to obtain IP addresses to join the
subnets supported by the routers. This function uses RS and RA
messages as specified in [RFC2461].
Communicates prefixes and other configuration information
(including the link MTU and suggested hop count default) from
routers to hosts if stateless autoconfiguration of hosts is enabled
(see “Autoconfiguration for Hosts” ). This function uses RS and
RA messages as specified in [RFC2461].
Redirects packets to a more appropriate router on the local link for
the destination address or points out that a destination is actually on
the local link even if it is not obvious from the IP address (where a
link supports multiple subnets). This facility could be used by a
malicious sender to divert packets. Nodes should provide
configuration options to prevent the messages being sent by routers
and acted on by hosts. The redirect message is specified in
[RFC2461].
the most trivial of networks. Every interface will have a 'link local' address,
which is required for operational control communications with neighbors
and local routers on the same link but can be used for application traffic
between neighbors on the same link. For unicast communication beyond
the local link, the interface will need at least one global unicast address and
it may be a member of several multicast groups.
Applications have to be adapted to use the address possibilities correctly: a
prospective communication partner that has several addresses may only be
reachable on some of them (for example, because of a network failure) and
the application has to be prepared to cycle through the available addresses
to find one that works. More details on this process are given in Appendix
D.
Real-time network designers should be aware that cycling through the
possible destination addresses may take significant time, due to the
network round trips and timeouts involved in determining that an address is
unreachable or unusable. Any additional knowledge that is available should
be used to inform the address selection procedure and override the defaults
where this will speed up the communication process.
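A minimal sketch of such an address-cycling loop, using the standard sockets API with a short per-address timeout to cap the total fallback delay (the hostname and port in any call are placeholders):

```python
import socket

def connect_first_usable(host, port, timeout=2.0):
    """Try each address returned for a name until one accepts a TCP
    connection. getaddrinfo typically lists IPv6 addresses first."""
    last_error = None
    for family, stype, proto, _, addr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = socket.socket(family, stype, proto)
        sock.settimeout(timeout)  # bound the wait on each dead address
        try:
            sock.connect(addr)
            sock.settimeout(None)
            return sock  # first address that answers wins
        except OSError as exc:
            last_error = exc
            sock.close()
    raise last_error or OSError("no usable address for %s" % host)
```

Choosing the timeout is the design trade-off the text describes: too long and a partial network failure stalls the application; too short and a slow but working address is skipped.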
1. An alternative proposal, which introduced A6 records [RFC2874], was intended to make it easier to
renumber networks; but this has now been made experimental because of fears that the
recursive lookups needed might result in loops and security breaches.
between the IPv4 and IPv6 versions of the routing protocols and real-time
applications.
ISP networks
3G Cellular networks
Once this work has been completed, it will naturally suggest the
mechanisms that appear most useful, and natural selection can take its
course.
[Figure: IPv6 transition scenarios within a single administrative domain: an IPv6 host reaching IPv4-only and IPv6-only networks through dual-stack IPv6/IPv4 routers and translators, and IPv6 islands interconnected across an IPv4-only network.]
Point-to-Point Links
Standards for the encapsulation of IPv6 packets in various Layer 2
technologies have been defined. The interconnection routers are similar to
those used to carry IPv4 over point-to-point links with IPv6 interfaces on
the customer side and L2 interfaces on the core network side. As with the
IPv4 case, this solution has a high setup and management cost, but is useful
for situations where stable, secure communications with a well-defined
traffic pattern are required. It can be implemented without needing any IPv6
capabilities from the Layer 2 service provider.
MPLS Networks
MPLS Label Switched Paths (LSPs) can carry IPv6 packets using a
standard encapsulation. Existing MPLS networks with IPv4-based control
planes can be used to carry IPv6 if the Label Edge Routers (LERs) have
dual IPv4/IPv6 stacks. In due course, MPLS networks will be built with
IPv6 control planes and the need for dual stack LERs will gradually
disappear.
References
[I-D.ietf-ipngwg-icmp-v3] Conta, A. and S. Deering, “Internet Control
Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6)
Specification,” draft-ietf-ipngwg-icmp-v3-04 (in preparation), IETF, June
2004.
[I-D.ietf-ipsec-ikev2] Kaufman, C., “Internet Key Exchange (IKEv2)
Protocol,” draft-ietf-ipsec-ikev2-14 (in preparation), IETF, June 2004.
[I-D.ietf-mobileip-ipv6] Johnson, D., Perkins, C. and J. Arkko, “Mobility
Support in IPv6,” draft-ietf-mobileip-ipv6-24 (in preparation), IETF, July
2003.
[I-D.ietf-pmtud-method] Mathis, M., “Path MTU Discovery,” draft-ietf-
pmtud-method-01 (in preparation), IETF, February 2004.
[I-D.ietf-v6ops-3gpp-analysis] Wiljakka, J., “Analysis on IPv6 Transition
in 3GPP Networks,” draft-ietf-v6ops-3gpp-analysis-10 (in preparation),
IETF, May 2004.
[I-D.ietf-v6ops-ent-scenarios] Bound, J., “IPv6 Enterprise Network
Scenarios,” draft-ietf-v6ops-ent-scenarios-04 (in preparation), IETF, July
2004.
[I-D.ietf-v6ops-isp-scenarios-analysis] Lind, M., Ksinant, V., Park, S.,
Baudot, A. and P. Savola, “Scenarios and Analysis for Introducing IPv6
into ISP Networks,” draft-ietf-v6ops-isp-scenarios-analysis-03 (in
preparation), IETF, June 2004.
[I-D.ietf-v6ops-mech-v2] Nordmark, E. and R. Gilligan, “Basic Transition
Mechanisms for IPv6 Hosts and Routers,” draft-ietf-v6ops-mech-v2-03 (in
preparation), IETF, June 2004.
[I-D.ietf-v6ops-unmaneval] Huitema, C., “Evaluation of Transition
Mechanisms for Unmanaged Networks,” draft-ietf-v6ops-unmaneval-03
(in preparation), IETF, June 2004.
[I-D.tsuchiya-mtp] Tsuchiya, K., Higuchi, H., Sawada, S. and S. Nozaki,
“An IPv6/IPv4 Multicast Translator based on IGMP/MLD Proxying
(mtp),” draft-tsuchiya-mtp-01 (in preparation), IETF, February 2003.
[I-D.venaas-mboned-v4v6mcastgw] Venaas, S., “An IPv4 - IPv6 multicast
gateway,” draft-venaas-mboned-v4v6mcastgw-00 (in preparation), IETF,
February 2003.
RFC 1981, McCann, J., Deering, S. and J. Mogul, “Path MTU Discovery
for IP version 6,” IETF, August 1996.
RFC 2205, Braden, B., Zhang, L., Berson, S., Herzog, S. and S. Jamin,
“Resource ReSerVation Protocol (RSVP) -- Version 1 Functional
Specification,” IETF, September 1997.
RFC 3596, Thomson, S., Huitema, C., Ksinant, V. and M. Souissi, “DNS
Extensions to Support IP Version 6,” IETF, October 2003.
RFC 3697, Rajahalme, J., Conta, A., Carpenter, B. and S. Deering, “IPv6
Flow Label Specification,” IETF, March 2004.
RFC 3711, Baugher, M., McGrew, D., Naslund, M., Carrara, E. and K.
Norrman, “The Secure Real-time Transport Protocol (SRTP),” IETF,
March 2004.
RFC 3750, Huitema, C., Austein, R., Satapati, S. and R. van der Pol,
“Unmanaged Networks IPv6 Transition Scenarios,” IETF, April 2004.
RFC 3810, Vida, R. and L. Costa, “Multicast Listener Discovery Version 2
(MLDv2) for IPv6,” IETF, June 2004.
Section V:
Network Design and Implementation
The previous sections have described individual topics and technologies
that form the building blocks of the Real-Time network. This section deals
with the broader aspects of how these building blocks are combined into an
entity called a network. The focus shifts from the details of technology and
protocol definition to the behavior of the elements and devices in
combination: the operation, resiliency, and performance of the network as a
whole. The chapters in this section address ways the various protocols,
technologies, and techniques can be integrated into a complete solution,
based on Quality of Experience performance targets.
Implementation of real-time networking principles requires the
coordination of a number of network deployment options. These options
include the end-to-end QoS mechanisms, network architecture and
topology choices, selection of codec and packet size, and provisioning of
link speed to ensure compatibility with the desired performance.
Successful deployment requires an understanding of how the packet
network behaves under different traffic conditions and how it will
interwork with any existing legacy systems, whether as part of the same
network or where a call is handed off to another network or carrier.
A converged network takes on the reliability requirements of the most
demanding traffic, application, or service that it carries. If the network
carrying only e-mail traffic goes down for a few seconds every five
minutes, users might not even notice. Similar outages on a network
carrying mission-critical services can result in business disruption,
confusion, frustration, or even loss of life.
Chapter 16, Network Address Translation (NAT), will help you gain a
working knowledge of this technology and its implications for real-time
network design.
Chapters 17 and 18 describe protocols and techniques used to improve the
resiliency of a network infrastructure. Methods include built-in redundancy
through duplication and the use of higher-layer protocols that will provide
sub-second reconvergence times. These techniques build on the Layer 2
protocols that were discussed in Section IV.
Chapter 19 provides specific guidance on mapping QoS settings from one
network technology to another, an important factor in ensuring end-to-end
Quality of Service operation.
Chapter 20 looks at engineering of converged networks to deliver high QoE
for both voice and data applications. A planning process is described that is
used to predict voice performance based on known relationships of voice
Chapter 16
Network Address Translation
Elwyn B Davies
Cedric Aoun
[Figure: the protocol stack reference diagram used at each chapter opening (session and gateway control with SIP, H.323, H.248/MGCP and NCS; RTP/RTCP media transport over codecs; QoS and resiliency layers over MPLS, ATM, FR, Ethernet, DOCSIS and SONET/TDM), viewed here from the Network Address Translation perspective.]
Concepts covered
Network Address Translation (NAT)
Introduction to firewalls
Autonomous and signalled operation for NATs and firewalls
Costs and benefits of NATs, firewalls and other Internet
improvements
The middlebox concept
The basics of NAT technology
Taxonomy of NATs
Interactions of NAT with transport protocols and applications
How NATs modify network packets
The issues resulting from the introduction of NATs
Introduction
This chapter covers Network Address Translation and the Network
Address Translators (NATs) that implement the translation. NATs are one
of a group of technologies introduced into the original IP network to
increase its capabilities and update the architecture to preserve its
flexibility for the future. The next generation Internet Protocol, IPv6,
covered in Chapter 15 and Virtual Private Networks (VPN) covered in
Appendix E, are the other technologies that make up this group. These
solutions all significantly alter the way the network operates while
preserving the basic IP paradigm: data is still transmitted in individually
addressed packets across a stateless data plane that makes a separate
forwarding decision at each hop, depending on the addressing information
in the packet.
NATs have been extensively deployed in the IPv4 Internet to eke out the
limited supply of globally routable IPv4 addresses. NATs allow a network
(such as an Enterprise network) to utilize the ‘private’ address space
defined in RFC 1918. The Enterprise can use some or all of the large
amount of space within prefixes 10/8, 172.16/12 and 192.168/16 in the
private networks on one side of the NAT and still retain a fair measure of
access to the global Internet using a small number of globally unique IPv4
addresses on the other side. Some network operators see the reduced
transparency of the Internet when using NATs as an advantageous means of
access control and a security protection, but the ‘security by obscurity’
offered by a standard NAT is not much of an obstacle to a determined
invader. Genuine perimeter security needs a 'firewall' device that will
appropriately filter the packets rather than just changing the address fields.
NATs can also restrict the deployment of some peer-to-peer applications
(such as VoIP). The spread of NATs has been accelerated by the delayed
deployment of IPv6, which would solve the address supply shortage. The
network is paying the price through increased processing and latency
experienced by packets traversing NATs, as well as reduced service
deployment velocity because of the interactions between NATs and many
application protocols.
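The three RFC 1918 private blocks can be checked programmatically; a small sketch using Python's standard ipaddress module (addresses in the test are documentation examples):

```python
import ipaddress

# The three RFC 1918 private prefixes mentioned in the text.
PRIVATE_BLOCKS = [ipaddress.ip_network(p)
                  for p in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def is_rfc1918(address):
    """True if the IPv4 address falls inside one of the private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)
```

Note that 172.16/12 spans 172.16.0.0 through 172.31.255.255, a detail that is easy to get wrong when filtering by prefix manually.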
flow state information maintained for the various packet flows to which the
packets belong. In both cases, some packets are allowed to pass through the
device because of the policies set by the administrators of the domain.
NATs will also modify the packets to implement the address mappings. A
NAT is often combined with a firewall in one device because of the
similarities and frequent need to provide both functions at the same point in
a network.
Changes to management
Introduction of IPv6, VPN or NAT technology requires changes to the
network management tools and techniques, and may add significantly to
the complexity of the management task.
IPv6 has some improvements that should make basic management,
especially in Enterprise networks, considerably easier; however, this is
offset by the need to manage networks that can transmit both IPv4 and
IPv6. The easier basic management is also offset by the transition and
migration mechanisms that will be used during the transition period.
VPNs add an extra layer of complexity to the management process, and if
configured tunnels are used, there is a significant management load setting
up and maintaining the tunnels.
The configuration loaded into a NAT or firewall has to be carefully
managed to correctly limit access to and from the outside world by both
applications and users in the private network, as well as preserving the
security of the network behind the NAT and providing an adequate supply
of globally routable IPv4 addresses.
Both NATs and IPv6 have direct interactions with real-time applications
and are explored in the main body of the book, but VPNs are relatively
transparent to real-time applications although the QoS and performance
aspects need to be taken into account when designing networks for real-
time applications. NAT technology, limitations and the large number of
variant implementations are explored in detail in the remainder of this
chapter. IPv6 technology is described in Chapter 15 with some additional
material in Appendix D, while VPN technology is described in Appendix
E. Firewall technology is not discussed in detail in this book, but many of
the application interactions and performance limitations that NATs exhibit
also affect firewalls.
To address this problem, the development of what became IPv6 was started
(see Chapter 15). It rapidly became clear that the IPv4 address space would
run out before IPv6 could be deployed and a range of shorter term
measures were put in place to postpone the exhaustion of IPv4 addresses.
Technical changes to routing allowed by the removal of the class
boundaries in addresses. Removing the class boundaries allowed
address allocations for subnetworks to be more closely matched to
the network size, increasing the efficiency of address space
utilization.
Restraining the availability of globally routable IPv4 addresses
through address registry allocation policies: users have to justify the
size of each allocation request and registries will only allow a small
percentage of growth headroom in each allocation.
Widespread deployment of private addressing schemes behind
Network Address Translators (NATs).
These solutions are now reaching the limits of their effectiveness; IPv4
address space exhaustion is now believed, once again, to be getting much
closer (five years or a little longer).
NATs as a middlebox
The original architecture for the Internet (DARPAnet) envisioned that
packets would travel end-to-end essentially unmodified (only the Time-to-
Live field is altered as the packet goes from router to router; the checksum
is adjusted to match).
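Routers typically avoid recomputing the header checksum from scratch when they decrement the TTL; the incremental update technique of RFC 1624 adjusts it in place from the old and new values of the changed 16-bit word. A sketch of that arithmetic (the function name is illustrative):

```python
def incremental_checksum_update(old_csum, old_word, new_word):
    """RFC 1624 incremental update: HC' = ~(~HC + ~m + m'),
    where m is the old 16-bit word and m' the new one, using
    one's-complement addition throughout."""
    def add1c(a, b):
        s = a + b
        return (s & 0xFFFF) + (s >> 16)  # fold the carry back in
    csum = add1c(~old_csum & 0xFFFF, ~old_word & 0xFFFF)
    csum = add1c(csum, new_word)
    return ~csum & 0xFFFF
```

For a TTL decrement, the changed word is the one sharing TTL with the protocol field, so m = (ttl << 8) | proto and m' = ((ttl - 1) << 8) | proto. NATs use the same trick when they rewrite address and port fields.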
A number of developments have introduced various kinds of
‘middlebox’, which intercept the packets in flight and manipulate some
part of the packet, especially the IP and/or the transport headers.
Middleboxes can also apply filters to the traffic passing through them and
apply various administrative policies that may result in packets not being
forwarded (‘dropped’). Depending on the policy, the middlebox may or
may not try to inform the packet source using an ICMP error message.
Network Address Translators are examples of middleboxes; other varieties
include firewalls and IPsec tunnel gateways.
NAT terminology
Middleboxes and NATs, in particular, have acquired a number of pieces of
special terminology.
Private Address. Address according to the RFC1918 scheme for addresses
that are to be used in networks not directly connected to the public Internet.
Private addresses can be reused across multiple sites that are not
administratively connected.
Packets from the inside are routed to one or other of the translators (it has
to be the same one for every packet in a communication session so that the
same binding is used unless the states of the translators are coordinated)
and, after translation, can be routed onwards by normal IP routing in the
public network. Packets from the outside will be routed to the public side
of the translator using the public address and, after reverse translation, onto
the host using the private address and intradomain routing.
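The binding behavior described above can be illustrated with a toy port-translating NAT (NAPT). The class, port range, and addresses below are purely illustrative, and real devices also age bindings out and handle the transport checksums:

```python
class Napt:
    """Toy NAPT: maps (private_ip, private_port) to a port on one
    public address and back again, so replies reach the right host."""
    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}    # (priv_ip, priv_port) -> public port
        self.back = {}   # public port -> (priv_ip, priv_port)

    def translate_outbound(self, priv_ip, priv_port):
        key = (priv_ip, priv_port)
        if key not in self.out:          # create a binding on first use
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def translate_inbound(self, public_port):
        # KeyError here models the NAT dropping an unsolicited packet.
        return self.back[public_port]
```

The session-affinity requirement in the text follows directly from this state: a second translator without the `out`/`back` tables has no way to reverse the mapping.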
The NAT solution to the scarcity of public IP addresses has several
disadvantages. The most important of these is the removal of the end-to-
end significance of IP addresses, thereby reducing the transparency of the
network and increasing the amount of state in the data plane of the
network. Many protocols, for example, the file transfer protocol (FTP),
took advantage of the end-to-end significance of IP addresses by
embedding numeric IP addresses in protocol payloads. Whether or not this
represents a mistake in the design of these protocols has been extensively
discussed. NAT has to accommodate the existing protocols to remain
application independent and avoid the need for modifications to end hosts.
NAT devices can be expected to contain several application level gateways
(ALGs) that monitor the traffic passing through the NAT and perform
additional modification of the protocol payloads for specific protocols. The
ALG for a protocol ‘understands’ the packet formats and sequences of the
protocol. The NAT and ALGs coordinate to divert protocol packets that
will need additional translation through the ALG where they may be
modified, using the NAT bindings and any additional state maintained in
the ALG, before forwarding.
Restricted Cone devices, external hosts can only send packets from
an IP address and port to which the internal host had previously sent
a packet.
SIP/SDP
RSVP
talk, ntalk
A number of other protocols would need complex ALGs, but it does not
make sense to operate them through NATs, including the following
examples:
All dynamic routing control protocols (including RIP, OSPF, BGP,
IS-IS). Routing in the public and private domains are essentially
independent problems. The NAT devices are placed at the
boundaries. They act as routing proxies for the whole private
domain and forward the traffic between the domains.
BOOTP, DHCP. It makes little sense to deliver configuration
information to hosts in the private domain from servers in the
public domain and vice versa.
SNMP. Translating the information returned by SNMP would be
confusing to managers and might expose details of the private
network that the owners would prefer to keep private.
Finally, a number of common protocols can operate through NATs without
need for ALGs or translation because they do not carry IP addresses or
transport identifiers, including the following examples:
HTTP, HTTPS (unless numeric addresses are used)
TFTP
telnet
archie
finger
NTP, NNTP
NFS
radius
IRC
SSH
POP3
SMTP
rlogin, rsh, rcp (provided Kerberos authentication is not used)
case, the NATs may have to be replaced to deal with a new service or
protocol.
The IETF is currently investigating the use of signaling to control NATs
and other middleboxes. This would remove many of these difficulties but
introduces new needs for policy and security if NATs are to be controlled
by applications.
Security by obscurity
Some network operators, encouraged by marketing hype, have come to see
the hiding of network addresses by NATs as providing some security for
the private network.
Unfortunately, any sense of security that this provides is largely
misplaced. While the NAT reduces the
visibility of the private network topology to a casual observer, a NAT
without additional firewalling provides little or no obstacle to a determined
attacker. All NATs will accept some incoming packets that don't
necessarily come from an approved source and send them on to a machine
in the private network—exactly what the conditions have to be depends on
the type of NAT in use; but, all NATs are vulnerable for at least part of the
time. Without additional filtering, the private network is highly vulnerable
and must be protected.
Management overhead
Managing the address pools and the policies associated with NATs is a
considerable management overhead. The management becomes even more
complex if multiple layers of NATs have to be employed. In areas where IP
addresses are extremely scarce and providers charge a premium for
globally unique addresses, network operators have been forced to use
several layers of NAT to cope with needs of their networks.
References
RFC 1918, Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G. and E.
Lear, “Address Allocation for Private Internets,” BCP 5, IETF, February
1996.
RFC 2391, Srisuresh, P. and D. Gan, “Load Sharing using IP Network
Address Translation (LSNAT),” IETF, August 1998.
RFC 2547, Rosen, E. and Y. Rekhter, “BGP/MPLS VPNs,” IETF, March
1999.
RFC 2663, Srisuresh, P. and M. Holdrege, “IP Network Address Translator
(NAT) Terminology and Considerations,” IETF, August 1999.
RFC 3022, Srisuresh, P. and K. Egevang, “Traditional IP Network Address
Translator (Traditional NAT),” IETF, January 2001.
RFC 3489, Rosenberg, J., Weinberger, J., Huitema, C. and R. Mahy,
“STUN - Simple Traversal of User Datagram Protocol (UDP) Through
Network Address Translators (NATs),” IETF, March 2003.
Chapter 17
Network Reconvergence
Shardul Joshi
[Figure: the protocol stack reference diagram repeated at each chapter opening, viewed here from the Network Reconvergence perspective with the resiliency layer in focus.]
Concepts covered
Protection switching versus rerouting
Protection schemes
Spanning tree
Multilink trunking
Distributed Multilink Trunking (MLT)
Split MLT
Virtual Router Redundancy Protocol (VRRP)
Open Shortest Path First (OSPF) Equal Cost Multipath (ECMP)
OSPF failure modes
Introduction
One network for all services, including real-time voice and video, provides
significant cost savings over separate, individual networks; however, if the
“One Network” fails, everything stops. On August 12, 2003, CBS News
provided an example of what happens when the “One Network” fails.
Maryland's Motor Vehicle Administration shut all its offices at noon as
technicians cleaned the agency's network systems. “There's no telephone service
right now. There's no online service right now. There's no kiosk or express office
service.”
“One Network” demands high availability. E-mail, faxes, phones, computer
networks, and video conferencing stop and business ceases if the “One
Network” fails.
Achieving resiliency
There are two methods of providing network resiliency. The first is to
provide redundancy. Redundancy can come through the duplication of
circuits or network elements (for example, ports and routers). The second is
to use protocols to provide quick reconvergence and high availability of
existing circuits after a failure event occurs in the network.
Cost is the most important factor when determining the amount of
redundancy, and where the redundancy is to be integrated into the network.
When evaluating the cost of the network, it is important to look beyond the
simple cost of adding equipment and bandwidth. A cost can also be
associated with network outages, measured in terms of loss of
revenue or loss of service to customers. This will be a factor in determining
to what degree the network engineer uses sophisticated architecture and
fault management strategies. The end result is survivability. The educated
network engineer must evaluate cost versus effect to provide the greatest
return per dollar spent.
never allow the network to gain any return on investment, and double all of
the required maintenance costs unnecessarily.
This is not to say providing a physical or logical redundancy in some areas
of the network is not a good idea. The engineer must decide which areas
are critical and provide cost effective means to provide resiliency. The first
area that full redundancy can be employed is at the network edge. Typical
customer edge devices are low cost, low reliability and primarily support
the use of simple protocols. Implementing full redundancy in this area
allows an easy way to provide resiliency. The trade-off is elements at the
network edge have a higher population in terms of total equipment in the
network. While these elements may be low in cost, their use of full
redundancy throughout all of these elements may be extremely expensive,
and difficult to justify. Minor outages among a large customer base will
have only a minor impact upon total network Quality of Experience (QoE)
or revenue. The redundancy should only be provided for those critical
customers that have higher requirements for redundancy and resiliency.
The alternative at the edge would be to provide a more robust network
element, which may be less cost-effective based on the amount of traffic
the customer is generating. The other place full redundancy may be
implemented is in critical areas of the network. These areas utilize more
expensive robust equipment; however, the need to maintain constant
uptime regardless of issue requires the use of the redundancy. This scenario
manifests itself in the form of circuit and port redundancy versus element
duplication. In some rare cases, the entire element may be duplicated to
ensure full redundancy. It is important to remember that in the core, the use
of high-availability protocols that can reconverge are often used in
conjunction with these types of robust routers and switches.
Protection schemes
The expression protection scheme, or protection model, designates the
strategies noted 1+1, 1:1, 1:N and M:N in SONET and N+1 protection
when using the data reference model. For clarity and consistency
throughout the book and chapter, we will be using the SONET reference;
however, it is important to keep both models in mind.
There are several protection schemes, corresponding to various degrees of
network availability. Either one protection path protects one working path,
namely the models 1+1 and 1:1, or one protection path protects N working
paths, noted 1:N. This latter model may be generalized by allowing M
protection paths to protect N working paths (M<N), noted M:N.
In the 1+1 strategy, the traffic travels on the working path and the
protection path simultaneously. Traffic is carried only on the working path
in the case of the 1:1 strategy. The protection path may carry a lower
priority traffic that may be preempted upon occurrence of a failure on the
protected path to allow for the recovery of that path. The 1:1 strategy is
extended to 1:N, where one protection path protects N working paths.
Obviously, in the 1+1 strategy, the working and the protection paths must
be disjoint (a failure on one path does not fail the other one). Moreover,
the requirement of protecting against single failures leads to the need that
in the cases 1:1, 1:N, and M:N all the paths be disjoint.
In the 1:N protection model, upon occurrence of a failure on one of the N
working paths, the traffic that used to travel on that failed path is switched
to the protection path. At that time, the remaining N–1 working paths are
no longer protected and are required to find a new protection path(s) as
soon as possible. A long delay between the failure of a path and the finding
of a new protection path affects the degree of network availability.
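The 1:N behavior just described can be captured in a few lines: the first working path to fail claims the single protection path, and any further failure loses traffic until a new protection path is found. A sketch (the function name and representation are illustrative, not taken from any standard):

```python
def simulate_1_to_n(n_working, failures):
    """Which working paths are still carried under 1:N protection
    (one shared spare)? `failures` is an ordered list of failed
    working-path indices."""
    protection_used_by = None
    carried = set(range(n_working))
    for path in failures:
        if protection_used_by is None:
            protection_used_by = path  # first failure switches to the spare
        else:
            carried.discard(path)      # no spare left: this traffic is lost
    return carried, protection_used_by
```

Running this with two failures in sequence shows why the text stresses finding a replacement protection path quickly: the window between the first failure and re-protection is exactly when the remaining N-1 paths are exposed.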
You might think that packets can take any number of (seemingly)
random paths to reach their destination. Many of today’s networks use core
routing protocols that are based on shortest path algorithms. That means
that at any given instant in time, there is exactly one shortest path between
any two end points. That path can be thought of as the working path for that
traffic.
In the event of a failure at an intermediate point in the network, the routing
protocols will recalculate the shortest path to create a new path for packets
to follow. Although the resources needed to create that new path are
identified on demand, only a single path is the new shortest path. Using the
terminology described above, with a single (working) path being protected
and a single (protection) path being created, the protection scheme for an
IP-routed network would be best classified as 1:1 protection. The fact that
resources are identified on demand means that we would call it rerouting
rather than protection switching.
Spanning tree
The simplest example is connecting two Ethernet switches together with
two lines for redundancy.
can take 30-45 seconds. A failure in the root-bridge of the STP instance,
failure of the Root Forwarding Instance, or a simple configuration error in
the MST are some of the problems that can easily occur but can have
devastating effects on your network. This becomes incredibly complex for
something that should be quite simple.
Does a switch router need to maintain two network topologies, one for
STP and another for OSPF? No, it does not, particularly at the edge. You do
not need a map to get out of the driveway and an edge switch does not need
a legacy protocol, like STP, blocking traffic for 35-45 seconds every time
there is a link failure somewhere nearby. Thankfully, there is a much
simpler and better way.
Multilink trunking
Multilink trunking (MLT) allows both Ethernet lines to connect logically to
a single Layer 3 interface or IP address. Both lines are active and the load
is shared across them; there is no need for STP and no need for multiple
VLANs. There is no traffic blocking and there are no convergence delays
associated with MLT. In the event of failure, the Layer 3 protocols, such as
OSPF, are not affected since MLT is entirely Layer 2.
[Figure: a Multilink Trunk (MLT): two links between Switch 1 and Switch 2 operating as one logical connection.]
Distributed MLT
Distributed Multilink Trunking (DMLT) simply expands the MLT concept
to spread Multilink Trunks over multiple switches. Typically, this is done
within a single stack where Ethernet switches are stacked and daisy-
chained together, as depicted in the following. MLT only protects against
links failures, whereas DMLT extends protection to cover a switch failure
within the stack.
[Figure: Distributed MLT (DMLT): trunk links spread across the members of Switch Stack 1, connecting to Switch 2.]
Split MLT
Split Multilink Trunking (SMLT) is designed to provide aggregation switch
redundancy. An aggregation switch is generally defined as being a switch
not at the edge of the network, connected to switches at the edge. SMLT
can tolerate any link or any switch failing. The network will restore within
milliseconds. SMLT overcomes the shortcomings of the STP by
eliminating the loops that would cause Spanning Tree ports to be blocked.
[Figure: Split-MLT between edge switches and a pair of aggregation switches running VRRP: no loops, no Spanning Tree, failover in under one second, and load sharing.]
Layer 3 redundancy
Multilink Trunking (MLT) operates at Layer 2—not Layer 3. Apart from
routing IP itself, there is a range of simple failover mechanisms at Layer 3.
These include Virtual Router Redundancy Protocol (VRRP), Equal Cost
Multipath (ECMP), and Border Gateway Protocol (BGP) Multiexit
Discriminators (MEDs). Each has a role to play depending on where
failover is required in the network. For example, VRRP and ECMP can be
used on the same router where VRRP protects from equipment failure,
while ECMP protects from a path failure. The following section examines
each mechanism and provides examples of where they can be best used.
VRRP
[Figure: VRRP between two Passport 8600 switches serving Clinic/Hospital A and Clinic/Hospital B. Both advertise virtual address 10.19.12.1 on VLAN 12 (10.19.12.0/24), one as master and one as backup, and are joined by an Inter-Switch Trunk (IST) on VLAN 10 between 10.19.10.1 and 10.19.10.2.]
OSPF ECMP
Equal Cost Multipath protects against link failure and is best used on the
links between high availability routers where load sharing (not balancing)
and quick recovery from a failure are required. ECMP allows a router
running OSPF to distribute traffic across multiple, equal cost routed paths.
The benefits of ECMP routing include load sharing across multiple paths to
the same destination and rapid convergence to the alternate path if a path
becomes unavailable due to a network event. When ECMP is configured
between switch routers, each ECMP connection should have its own
VLAN and Spanning Tree Group (STG).
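The "load sharing (not balancing)" distinction above usually comes from how routers pick among equal-cost paths: a hash of the flow's 5-tuple selects one next hop, so every packet of a flow stays on the same path and ordering is preserved. The sketch below illustrates that idea under stated assumptions; the interface names and flow values are invented for the example.

```python
# Sketch of per-flow ECMP next-hop selection: hash the flow 5-tuple so all
# packets of a flow follow the same equal-cost path (load sharing, not
# per-packet balancing). Interface names and flow values are illustrative.
import hashlib

def ecmp_next_hop(flow, next_hops):
    """Deterministically map a flow 5-tuple onto one equal-cost next hop."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

paths = ["1/7", "1/8"]  # two equal-cost links, each in its own VLAN/STG
flow = ("10.19.12.5", "10.19.40.9", 6, 5060, 32001)  # src, dst, proto, ports

# The same flow always hashes to the same link, so packets stay in order.
assert ecmp_next_hop(flow, paths) == ecmp_next_hop(flow, paths)
```

If one link fails, the surviving next hops are simply rehashed over, which is why ECMP recovers quickly from a single path failure.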
IP routing OSPF/BGP
RIP is not a High Availability routing protocol, as it takes too long to
recover from a failure and does not scale easily. For this reason, network
engineers need to evaluate IP protocols that provide better recovery times
and scale easily. Two such protocols are Open Shortest Path First (OSPF)
and Border Gateway Protocol (BGP). This is not to say RIP cannot be used
at the network edge to provide easy connectivity to end users, as RIP
allows for easy connection and introduction of elements into the network.
Network engineers need to build an IP routing hierarchy.
OSPF was the first Dijkstra-based routing protocol introduced into IP
routing. It was completed in 1992, and many of the same authors then went
on to the ATM forum to create Private Network-to-Network Interface
(PNNI). The next section covers OSPF first, and then points out where
OSPF and PNNI share similar characteristics.
The IP routing hierarchy for OSPF is based on having areas contained
within autonomous systems. Each autonomous system maintains its own
separate routing databases. Within each autonomous system, there are
areas defined that permit segregation of the routing database information
from the autonomous system. Within an area, all routers will completely
share all routing table information with all other routers. However, only a
summary of that information is shared between the areas. This form of
route summarization allows the routing table exchange to be smaller,
allowing for faster convergence times.
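The route summarization described above can be reproduced with Python's standard ipaddress module: four contiguous /24 networks collapse into the single /22 that an area border router would advertise.

```python
# Summarizing 192.10.0.0/24 through 192.10.3.0/24 into one /22, as in the
# Area 1 -> Area 2 summarization example.
import ipaddress

area1_routes = [ipaddress.ip_network(f"192.10.{i}.0/24") for i in range(4)]
summary = list(ipaddress.collapse_addresses(area1_routes))

print(summary)  # [IPv4Network('192.10.0.0/22')]
```

Advertising one /22 instead of four /24s is exactly the smaller routing table exchange, and faster convergence, that the paragraph above describes.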
OSPF does not share routing information between autonomous systems;
BGP is used to provide this function. BGP is essentially a distance-vector
protocol that uses autonomous systems, as OSPF does, along with
next-hop metrics. OSPF operates only within an autonomous system and
is called an Interior Gateway Protocol (IGP), whereas BGP usually
operates between autonomous systems and is called an Exterior Gateway
Protocol (EGP), although BGP is sometimes used internally.
[Figure: two OSPF autonomous systems, each with an Area 0 backbone plus subordinate areas (Areas 1 and 2; Areas 1 and 6), interconnected by BGP. Within Autonomous System 1, Area 1 holds eight /24 routes (192.10.0.0/24–192.10.3.0/24 and 192.20.0.0/24–192.20.3.0/24); only the two summarized routes, 192.10.0.0/22 and 192.20.0.0/22, are sent to Area 2, not all 8]
Figure 17-9: Address Summarization
PNNI
Private Network-to-Network Interface (PNNI) provides dynamic routing
and signaling. PNNI-based switching systems monitor network topology
and available resources where calls automatically route around points of
congestion and failure. Nodes in a PNNI network exchange topology and
link state information on an ongoing basis. In this way, nodes maintain an
up-to-date view of the state of the network. PNNI was developed
immediately after OSPF. In PNNI, the Dijkstra SPF algorithm was added
for automatic routing (along with a host of other features that extend far
beyond OSPF capabilities). These capabilities are only available on the
more expensive ATM switches typically used within Carrier systems.
Many of ATM's PNNI traffic engineering capabilities are replicated within
MPLS. Even at a very basic level, label swapping in MPLS is very similar
to header swapping in ATM, and it could be said that MPLS provides ATM-
like characteristics to the broader IP/Ethernet market.
Address matching
An incoming call setup request is received over the signaling channel of a
UNI interface. The call routing table is then scanned for the specified
called address. From the table, the switch selects the best matching address.
This address is the table entry with the most hexadecimal digits identical to
those of the called address. This process is called a maximal address match.
The call setup request is then forwarded to the next-hop UNI interface
associated with the best-match address. If no match exists, the call is
cleared back to the previous node.
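The maximal address match can be sketched as a longest-common-prefix search over hex digit strings. The routing table entries and called address below are invented for illustration.

```python
# Sketch of PNNI maximal address matching: pick the routing table entry that
# shares the most leading hex digits with the called address. Entries and
# addresses are illustrative.

def best_match(called, table):
    """Return (entry, interface) with the longest common hex prefix,
    or None if nothing matches (the call is cleared back)."""
    def common_prefix_len(a, b):
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n
    scored = [(common_prefix_len(called, entry), entry, ifc)
              for entry, ifc in table.items()]
    length, entry, ifc = max(scored)
    return (entry, ifc) if length > 0 else None

routing_table = {"47001122": "PP1", "470011": "PP3", "4720": "PP6"}
print(best_match("4700112299AB", routing_table))  # ('47001122', 'PP1')
```

The eight-digit entry wins over the six-digit one because more of its hex digits are identical to the called address, mirroring the "maximal address match" rule above.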
[Figure: a call from UNI A to UNI B forwarded across PNNI interfaces PP1–PP7, routed around blocked interfaces (PP4, PP2 marked blocked; PNNI interfaces with resources unavailable)]
Figure 17-10: Forwarding
[Figure: PNNI peer groups PG(A.1), PG(A.2), PG(A.3) containing nodes A.1.3, A.1.4, A.1.5, A.2.2, A.2.3, A.3.3, A.3.4; an SPVC from source '4710' set up to destination 4720 is re-optimized around a new node]
QoS Variance
Variance is provisioned per QoS. It allows a call to ‘see’ more available
paths in the network. The green lines in Figure 17-15 represent the ‘paths’
available for a call to be routed on.
[Figure: a setup request from source to destination with variance enabled — the call is load-balanced across the set of acceptable paths]
Figure 17-15: QoS variance under PNNI
For example, assume the middle two paths in Figure 17-15 provide a Cell
Transfer Delay (CTD) of 150 ms. The upper path provides a CTD of 250
ms and the lower path a CTD of 220 ms.
Without variance, if the call setup request requires a Cell Transfer Delay of
200 ms, only the middle two paths would be acceptable. This limits the
choices in routing the connection across the network. If there are many
such connections, the middle two paths will become more heavily loaded
than the only slightly different upper and lower paths.
If the QoS variance is set to 25%, all paths meeting a CTD metric of 250
ms (200 ms*125%) would be acceptable. Thus, the upper and lower paths
could be used and calls would have more paths to choose from. Combined
with one of the load balancing algorithms, this will provide excellent call
and load distribution.
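The variance arithmetic above can be checked with a short sketch. The path names and CTD values are taken from the example; the function itself is illustrative.

```python
# QoS variance check: with a 200 ms CTD requirement and 25% variance, any
# path at or under 200 ms * 125% = 250 ms is acceptable.

def acceptable_paths(paths, required_ctd_ms, variance_pct):
    """Return the paths whose CTD is within the relaxed requirement."""
    limit = required_ctd_ms * (1 + variance_pct / 100)
    return [name for name, ctd in paths.items() if ctd <= limit]

paths = {"upper": 250, "middle-1": 150, "middle-2": 150, "lower": 220}

print(sorted(acceptable_paths(paths, 200, 0)))   # only the two middle paths
print(sorted(acceptable_paths(paths, 200, 25)))  # all four paths qualify
```

With variance at 0% only the two 150 ms paths qualify; at 25% the 220 ms and 250 ms paths become acceptable as well, which is what spreads the load.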
What you are seeing is the ability of the ATM network to evaluate all of the
available links within the network and determine the ability of each
individual link to provide the desired QoS. This can be a very tricky thing;
however, the network engineer can use it to their advantage. By slightly
relaxing the QoS requirements on a service, the network gains a number of
parallel links to choose from, allowing ATM to load-share across those
links. This makes data transfer across the network more efficient. By
setting the connection QoS requirements too stringently, the network
engineer can choke the links, preventing connection completion even
while bandwidth is available.
Chapter 18
MPLS Recovery Mechanisms
Ali Labed
[Chapter-opener graphic: the book's protocol-stack view — media-related (audio, video, voice codecs), real-time control (RTP, RTCP, RTSP), session/application control (SIP, H.323), and gateway control (H.248/MGCP, NCS) over QoS, MPLS, AAL1/2, AAL5, packet cable, ATM, FR, Ethernet, DOCSIS, and SONET/TDM — with MPLS resiliency highlighted]
Concepts covered
The various MPLS protection schemes
The components of an MPLS recovery solution
LSP Setup using RSVP-TE
LSP monitoring, detection, and notification using RSVP-TE and
ITU-T Y.1711
MPLS Scope of Recovery - Global and Local
Introduction
“Network Survivability refers to the capability of the network to maintain
service continuity in the presence of faults within the network. This can be
Protection schemes
The expression protection scheme, or protection model, designates the
strategies noted 1+1, 1:1, 1:N and M:N. In all of these strategies, the
protection path must be disjoint from the corresponding working path on
the working path segment (link, node, or the whole path) that is targeted
for protection by the protection path. Two paths are said to be disjoint on a
given segment of the working path if a failure on one path does not cause a
failure on the other path on the same segment.
There are several protection schemes, corresponding to various degrees of
network availability. Either one protection path protects one working path,
namely the models 1+1 and 1:1, or one protection path protects N working
paths, noted 1:N. This latter model may be generalized by allowing M
protection paths to protect N working paths (M<N), noted M:N.
In the 1+1 strategy, the traffic travels on the working path LSP and on the
protection path LSP simultaneously. Obviously, the working and the
protection paths must be disjoint. Under normal operational conditions,
the LSP's egress Label Edge Router (LER) accepts the packets arriving on
the working path LSP and drops those arriving on the protection path LSP.
However, upon occurrence of a failure on the working path LSP, and after
the egress LER of that working path LSP receives the failure notification
(or detects the failure itself), the egress LER starts accepting packets
arriving on the protection path.
The traffic is carried only on the working path in the case of the 1:1 (1:N)
strategy. The protection path may carry a lower priority traffic that may be
preempted upon occurrence of a failure on the protected path (on one of the
protection paths). The M:N protection model is a variant of the 1:N model
where there are M protection paths instead of 1. In these three protection
models, all paths must be disjoint (that is, physically and logically
separated, except for the endpoints) in order to meet the requirement of
protection against single failures.
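The disjointness requirement above (paths may share nothing but their endpoints) can be expressed as a short check. The node names are illustrative, not from the text.

```python
# Sketch of the disjointness rule for working/protection path pairs:
# the two node lists may share only their ingress and egress LERs.

def disjoint_except_endpoints(working, protection):
    """True if the two paths share nothing but their endpoints."""
    if working[0] != protection[0] or working[-1] != protection[-1]:
        return False
    shared = set(working[1:-1]) & set(protection[1:-1])
    return not shared

wp = ["LER-A", "N1", "N2", "LER-Z"]
pp_good = ["LER-A", "N6", "N7", "LER-Z"]
pp_bad = ["LER-A", "N6", "N2", "LER-Z"]   # reuses N2, so a failure of N2
                                          # would take down both paths
print(disjoint_except_endpoints(wp, pp_good))  # True
print(disjoint_except_endpoints(wp, pp_bad))   # False
```

A planning tool would run such a check when associating protection paths with working paths, since a shared intermediate node defeats protection against single failures.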
Association-Configuration component
There should be a component that keeps track, at the decision nodes, of the
association between the Working Paths (WPs) and their corresponding
Protection Paths (PPs).
Monitoring component
In order to detect the faults (described in the Detection section), the
network must be monitored. An important issue addressed by the
monitoring component is the frequency of monitoring. The higher the
frequency of monitoring, the faster the defect is detected, but the higher the
overhead incurred. The recovery target time dictates the frequency of
monitoring.
Furthermore, monitoring may be available at the network layers that are
below the MPLS layer, such as the physical, and the data-link layers. At
those layers, the monitoring happens at higher frequencies than at higher
layers.
Detection component
The design of the detection functionality needs to address the issues of
which defects to detect, which component detects each of the defects, and
at which network layer. Mechanisms must be provided in order to detect
network connectivity defects. Furthermore, in the context of LSP
protection, there is a need for an MPLS layer OAM mechanism in order to
detect MPLS-fabric defects. There are two dimensions to this: sins of
omission (that is, a missing heartbeat or large gaps in sequence numbers)
and sins of commission (leakage of traffic). Defects may result in multiple
points of detection, not all of which may be able to perform notification. Network
impairments, such as congestion, that result in lower throughput are not
included because they are outside the scope of this chapter.
Notification component
Upon occurrence of a fault and after its detection, the “detecting” node
needs to have means to notify the decision node, and needs to know when it
should send its notification: immediately or after a predefined delay. In
other words, the Notification mechanism addresses the following issues:
Who to notify. The notification message needs to be transported
and forwarded from the detection node to the decision node.
When to notify. In order to allow a lower layer recovery
mechanism, if any, to take place, the MPLS layer must react to a
defect if it persists for a time defined to be long enough to allow the
lower layer to try to recover from the failure.
How to notify. The notification message needs a transport method.
Path Switching
After being notified of a failure on the working path of an LSP, the decision
node needs to switch traffic from the failed working path to the
corresponding protection path.
Routing component
Within the context of MPLS, the path of an LSP can be computed and
established in various ways. This task can be achieved by an Interior
Gateway Protocol (IGP), which chooses the (generally) shortest path.
Other ways to compute and establish an LSP path include manually,
automatically online using constraint-based routing processes, and
automatically offline using constraint-based routing entities implemented
on external support systems [INTERNET-TE].
“Constraint-based routing system refers to a class of routing systems that
compute routes through a network subject to the satisfaction of a set of
constraints and requirements” [INTERNET-TE]. Constraint-based routing
processes can be provided by a Traffic-Engineering system. The latter can
be centralized or distributed. In the centralized design, all decisions are
centralized as well as the necessary information about the network to take
those decisions. In the decentralized case, the decisions are taken by each
router autonomously based on the routers view of the state of the network,
which requires a protocol between these decision entities in order to
exchange information on that state.
In order to set up an LSP, the LSP ingress LER builds an RSVP-TE Path
message and sends it to the LSP egress LER. The Path message includes
the explicit route (ER) that must be followed by that message. The Path
message establishes an RSVP Path state on each node it traverses (and
must be listed in the ER object) towards the egress LER. In order to
respond to that Path message, the egress LER builds a Resv message and
sends it to the Ingress LER. The Resv message crosses, in the reverse
direction, the same path traversed by the corresponding RSVP Path
message and establishes an RSVP Resv state on each node it traverses
towards the Ingress LER (see Figure 18-2). Those states include the
parameters associated with the corresponding LSP.
[Figure: LSP setup across nodes Ingress, N1–N4, Egress — (1) a Path message travels downstream from the ingress to the egress; (2) a Resv message returns upstream along the same path]
Figure 18-2: LSP setup using RSVP-TE signaling protocol
The RSVP (Path and Resv) states are called “soft states”; that is, they need
a periodic refresh in order not to be deleted (and the corresponding LSP
torn down). For a given LSP and a given node traversed by that LSP, the
RSVP-TE refresh operates as follows:
The RSVP-TE process on the node periodically retransmits to its
downstream neighbor, the Path message.
The RSVP-TE process on the node periodically retransmits to its
upstream neighbor, the Resv message.
A node detects a failure when it does not receive the expected refresh
message from a neighbor after a (configurable) delay. Upon detection of a
failure, a node sends an RSVP tear message to the end node.
[Figure: a failure between N2 and N3 on the path Ingress, N1–N4, Egress — a ResvTear propagates toward the ingress and a PathTear toward the egress]
Figure 18-3: RSVP tear message upon detection of a failure
In order to detect faults in a timely fashion, the refresh messages must run
at a high frequency. However, this raises a scalability issue, as the refresh
mechanism is per LSP. This issue has been overcome in RSVP-TE, which
extends RSVP with several features and mechanisms, including a
keep-alive protocol called the RSVP-TE Hello protocol, described next.
RSVP-TE Hello
RSVP-TE Hello protocol provides node-to-node failure detection. It runs
between neighboring nodes (at the control plane). It is a Layer 3 keep-alive
mechanism that enables RSVP nodes to detect when a neighboring node is
not reachable. The neighbors periodically exchange keep-alive (Hello)
messages. The loss of communication with a neighbor is declared after a
configurable number (default=3) of consecutive Hello messages are
missing. A node can detect a loss of communication with a neighbor over a
specific link (when multiple links run between neighbors, a different
instance of RSVP Hello runs on each link). When such a fault is detected,
the detecting node reacts exactly as when a fault is detected through
non-reception of RSVP refresh messages.
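The Hello loss-detection rule above (neighbor declared down after a configurable number of consecutive missed Hellos, default 3) can be sketched as a simple miss counter. The class below is an illustrative model, not an implementation of any vendor's RSVP stack.

```python
# Sketch of RSVP-TE Hello failure detection: a neighbor is declared
# unreachable after `miss_limit` consecutive Hello intervals with no
# Hello message received (default 3, per the text).

class HelloMonitor:
    def __init__(self, miss_limit=3):
        self.miss_limit = miss_limit
        self.missed = 0
        self.neighbor_up = True

    def interval_elapsed(self, hello_received):
        """Call once per Hello interval; returns current neighbor state."""
        if hello_received:
            self.missed = 0          # any Hello resets the counter
        else:
            self.missed += 1
            if self.missed >= self.miss_limit:
                self.neighbor_up = False  # react as on a lost RSVP refresh
        return self.neighbor_up

mon = HelloMonitor()
for seen in [True, False, False]:
    mon.interval_elapsed(seen)
print(mon.neighbor_up)              # True: only two consecutive misses so far
print(mon.interval_elapsed(False))  # False: third consecutive miss
```

One monitor instance would run per link to a neighbor, matching the per-link Hello instances described above.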
In the context of LSP protection, there are two types of faults that need to
be detected:
Network connectivity faults. An interruption in the data path.
MPLS fabric defects. The data continue to be forwarded, but on a
wrong path, not the one that was configured for it to be forwarded
on. In other words, the symptom of MPLS fabric defects is the
sending, by any node on an LSP path, of packets of a certain LSP
on a different LSP. This symptom is called misrouting.
RSVP-TE Hello protocol detects network connectivity defects, but does
not detect MPLS fabric defects. There is a need for an MPLS layer OAM
mechanism in order to detect MPLS-fabric defects. Two mechanisms are
available to fulfill that role: ITU-T Y.1711 and IETF MPLS Ping. From a
functionality point of view (that of detecting MPLS table failures), both
mechanisms are similar. MPLS ping offers a supplementary functionality
that supports debugging. After occurrence of a failure, an operator can use
MPLS ping in order to locate the failure.
ITU-T Y.1711
ITU-T Y.1711 provides a mechanism for path continuity test in order to
detect “path failures.” In this scheme, the Ingress of an LSP periodically
inserts, in the LSP, a specific OAM packet—called Connectivity
Verification (CV)—into the concerned LSP (in-band). The Egress of the
LSP detects a defect on the LSP when it does not receive three consecutive
CV packets for that LSP, at which time it sends a Backward Defect
Indication packet (BDI) to the Ingress of that LSP to notify the Ingress
about the fault. Therefore, the Ingress is the only node that has the ability to
recover from a failure—called the Path Switch LSR (PSL).
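The egress behavior just described — declare a defect after three consecutive missing CV packets and send a BDI back to the ingress — can be sketched as follows. The CV period is nominally one per second in Y.1711; the function below is an illustrative model, not the recommendation's state machine.

```python
# Sketch of ITU-T Y.1711 egress monitoring: a defect is declared when three
# consecutive CV packets are missing, and a BDI is sent to the ingress (the
# PSL), which then switches traffic to the protection path.

def egress_cv_check(cv_seen_per_period, notify_ingress):
    """cv_seen_per_period: booleans, one per CV period. Returns True if a
    defect was declared (and a BDI sent)."""
    misses = 0
    for seen in cv_seen_per_period:
        misses = 0 if seen else misses + 1
        if misses == 3:
            notify_ingress("BDI")   # backward defect indication
            return True
    return False

alarms = []
defect = egress_cv_check([True, True, False, False, False], alarms.append)
print(defect, alarms)  # True ['BDI']
```

Two misses followed by a received CV reset the count, so transient single-packet loss does not raise a defect.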
ITU-T Y.1711 includes a mechanism, whereby, on any node on a path of an
LSP, a layer below the MPLS layer that detects a fault notifies the MPLS
layer. The MPLS layer sends a Forward Defect Indication (FDI) towards
the Egresses of the affected LSPs. As the lower layer defect detection and
[Figure: a failure between nodes 3 and 4 on an LSP traversing nodes 1–5 — the egress LER sends a BDI back to the ingress LER]
MPLS Ping
MPLS Ping is used to detect data-plane failures in MPLS LSPs. This
mechanism is modeled after the ping/traceroute philosophy. It operates
under two modes. The first mode is the fault detection at the data plane,
where MPLS Ping is used for connectivity checks, while the second mode
complements the first one by providing a fault isolation mechanism. In this
mode, Traceroute is used for hop-by-hop fault localization.
MPLS Ping may be used in various ways. The following depicts how it can
be used for LSP path continuity test, which belongs to the first mode. In
this case, MPLS Ping operation is similar to that of ITU-T Y.1711. The
Ingress of an LSP periodically inserts, in the LSP, an Echo packet and
expects to receive a Reply message from the expected (the configured) LSP
egress. A failure is detected when either a no reply is received or a different
egress responds with the inclusion of the corresponding error code.
However, unlike ITU-T Y.1711, MPLS Ping relies on many non-LSP
components, and a fault notification is not a reliable indication of an actual
problem on the LSP. Furthermore, with no alarm suppression mechanisms,
a ping failure is not coordinated with local detection mechanisms. If a link
fails, both the local LSRs and the pinging LSR will detect a problem in an
uncoordinated fashion.
Global recovery
In global recovery (abbreviation: Global), also called end-to-end recovery,
the working path LSP is protected by an end-to-end protection path LSP
(see Figure 18-5). The latter is preestablished and does not depend on use
of either Path-switching or rerouting.
The working path LSP and its corresponding protection path LSP are
disjoint; that is, they have the same end points1 (ingress and egress LERs)
and those are the only network elements that they have in common. In this
case, Global protects against link and node failures on the working path,
except for the ingress (and possibly the egress) LER. However, in certain
cases, Global protects only a segment of the working path LSP. In which
case, the protection path LSP starts at a node downstream from the ingress
LER node, called PSL (Path Switch LSR), and merges back with the
working path LSP at a node upstream from the egress LER (called PML:
Path Merge LSR).
In both cases, upon occurrence of a fault, the PSL or the ingress LER
receives a notification message. It is responsible for switching the traffic
from the working path to the protection path. The time the fault notification
message takes to reach the PSL (or the ingress) is important because it
makes the recovery time unacceptable for certain applications. Local
recovery overcomes this problem.
1. The Egress node of the working path LSP may be different from that of the protection path LSP, if the
destination of the traffic is reachable through another Egress. In this case, the protection path LSP
protects against failures on the working path Egress, as well.
Local recovery
In local recovery, a node traversed by an LSP protects against failures on a
subtending link or on a neighboring node, traversed by that LSP. In other
words, it is the node directly upstream of the failed component that is
responsible for detecting the failures and switching the traffic on an
alternate route (using Protection-switching or rerouting). That node is
called Point of Local Repair (PLR). The local protection path LSP merges
back with the main working path LSP, at a downstream node, called Path
Merge LSR (PML) (see Figure 18-5).
Local recovery achieves better recovery times than Global, since the
notification message does not have to travel upstream to the Path Switch
LSR (PSL) (the PLR is itself the PSL).
[Figure: a working path LSP through nodes 1–5 with possible failure points marked; an end-to-end protection path and local protection paths PP_1, PP_2, PP_3 through nodes 6–9 merge back into the working path]
Figure 18-5: Global (end-to-end) and local protection
Node recovery is required when the downstream node is deemed
unreliable; otherwise, link recovery is sufficient. Therefore, the choice
between these two alternatives is based on the degree of node reliability
and the target level of network availability.
References
RFC 2205, R. Braden et al., “Resource ReSerVation Protocol (RSVP),”
IETF, September 1997, http://www.ietf.org/rfc/rfc2205.txt
RFC 3209, D. Awduche et al., “RSVP-TE: Extensions to RSVP for LSP
Tunnels,” IETF, December 2001, http://ietf.org/rfc/rfc3209.txt
FRR_atlas, A. Atlas et al., “Fast Reroute Extensions to RSVP-TE for
LSP Tunnels,” IETF draft-ietf-mpls-rsvp-lsp-fastreroute-01.txt.
ITU-T Recommendation Y.1711, “OAM Mechanism for MPLS
Networks,” Study Group 13, International Telecommunication Union
Telecommunication Standardization Sector (ITU-T), February 2002.
Informative references
MPLS-ARCH, E. Rosen et al., “Multiprotocol Label Switching
Architecture,” IETF RFC 3031, January 2001, http://ietf.org/rfc/rfc3031.txt
MPLS-RECOV, V. Sharma et al., “Framework for MPLS-based Recovery,”
IETF draft-ietf-mpls-recovery-frmwrk-05.txt, http://search.ietf.org/
internet-drafts/draft-ietf-mpls-recovery-frmwrk-05.txt
NET-SURVIV, K. Owens et al., “Network Survivability Considerations for
Traffic Engineered IP Networks,” IETF draft-owens-te-network-
survivability-03.txt, http://search.ietf.org/internet-drafts/draft-owens-te-
network-survivability-03.txt
MPLS-TE, D. Awduche et al., “Requirements for Traffic Engineering Over
MPLS,” IETF RFC 2702, September 1999, http://www.ietf.org/rfc/rfc2702.txt
INTERNET-TE, D. Awduche et al., “Overview and Principles of Internet
Traffic Engineering,” IETF RFC 3272, May 2002, http://www.ietf.org/rfc/
rfc3272.txt
Chapter 19
Implementing QoS: Achieving
Consistent Application Performance
Ralph Santitoro
[Chapter-opener graphic: the book's protocol-stack view — media-related (audio, video, voice codecs), real-time control (RTP, RTCP, RTSP), session/application control (SIP, H.323), and gateway control (H.248/MGCP, NCS) over QoS, MPLS, AAL1/2, AAL5, packet cable, ATM, FR, Ethernet, DOCSIS, and SONET/TDM — with QoS highlighted]
SONET / TDM
Concepts covered
Mapping DiffServ to Layer 2 QoS mechanisms
Mapping DSCP to and from 802.1p user priorities
Mapping DiffServ to frame relay
Mapping DiffServ to ATM
Mapping DiffServ to PPP classes
Mapping DiffServ to MPLS E-LSPs and L-LSPs
Application performance requirements
Categorizing applications based on end user expectations and
performance objectives
Making QoS simple with network service classes
Introduction
Chapter 10 describes the many mechanisms that are used to provide good
QoS performance for real-time applications. Chapter 10 looked at QoS
from a bottom-up approach. This chapter takes a top-down approach to
implementing end-to-end QoS policies. First, the mapping between
DiffServ (IP QoS) and various Layer 2 QoS mechanisms are discussed.
This is followed by a discussion of the performance requirements and
categorization of applications supported over a converged network. Finally,
an approach to simplify QoS is discussed, which is based on network
service classes that provide common QoS policies for popular real-time
and non-real-time applications with similar QoS performance
requirements.
DSCP                              Maps to 802.1p value
CS7                               7
CS6                               7
EF, CS5                           6
AF41, AF42, AF43, CS4             5
AF31, AF32, AF33, CS3             4
AF21, AF22, AF23, CS2             3
AF11, AF12, AF13, CS1             2
DF, CS0, all undefined DSCPs      0

DSCP                  Discard bit (ATM CLP / frame relay DE)
CS7, CS6              0
EF, CS5               0
AF41, CS4             0
AF42, AF43            1
AF31, CS3             0
AF32, AF33             1
AF21, CS2             0
AF22, AF23            1
AF11, CS1             0
AF12, AF13            1
DF, CS0               1
1. In general, packets marked with the CS7 DSCP should be mapped to rt-VBR. However, critical
protocols that provide constant-rate heartbeats require the lowest loss and delay for optimal
network operation. Such protocol packets should be mapped to CBR.
2. In general, packets marked with the EF DSCP should be mapped to rt-VBR. However, circuit
emulation over IP application packets marked with the EF DSCP should be mapped to CBR.
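The DSCP-to-802.1p mapping above can be applied as a simple lookup, with undefined DSCPs defaulting to user priority 0 as the table specifies. One assumption to flag: the extraction shows the CS6 row without a clear value, so CS6 is assumed here to share priority 7 with CS7.

```python
# DSCP -> 802.1p lookup per the mapping table above. CS6 -> 7 is an
# assumption (its value is ambiguous in the source); undefined DSCPs -> 0.

DSCP_TO_8021P = {
    "CS7": 7, "CS6": 7,
    "EF": 6, "CS5": 6,
    "AF41": 5, "AF42": 5, "AF43": 5, "CS4": 5,
    "AF31": 4, "AF32": 4, "AF33": 4, "CS3": 4,
    "AF21": 3, "AF22": 3, "AF23": 3, "CS2": 3,
    "AF11": 2, "AF12": 2, "AF13": 2, "CS1": 2,
    "DF": 0, "CS0": 0,
}

def dscp_to_8021p(dscp):
    return DSCP_TO_8021P.get(dscp, 0)  # all undefined DSCPs map to 0

print(dscp_to_8021p("EF"))    # 6
print(dscp_to_8021p("AF22"))  # 3
print(dscp_to_8021p("XYZ"))   # 0
```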
The short sequence number format (four classes) is used for connections
operating at less than 1 Mbps. Popular low-speed WAN connections in use
today are 56 kbps, 64 kbps, 128 kbps, 256 kbps and 384 kbps. Additionally,
popular DSL services use PPP over Ethernet (PPPoE) over DSL
connections, which have symmetrical or asymmetrical connection speeds
similar to the aforementioned low-speed WAN connections.
The short sequence number format only allows a subset of the DiffServ
PHBs to be supported. Table 19-5 provides an example mapping. Other
mapping arrangements are possible. Typically, the EF-marked packets are
marked with the PPP class number corresponding to the PPP class that
provides the lowest forwarding delay.
mapping of the EXP bits to PSC can be either explicitly signaled during
label setup or statically configured. Since there are many ways to configure
the EXP bits to support PSCs, a flexible mapping approach is required.
Table 19-6 provides an example DSCP to EXP mapping that maximizes
the number of DiffServ PSCs that can be supported.
Table 19-8: Example DSCP to EXP mapping for L-LSPs using labels 100-110
E-LSPs support many DiffServ PSCs per E-LSP compared to L-LSPs that
support only one DiffServ PSC per L-LSP. Fewer LSPs simplify network
operations, administrations, management and provisioning (OAM&P),
resulting in lower cost of operations. This is why E-LSPs are
predominantly used when implementing QoS in MPLS networks.
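Because the 3-bit EXP field gives eight code points, an E-LSP can carry up to eight PHB scheduling classes on a single LSP. The mapping below is an illustrative packing of common DiffServ PSCs into those eight values — it is not a reproduction of the book's Table 19-6, whose body is not shown here.

```python
# Illustrative DSCP-class -> EXP mapping for an E-LSP: the 3-bit EXP field
# (values 0-7) selects the PHB, so up to eight PSCs fit on one LSP.
# This specific assignment is an assumption, not the book's Table 19-6.

EXP_BITS = 3
DSCP_TO_EXP = {
    "CS7": 7, "CS6": 6,
    "EF": 5,
    "AF4": 4, "AF3": 3, "AF2": 2, "AF1": 1,
    "DF": 0,
}

# Every EXP value must fit in the 3-bit field.
assert all(0 <= exp < 2 ** EXP_BITS for exp in DSCP_TO_EXP.values())
print(DSCP_TO_EXP["EF"])  # 5
```

An L-LSP, by contrast, would carry only one of these classes per label, which is why many more L-LSPs than E-LSPs are needed for the same set of PSCs.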
Performance Dimensions

Example Applications             Bandwidth Needs   Sensitivity to:
                                                   Delay   Jitter   Loss
IP Telephony (VoIP)              Low               High    High     Med-High
Interactive Video Conferencing   Med-High          High    High     High-Med
Streaming Video on Demand        Med-High          Med     Med      Med
Streaming Audio (Webcasts)       Low               Med     Low      Med
Client / Server Transactions     Med               Med     Low      High
Email                            Low               Low     Low      High
File transfer                    Med-High          Low     Low      High
Categorizing Applications
Networked applications can be categorized based on end-user expectations
or application performance requirements. Some applications are between
people while other applications are between a person and a networked host;
for example, a PC (user) and a web server. Finally, some applications are
between networking devices (for example, router to router).
Applications can be divided into four different traffic categories: namely,
Network Control, Interactive, Responsive and Timely. Refer to Table 19-
10. The table includes some representative applications in the different
categories.
Interactive Applications
Some applications are “interactive”; whereby, two or more people actively
participate. The participants expect the networked application to respond in
real time. In this context, real time means that there is minimal delay
(latency) and delay variation (jitter) between the sender and receiver. Some
interactive applications, such as a telephone call, have operated in real time
over the telephone companies' circuit switched networks for decades. The
QoS expectations for real-time voice applications have been set and,
therefore, must also be achieved as voice applications are migrating from
being circuit-based to being packet-based (for example, VoIP).
Other interactive applications include video conferencing and interactive
gaming. Since the interactive applications operate in real time, packet loss
must also be minimized. Imagine a telephone call where whole or partial
words regularly get lost during the conversation. This level of QoS
performance would not only be unsatisfactory but would make the
application (telephone call) not very usable.
Interactive applications typically use the User Datagram Protocol (UDP)
and, hence, cannot retransmit lost or dropped packets as Transmission
Control Protocol (TCP)-based applications can. However, lost packet
retransmission would not be beneficial because interactive applications are
time-based. For example, if a voice packet was lost, it doesn't make sense
for the sender to retransmit it because the conversation has progressed in
time and the lost packet might be from part of the conversation that had
already passed in time.
Responsive Applications
Some applications are between a person and a networked host or
application. End users require these applications to be “responsive”, so a
request sent to the networked host requires a relatively quick response back
to the sender. These applications are sometimes referred to as being “near
real-time” and require relatively low packet delay, jitter and loss. However,
QoS performance requirements for responsive applications are not as
stringent as for the interactive (real-time) applications. This category
includes streaming media and client/server transaction-oriented
applications.
Streaming media applications (for example, movies on demand or
webcasts) require the network to be responsive when they are initiated so
the user doesn't wait too long before the media begins playing. These
applications also require the network to be responsive for certain types of
signaling. For example, with movies on demand, when one changes
channels or “forwards”, “rewinds” or “pauses” the media, one expects the
application to react similarly to the response time of their 'standalone' video
player controls.
Web-based applications involve a user selecting a hyperlink to jump to a
new page or submit information; for example, place an order or submit a
request. These applications also require the network to be responsive, such
that once the hyperlink is selected, a response (for example, a new page
begins loading) occurs typically within one to two seconds. With
broadband Internet access connections, this type of performance is often
achieved over a best-effort network, albeit somewhat inconsistently. Other
Timely Applications
Some applications between a person and networked host or application
require 'timely' and reliable delivery of the information in 'minutes' instead
of 'seconds'. Such applications include e-mail and file transfer. The relative
importance of these applications is based on their business priorities.
These applications require that the packets arrive within a bounded delay.
For example, if an e-mail takes a few minutes to arrive at its destination,
this is acceptable. However, in a business environment, if an e-mail took
ten minutes to arrive at its destination, this may be unacceptable. The same
bounded delay applies to file transfers. Once a file transfer is initiated,
delay and jitter are less critical because file transfers often take many
minutes to complete. Note that these timely applications use TCP-based
transport and, therefore, packet loss is managed by TCP, which retransmits
any lost packets, resulting in no packet loss.
Timely applications expect the network to deliver packets within a bounded
amount of delay. Jitter has a negligible effect on these types of applications,
and packet loss is reduced to zero due to TCP's loss recovery mechanisms.
Network Control Applications
Network control traffic is given minimal loss and delay because the
network must be operating properly in order for it to provide proper QoS
performance for the end-user applications.
Network control applications require a relatively low amount of delay.
Jitter has a negligible effect on these types of applications and packet loss
must be minimized since some of these applications are not transported via
TCP and, hence, do not have packet loss recovery mechanisms.
Performance Characteristics

NSC | Target Applications | Delay Tolerance | Jitter Tolerance | Loss Tolerance | Traffic Profile
Critical | Critical heartbeats between nodes | Very Low | N/A | Low | Very small packets
Network | COPS, RSVP; DNS, DHCP, BootP; high-priority OAM | Low | N/A | Very Low | Variable-sized packets
Premium | VoIP (G.711, G.729 and other codecs); Lawful Intercept; T.38 Fax over IP; Circuit Emulation over IP | — | — | — | Typically, fixed-sized packets
Platinum (Timely) | Billing record transfer; non-critical OAM&P (SNMP, TFTP) | High | N/A | Low | Variable-sized packets
Standard | — | — | — | — | —
Nodal QoS Mechanism: Premium NSC default settings
Classifier: For trusted interfaces, all packets marked with Expedited
Forwarding (EF) DSCP or Class Selector (CS) 5 DSCP are placed into the
Premium NSC.
Marker: Once classified from untrusted interfaces, mark voice media
packets with EF DSCP and voice signaling packets with CS 5 DSCP.
Policer: Meter packets to the configured rate and committed burst size
(for example, CIR and CBS). Drop packets exceeding the configured rate
or burst size.
Link Layer QoS: For Ethernet interfaces, mark all packets with 802.1p
user priority 6. For ATM interfaces, use ATM service category rt-VBR and
mark all packets with CLP=0. For Frame Relay interfaces, mark all
packets with DE=0. For multiclass PPP interfaces, mark all packets with
PPP Class Number 3.
Scheduler: Use a priority scheduler.
Queue Management: Use tail drop queue management. Disable Active Queue
Management (AQM) techniques such as WRED.
Shaping: Disable shaping.
Table 19-12: Premium NSC default nodal settings
Table 19-13 provides a summary of traffic conditioning per NSC.
NSC | DiffServ PHB Group | Standard DSCPs per NSC | Policing action for out-of-profile traffic | Scheduler type | Queue mgmt.
Critical | CS | CS7 | Drop | Priority 1 | Tail drop
Network | CS | CS6 | Drop | Weighted | Tail drop
Premium | EF | EF, CS5 | Drop | Priority 2 | Tail drop
Platinum | AF4 | AF41, AF42, AF43, CS4 | Remark | Weighted | AQM
Gold | AF3 | AF31, AF32, AF33, CS3 | Remark | Weighted | AQM
Silver | AF2 | AF21, AF22, AF23, CS2 | Remark | Weighted | AQM
Bronze | AF1 | AF11, AF12, AF13, CS1 | Remark | Weighted | AQM
Standard | DF | DF, CS0 | None (managed by AQM) | Weighted | AQM
3. The Premium NSC traffic uses a higher PPP class number because it is scheduled over a connection
before Critical and Network NSC traffic to minimize any jitter that would be introduced by the
network control traffic over low bandwidth (<1 Mbps) connections.
IP Telephony Flow Type | Application-to-Application | Network Service Class | Default DSCP | Default 802.1p User Priority
Voice Media (bearer) | Gateway-Gateway | Premium | EF | 6
Voice Media (bearer) | Terminal-Terminal | Premium | EF | 6
Voice Media (bearer) | Terminal-Gateway | Premium | EF | 6
Voice Signaling (Control) | Terminal-Terminal | Premium | CS5 | 6
Voice Signaling (Control) | Terminal-Gateway | Premium | CS5 | 6
Voice Signaling (Control) | Gateway-Gateway Controller | Premium | CS5 | 6
Voice Signaling (Control) | Gateway Controller-Gateway Controller | Premium | CS5 | 6
Fax (T.38) | Gateway-Gateway | Premium | CS5 | 6
Table 19-15: Default markings for telephony gateways, terminals, and gateway controllers
For example, the G.729 codec produces 8 kbps of "raw voice" bandwidth,
which is then encapsulated into an IP payload that results in 24 kbps of
IP bandwidth.
Voice compression is typically not used over high bandwidth, Ethernet
connections. Uncompressed voice is encoded using the ITU G.711 codec
with the voice samples encapsulated into an IP packet. This results in 80
kbps of IP bandwidth using 20 ms voice samples. This is quite small
compared to 10 Mbps or 100 Mbps of Ethernet bandwidth.
In addition to the IP bandwidth, the Layer 2 protocols (for example,
Ethernet, PPP, frame relay or ATM) used to transport the IP packet must be
included when calculating the total bandwidth.
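The bandwidth figures above can be reproduced with simple arithmetic. The sketch below is illustrative Python of our own (not from this book): the 40-byte RTP/UDP/IP overhead is the standard header total, while the 38 Ethernet framing bytes used in the last call (header, FCS, preamble, and inter-frame gap) are one common accounting convention.

```python
def voip_bandwidth_bps(codec_bps, sample_ms, overhead_bytes=40):
    """IP (or link) bandwidth of one voice stream.

    overhead_bytes defaults to IP (20) + UDP (8) + RTP (12) headers;
    add Layer 2 framing bytes on top for total link bandwidth.
    """
    payload_bytes = codec_bps / 8 * sample_ms / 1000  # voice bytes per packet
    packets_per_sec = 1000 / sample_ms
    return (payload_bytes + overhead_bytes) * 8 * packets_per_sec

# G.711 (64 kbps raw) with 20 ms samples -> 80 kbps of IP bandwidth
print(voip_bandwidth_bps(64000, 20))          # 80000.0
# G.729 (8 kbps raw) with 20 ms samples -> 24 kbps of IP bandwidth
print(voip_bandwidth_bps(8000, 20))           # 24000.0
# G.711 over Ethernet, counting 38 framing bytes per frame
print(voip_bandwidth_bps(64000, 20, 40 + 38)) # 95200.0
```

The same function covers both examples in the text: the G.729 8-to-24 kbps case and the G.711 80 kbps case.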
Note that as the 200 ms delay budget is exceeded, most users tend to
perceive the delay, resulting in dissatisfaction with voice quality.
Every time a VoIP packet
passes through a device or network connection, delay is introduced. A
significant amount of delay can be introduced over low-bandwidth
connections.
Note that with PPP fragmentation, the packets are only fragmented over the
PPP connection. With IP fragmentation, packets are fragmented from
source to destination resulting in reduced application performance. For
PPP fragmentation, the fragment size of the data packet is selected based
on the maximum size of one VoIP packet. Depending upon the voice codec
used and the number of voice samples per voice payload, the PPP
fragmentation size will vary. Refer to Table 19-16.
minimize the delay and jitter that other traffic can introduce to the VoIP
packets.
Packet Reordering
In some cases there may be multiple paths for a VoIP packet to take when
traveling from its source to its destination. If all VoIP packets do not take
the same path, then packets could arrive out of order. This can cause voice
quality issues, even though packet reordering often has little or no adverse
effect on data application quality, since reordered or lost data packets can
be retransmitted via the TCP protocol.
If two locations connect via two frame relay PVCs, one must ensure that all
VoIP packets for a particular call traverse the same PVC. The routers can
be configured to direct the voice packets from the same source/destination
IP address to traverse the same PVC. Another approach is to configure the
router to send all voice traffic over one of the PVCs.
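A deterministic hash of the source/destination address pair is one way a router can pin all of a call's packets to a single PVC. The sketch below is a hypothetical illustration (the function and PVC names are ours, not a router configuration):

```python
import zlib

def select_pvc(src_ip, dst_ip, pvcs):
    """Pick a PVC for a flow; the same src/dst pair always maps to the
    same PVC, so every packet of a call takes the same path."""
    key = f"{src_ip}->{dst_ip}".encode()
    return pvcs[zlib.crc32(key) % len(pvcs)]

pvcs = ["pvc-1", "pvc-2"]
choice = select_pvc("10.0.0.1", "10.0.1.1", pvcs)
# Repeated lookups for the same flow return the same PVC.
```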
References
RFC 2597, J. Heinanen, et al., "Assured Forwarding PHB Group," IETF,
http://www.ietf.org/rfc/rfc2597.txt
ATM Forum AF-TM-0121.000 Version 4.1, "Traffic Management
Specification," ftp://ftp.atmforum.com/pub/approved-specs/af-tm-0121.000.pdf
RFC 2474, K. Nichols, et al., "Definition of the Differentiated Services
Field (DS Field) in the IPv4 and IPv6 Headers," IETF,
http://www.ietf.org/rfc/rfc2474.txt
RFC 2475, S. Blake, et al., "An Architecture for Differentiated Services,"
IETF, http://www.ietf.org/rfc/rfc2475.txt
RFC 3270, F. Le Faucheur, et al., "Multi-Protocol Label Switching
(MPLS) Support of Differentiated Services," IETF, May 2002,
http://www.ietf.org/rfc/rfc3270.txt
RFC 3246, B. Davie, et al., "An Expedited Forwarding PHB," IETF,
http://www.ietf.org/rfc/rfc3246.txt
IEEE 802.1Q, "Virtual Bridged Local Area Networks,"
http://standards.ieee.org/getieee802/download/802.1Q-2003.pdf
R. Santitoro, "Introduction to Quality of Service (QoS)," April 2003,
http://www.nortelnetworks.com/products/02/bstk/switches/bps/collateral/56058.25_022403.pdf
RFC 1990, K. Sklower, et al., "The PPP Multilink Protocol (MP)," IETF,
August 1996, http://www.ietf.org/rfc/rfc1990.txt
RFC 2686, C. Bormann, "The Multi-Class Extension to Multi-Link PPP,"
IETF, September 1999, http://www.ietf.org/rfc/rfc2686.txt
RFC 1973, W. Simpson, "PPP in Frame Relay," IETF, June 1996,
http://www.ietf.org/rfc/rfc1973.txt
FRF.12, "Frame Relay Fragmentation Implementation Agreement,"
December 1997, http://www.mplsforum.org/frame/Approved/FRF.12/frf12.pdf
Chapter 20
Achieving QoE: Engineering Network
Performance
François Blouin
[Figure content: view of real-time applications from a QoS perspective —
audio, video and voice codecs carried over RTP with RTCP; session and
gateway control via SIP, H.323, RTSP, and H.248 / MGCP / NCS; transported
over MPLS, ATM (AAL1/2, AAL5), Frame Relay, Ethernet, and cable (DOCSIS),
with SONET / TDM at the bottom of the stack and packet resiliency as a
cross-cutting concern.]
Concepts covered
Why network engineering is important
E-Model
Real-time applications engineering & planning process
Hypothetical reference connection
Echo control
Budgeting delay & jitter
Silence suppression considerations
Router schedulers and buffer sizing
Introduction
Historically, TDM voice networks were engineered to ITU-T quality
standards; that is, the end-to-end TDM impairment budget was well
understood, and each operator's network was allowed a distinct portion of
that budget. Element requirements were well-defined and measurable. Where
packet voice standards exist, they are rudimentary and incomplete, and
require further development to meet the needs of packet-based application
planning and engineering.
Packet networks are being deployed as replacements for TDM for
switching voice and multimedia traffic. Packet transmission changes the
impairment budget, and network design has to consider additional
impairments such as delay, distortion, and jitter. The additional packet
network impairments need to be characterized and modeled to accurately
predict the application performance and define workable operating
margins.
Some network impairments are unavoidable: propagation delay (physics),
packetization delay, and legacy equipment. However, many can be
engineered to achieve predictable, acceptable voice quality through careful
control of the remaining impairment margin. Network planning and
engineering provide guidance on the correct choices for each parameter
including optimal packet size, jitter, total end-to-end delay, loss plan, echo
control, choice of codec, link speed, buffer dimensioning and so on.
Engineering is complicated by evolutionary migration to packet networks,
which creates “islands” of packet transmission in the global multioperator
TDM network. The large number of operators offering voice services and
the lack of packet interface standards result in the use of TDM to patch
together different packet domains. Each conversion between TDM and
packet networks adds significant impairment to the connection. Network
planning should be done to avoid TDM hops between packet islands. In
this chapter, potential issues related to real-time voice over packet and data
applications will be described along with mitigations and best practice
engineering guidelines.
This section describes in more detail the method and process for
engineering a network to deliver acceptable end-user QoE, also referred as
QoE engineering. There is a distinction between QoE engineering and
traffic engineering. Traffic engineering is the prevailing technique for
mapping traffic flows to ensure optimally-utilized bandwidth and prevent
congestion build up. Traffic engineering is one of the necessary steps of
QoE engineering, but it is not sufficient. To meet specific service quality
targets and user QoE, traffic engineering needs to be supplemented by
additional considerations highlighted in the QoE engineering process. In
order to deliver acceptable service quality and even differentiated services,
QoE should be part of the engineering methodology process; hence, a top-
down approach is proposed as an effective technique to deliver enhanced
customer value while performing network engineering. A four-step QoE
engineering process is shown in Figure 20-2 and will be discussed in the
remaining sections of this chapter for both voice and data services.
Conversational Voice

Conversational voice (CBR and VBR) performance targets:
- R-factor: ∆R (PSTN − packet) < 3R [1]
- Delay: < 150 ms
- Distortion: Ie < 3R
- Path interruptions due to failure: frequent interruptions of 80 ms
affect speech intelligibility; infrequent interruptions of 3 sec are
perceived as a call drop.

1. R: Model transmission rating. R is the predicted output quality index of the E-model (ITU-T G.107). See
"Chapter 3 Voice Quality" for details.
[Figure content: four classes of voice QoE impairments mapped onto the
E-model "R" scale (R 90-100: users very satisfied; 80-90: satisfied;
70-80: some users dissatisfied; 60-70: many users dissatisfied; 50-60:
nearly all users dissatisfied; delay axis 0-300 ms):
1. Speech coding distortion — introduced by the codec and any
potential transcoding.
2. Talker echo — arising from the 2-wire/4-wire hybrid, loop loss, and
loss pads in the echo path; echo becomes audible when one-way delay
exceeds about 20 ms.
3. Late/lost packets — packet loss is caused by buffer overflow within
the network; in congestion situations, packets are lost and jitter is
uncontrolled, leading to late packets.
4. Delays — (a) queuing and jitter, where large buffer sizes add to
overall delay; (b) network delay, including propagation due to distance
and serialization delay on low-speed links; (c) packetization delay,
including intrinsic speech sample accumulation delay plus DSP processing
overhead.]
Figure 20-3: Summary of the voice QoE impairments and the impact on the
E-Model “R”
[Figure content: loss plan for a 2-wire/4-wire analog reference
connection (Side A SLR = +11 dB, Side B RLR = −3 dB, 6 dB loss pads, PSTN
echo path through the hybrid), together with E-model R versus one-way
delay curves for TELR values stepping from 69 dB down to 29 dB.
Decreasing TELR lowers R at any given delay: an analog connection that
employs only 6 dB loss pads for echo control degrades far faster with
delay than one employing echo cancellers (ECANs, ERL = 55 dB).]
HRXs
It has been common practice for carriers to use HRXs to determine budget
allocation standards for nodal roles within their networks, such as
transmission, switching and access delays, loudness ratings, echo signal
path performance and quantization distortion. The connection is
hypothetical in the sense that it takes a sensible stab at the distances and
number and type of equipment involved, usually lumping together common
parts, rather than being a model of a real network connection.
[Figure content: example hypothetical reference connections through media
gateways (MG) with echo cancellers (EC) and chains of routers over
various distances.]
[Figure content: nodal QoS datapath — incoming flows pass through a
classifier, marker, and meter/policer (with a drop or re-mark decision),
then through a dropper into per-class queues served by a scheduler and
shaper onto the outgoing flows.]
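The meter/policer stage in this datapath is commonly implemented as a token bucket. The following sketch is an illustrative Python model of our own (not a Nortel implementation): packets that exceed the configured committed information rate (CIR) and committed burst size (CBS) are declared out of profile.

```python
class TokenBucketPolicer:
    """Single-rate token bucket: conforms packets up to CIR with bursts
    up to CBS; anything beyond is out of profile (drop or re-mark)."""

    def __init__(self, cir_bps, cbs_bytes):
        self.rate = cir_bps / 8.0   # token refill rate, bytes/second
        self.cbs = cbs_bytes        # bucket depth, bytes
        self.tokens = cbs_bytes     # bucket starts full
        self.last = 0.0             # timestamp of last packet

    def conform(self, size_bytes, now):
        # Refill tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.cbs, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True   # in profile: forward
        return False      # out of profile: drop or re-mark

# CIR = 80 kbps, CBS = 400 bytes (two 200-byte voice packets of burst)
p = TokenBucketPolicer(cir_bps=80000, cbs_bytes=400)
```

A burst of two 200-byte packets conforms, a third back-to-back packet does not, and after 20 ms enough tokens accumulate to pass the next packet.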
2. Hard QoE means the service quality is guaranteed at all times, under any conditions and any load and traffic
patterns; it is an absolute limit.

Traffic Engineering/QoS Mechanisms | Hard QoE | Soft QoE
Traffic classification | √ | √
Traffic buffering | √ | √
Traffic scheduling | √ | √
BW reservation/dynamic provisioning | √ | —
Traffic demands
One of the first steps in traffic engineering is to identify traffic demands
and traffic sources in terms of distribution, characteristics and aggregate
volume. Traffic demands can be classified in at least two categories:
Constant Bit Rate (CBR) and Variable Bit Rate (VBR). Once traffic
demands have been established, network resources (buffer, scheduler share,
link size) can be allocated. The following sections will describe voice,
video and data traffic source demands.
[Figure content: North American TDM multiplexing hierarchy — 24 DS0s per
DS1, 28 DS1s per DS3, 3 DS3s per OC-3. Maximum number of voiceband
sessions/channels on an OC-3 = 24 × 28 × 3 = 2016 DS0s;
2016 × 64 kbit/s = 129 Mbit/s.]
[Figure content: circuit-to-packet conversion — n DS-1s from the TDM
network carried over an OC-3c link into the packet network.]
[Figure content: average bandwidth per call (0 to 30 on the y-axis)
versus number of sources (1, 4, 24, 32, 48, 64, 256) for G.729 10 ms
calls with and without silence suppression (VAD), CU = 5 ms, compared
against the average link bandwidth.]
Figure 20-12: Comparison of CBR and VBR voice traffic demand per call for
G.729/AAL-2 codec with/without silence suppression
The columns in Figure 20-12 indicate the average traffic, while the thin
bars indicate the peak bandwidth generated per voice call for the silence
suppression-enabled VBR sources. This graph highlights the fact that
when only a few silence suppression-enabled sources are multiplexed, the
peak traffic exceeds the CBR sources; therefore, it is not an effective
solution. (Talker model assumes average talkspurt of 0.352 sec, silence gap
of 0.650 sec).
One of the most important elements affecting silence suppression-enabled
voice sources is the voice activity level. Bandwidth required to support
voice calls with silence suppression depends primarily on the voice activity
level; that is, the ratio of talkspurt/(talkspurt + silence) and the mix of voice
calls and voiceband data. The talkspurt/silence distribution types of the
several talker models reported in the literature do not exhibit
significant differences in the voice traffic profiles generated at the
aggregate level, and are thus less critical than the effective voice
activity level. The voice activity level varies depending on the talker model used
and the type of service. It generally varies from 35% to 55% for
conversational voice and up to 88% for scripted speech. No single talker
model fits all voice-based applications and services. The voice activity
level assumption is key to effective silence suppression engineering and
design, but it is not fully understood; further characterization is required.
At this time, the 55% activity level is a conservative value for engineering.
Figure 20-13 shows the maximum number of voiceband sessions [1] for
North American TDM and SONET-based capacity on an OC-3 link based
upon the Central Limit Theorem (CLT). This engineering graph can serve
for capacity planning. A reference line has been set to 2016, which
indicates the number of voiceband sessions carried over a TDM
infrastructure; therefore, all the points above it indicate the operating
conditions where silence suppression would offer equivalent or superior
capacity.
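The CLT-based sizing behind this kind of engineering graph can be sketched as follows. This is an illustrative Python sketch of our own: the per-call peak rate, the 55% activity level, and the z = 3 safety factor are assumptions for the example, not the parameters used to produce Figure 20-13.

```python
import math

def max_voiceband_sessions(link_bps, call_peak_bps, activity, z=3.0):
    """Largest N calls whose CLT-estimated aggregate demand fits the link.

    Each call is modeled as on (at call_peak_bps) with probability
    `activity`; the aggregate is approximated as Normal with
    mean = N*p*r and std = r*sqrt(N*p*(1-p)); z sets the overflow margin.
    """
    n = 0
    while True:
        mean = (n + 1) * activity * call_peak_bps
        std = call_peak_bps * math.sqrt((n + 1) * activity * (1 - activity))
        if mean + z * std > link_bps:
            return n
        n += 1

# 142 Mbit/s of usable OC-3 payload, 80 kbps peak per call, 55% activity
n = max_voiceband_sessions(142e6, 80000, 0.55)
```

Under these assumptions the silence-suppressed capacity comfortably exceeds the 2016-session TDM reference line, consistent with the shape of the figure.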
[Figure content: maximum voiceband sessions (y-axis, approximately 1400
to 2800) versus voice activity (VAD) level (0.45 to 0.60) with increasing
voiceband data contribution; the 100% voice curve corresponds to
142 Mbit/s.]
[1]: Voiceband sessions include voice calls and voiceband data.
142 Mbit/s is derived from OC-3 minus SONET overhead minus 5% bandwidth for call control/signaling.
129 Mbit/s is derived from 2016 channelized DS0s at 64 kbit/s each.
Figure 20-13: Maximum number of voiceband sessions [1] for North America
TDM and SONET-based OC-3 capacity as a function of the
voice activity level
[Table content, partially recoverable: Web-browsing and e-commerce short
flows — inline object size: mean 7.7 Kbytes, stdev 125 Kbytes, Lognormal
distribution; number of inline objects: min 1, max 10, mean 5, uniform
distribution; a further size parameter with stdev 25 Kbytes and a Weibull
distribution. Telnet — packet size: mean 40 bytes, deterministic
distribution; session parameter: mean 300 sec, Extreme distribution
(a=80, b=5.7).]
Table 20-5: Typical data application traffic demands based upon industry
published papers and internal Nortel research
Figure 20-14 shows the aggregate data user traffic demand requirement for
various QoE levels. As expected, the bandwidth requirement varies as a
function of the number of users as well as with the level of QoE provided.
This engineering table has been derived from network simulation based
upon a predetermined data user traffic profile (see Table 20-5) and, as one
of many typical traffic profiles, is provided as a guideline only. There is no
“one size fits all” traffic profile that would suit all Enterprise type
businesses and customers. So it is expected that some characterization
would be required to derive accurate traffic demand matrices for specific
user profiles. In general, it was found that the average individual data user
traffic would vary from 20 kb/s up to 130 kb/s, depending on the number of
sources, aggregation level and QoE targets. The level of aggregation is
highly beneficial due to the statistical multiplexing. For example, for a ten-
user network, it would require about 130 kb/s for each user to deliver
optimal QoE, while it would only need 32 kb/s for a 2000 user network.
Example 5. Determine the Enterprise WAN access link bandwidth
requirement for 100 users. A 100
user network would require 6.3 Mbits/s link bandwidth to deliver
optimal QoE, while it would require about half (2.8 Mbits/s) to
provide acceptable QoE.
[Figure content: aggregate bandwidth (Mbit/s, 0 to 50+ on the y-axis)
versus number of users (10 to 10000, log scale); a 100-user network
requires 6.3 Mbit/s for optimal application QoE (green curve), while the
other QoE levels (yellow and pink curves) can be satisfied with lower
bandwidth.]

Users | Bandwidth (Mbit/s): Optimal QoE | Acceptable QoE | Lower QoE
10 | 1.3 | 0.5 | 0.3
50 | 3.6 | 1.6 | 1.2
100 | 6.3 | 2.8 | 2.0
200 | 9.5 | 5.2 | 4.0
500 | 20.0 | 13.0 | 10.0
1000 | 35.0 | 25.0 | 20.0
2000 | 64.0 | 50.0 | 40.0
Figure 20-14: Data user traffic demand requirements for various QoE levels
Figure 20-14 is based upon Table 20-5 user traffic profiles, including TCP
short and long flows. Long flow traffic represents 80% of all traffic
volume. Heavy Peer-to-Peer (P2P) traffic will most likely skew these
engineering rules and QoE work is ongoing to study the impact of P2P on
provisioning and QoE.
Delay budgeting
The delay margin for voice can be derived by either using the E-model 3R
delta rule or by using a hard limit maximum one-way delay as a threshold
point that meets the desired QoE. The absolute delay threshold technique
would most likely be appropriate for UDP-based data services. It should be
pointed out that the delay budget should be done from an end-to-end
perspective; that is, that no one gets to use it all. From a pragmatic
perspective, that means that the impairment budget should be allocated
across all of the elements of a connection (see the next example).
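Allocating the impairment budget across elements is simple bookkeeping; the sketch below is illustrative Python of our own, and the element names and millisecond values are hypothetical, not a recommended allocation.

```python
def delay_budget_check(contributions_ms, budget_ms=150.0):
    """Sum per-element one-way delay contributions and report the
    remaining margin against the end-to-end budget."""
    total = sum(contributions_ms.values())
    return total, budget_ms - total

# Hypothetical allocation for a POTS-to-POTS call through a packet core
elements = {
    "pstn_access_and_switching": 10,  # EO/TO switching, each side
    "media_gateways": 40,             # encode/decode + packetization
    "propagation": 10,                # distance-dependent
    "core_queuing_and_jitter": 5,     # routers + jitter buffer wait
}
total, margin = delay_budget_check(elements)
```

If the sum exceeds the budget, one of the controllable contributions (codec, packetization interval, jitter buffer) has to give.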
Example 6. Determination of delay margin using the 3R rule for a
POTS-to-POTS call going through a packet core is shown on Figure
20-15. Delay impairment sources include PSTN switching offices
(End Office and Tandem Office) plus propagation delay, circuit-to-
packet media gateway (PVG) for voice encoding and decoding, and
IP core routers switching and queuing. The delay margin is
computed by finding the intersection point on the E-model where
the –3R line crosses the reference connection.
[Figure content: POTS-to-POTS reference connection — POTS access, End
Office (EO) and Tandem Office (TO) on each TDM side, circuit-to-packet
media gateways (PVG) at the edges of a 2000 km packet core; the E-model
chart plots R (50 to 70) against average one-way delay (0 to 300 ms),
showing propagation, IP, and MG contributions and a delay/jitter margin
of 95 ms for POTS-to-POTS performance.]
[Figure content: residential reference connection — home MG plus modem
and client, DSL access, LERs, and the regional/core network between
Provider A and Provider B. Note: home MG delays include DSL modem frame
interleaving correction delay plus voice encoding/decoding.]
Distortion budgeting
Conversational voice services distortion budgeting includes packet loss,
echo control and codec distortion. For a packet network replacing a
traditional TDM infrastructure, a 3R distortion margin is recommended to
produce unnoticeable degradation. Other margins can also be used,
depending on the business model and service quality expectations.
Although the 3R margin produces equivalent quality, the margin
allocation is very small and offers very limited options and flexibility
for the controllable parameters. The codec should solely be G.711, as all
other codecs have an equipment impairment (Ie) greater than 3R.
Transcoding should be eliminated when a call traverses multiple service
providers' networks and/or packet islands. Packet-to-packet handoff is
required to prevent additional voice decoding/encoding stages. If TDM
handoff is used between packet islands, the impairments budget is thus
fractioned (see Figure 20-18).
1:Response time targets are derived from ITU-G.1010 as well as from subjective
studies conducted at Nortel.
2:Recommended packet loss targets are based on modeling and simulation studies
conducted at Nortel.
Voice buffers should be engineered for a packet loss ratio target of
10^-6 or less. Another rule for voice buffer provisioning is to use 1/10
to 1/20 of a packet buffer per G.711 10 ms voice call (see example below).
Example 8. Calculate the buffer size requirements for a scheduler
share of 20% on an interface rate of OC-3 while maintaining a
maximum of 2 ms queueing delay.
Buffer size = max queueing delay × scheduler share × interface rate / 8
= 2 ms × 20% × 155 Mbit/s / 8 = 7750 bytes.
Alternatively, the buffer size can be estimated using the one-tenth
rule. If a G.711 20 ms voice call requires 100 kbit/s, about 300
voice calls can be supported out of a scheduler provisioned for
31 Mbits/s (20% of an OC-3).
Buffer size ~ 1/10 x 300 voice calls ~ 30 voice packet buffers.
Therefore, approximately 30 voice packet buffers will be required
to prevent packet loss on a strict priority scheduler.
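Example 8's arithmetic, and the one-tenth rule, can be expressed directly. The sketch below is illustrative Python of our own (the function names are not from this book).

```python
def queue_buffer_bytes(max_delay_s, scheduler_share, link_bps):
    """Buffer (bytes) that bounds queueing delay for a scheduler share."""
    return max_delay_s * scheduler_share * link_bps / 8

def tenth_rule_buffers(scheduler_bps, per_call_bps, fraction=0.1):
    """One-tenth rule: buffers ~ 1/10 of the supported call count."""
    return scheduler_bps / per_call_bps * fraction

# Example 8: 2 ms max delay, 20% share of an OC-3 (155 Mbit/s)
b = queue_buffer_bytes(0.002, 0.20, 155e6)   # 7750 bytes

# One-tenth rule: 31 Mbit/s share, ~100 kbit/s per G.711 20 ms call
n = tenth_rule_buffers(31e6, 100e3)          # ~30 voice packet buffers
```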
[Figure content: drop probability (1×10^-6 to 9×10^-4, M/M/1/K queueing
discipline) versus offered load (0 to 1.0) for 0.1 ms, 1 ms, and 2 ms
buffer sizes.]
Figure 20-19: Voice buffer provisioning versus offered load and drop
probability
In a well-engineered network, where load is balanced and does not exceed
the provisioned rate, buffers will be lightly utilized so the queuing delays
will be close to 0. Note that the recommended max transmit queue delay/
size is 2 ms.
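The drop probabilities of the M/M/1/K model used in Figure 20-19 have a standard closed form; a sketch, assuming K is the system capacity in packets:

```python
def mm1k_drop_probability(rho, k):
    """Blocking (drop) probability of an M/M/1/K queue.

    rho = offered load (arrival rate / service rate);
    k   = system capacity in packets (queue + server).
    """
    if rho == 1.0:
        return 1.0 / (k + 1)
    return (1.0 - rho) * rho**k / (1.0 - rho**(k + 1))

# At 50% load a 20-packet buffer drops fewer than one packet in a
# million; the same buffer at 90% load drops orders of magnitude more.
low = mm1k_drop_probability(0.5, 20)
high = mm1k_drop_probability(0.9, 20)
```

This reproduces the qualitative shape of the figure: drop probability stays negligible at moderate load and climbs steeply as offered load approaches 1.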
Unlike voice buffer sizing, data buffer sizing is not a function of
element delay, but instead a function of the number of active flows, link
speed and loss rate. For data services, the buffer size would typically
be engineered to keep TCP flow loss at 1% or less (0.1% preferred) in
order to minimize TCP timeouts and retransmissions. There is a trade-off
between queuing delay and packet loss. A small queue size (small buffer) implies
low average queuing delay; however, a small queue size does not always
lead to faster TCP application response time. TCP timeouts caused by
insufficient network buffering can actually increase response time.
Therefore, the transport protocol characteristics and their impact on QoE
need to be understood before determining the queue size.
Scheduler share
The queuing delay introduced by a scheduler is greatly influenced by the
offered load and the output link capacity. Queuing theory shows that as the
scheduler load increases, the queuing probability increases, and above a
given threshold increases in an exponential fashion and becomes infinite as
it reaches 100% link occupancy (see Figure 20-21).
[Figure content: average source jitter (0.01 ms to 1000 ms, log scale)
versus voice link utilization (0% to 100%) for link speeds from ISDN,
128k, 256k, 512k, 1M and T1 up to 10M, DS3, 100M, OC-3 and OC-12; voice
plus data traffic in two statistically multiplexed queues with strict
priority, 20 ms G.711 voice packets (200 bytes), 1500-byte data packets.]
Figure 20-21: Jitter versus loading for various link sizes; jitter =
f(buffer size, link speed and loading)
Figure 20-21 shows that in a well-engineered network where load is
controlled within a given range (typically less than 90%), queuing delay
and jitter are bounded and stable. Where loading is not controlled and link
loading approaches 100%, queuing delay increases exponentially (in
practice, this is limited by the buffer size). Note also that lower speed links
have lower maximum practical operating points; that is, points above
which jitter becomes greatly inflated compared to its value in the unloaded
network.
As the link size diminishes, the maximum operating loading point
diminishes before saturation.
Routers should not allocate more than 95% of interface rate on high-speed
interface (10 Mbits/s and above) to prevent queue buildup and control
packet loss/jitter. For lower speed links, the maximum loading threshold
should be reduced as shown in Figure 20-21.
Bandwidth provisioning
The bandwidth provisioning is part of the resource allocation process by
which a certain amount of link resources will be allocated to traffic
demands to ensure acceptable QoE is delivered to the end-user;
essentially, it maps traffic flows onto network links. Network traffic is
For native IP networks:
3. If step 3 identifies links utilized more than 95%:
• Increase bandwidth of those links and finish engineering, or
• Add bandwidth where feasible and/or economical, and repeat step 2.
For MPLS networks:
3. If routes for some LSPs cannot be found due to constraints on router
resources:
– Split LSPs,
– Add more bandwidth to links where constraints would be violated and
finish engineering, or
– Add bandwidth where feasible and/or economical, and reroute enough
LSPs to make room.
Table 20-7: Bandwidth provisioning steps for native IP and MPLS networks
Bandwidth provisioning can be done in either static or dynamic
provisioning. Static provisioning is achieved by allocating bandwidth for
the highest load over a time window and by implementing efficient QoS
mechanisms to ensure that sufficient bandwidth is available for the priority
traffic with predetermined constraints. The drawback of this approach is
that the capacity may be highly underutilized when the load is significantly
below the peak load within the time window. The other approach, dynamic
bandwidth provisioning, would solve this problem by using network
resources more efficiently, but it is far more complex and requires some
centralized coordination intelligence to maintain knowledge of
link/capacity availability. Dynamic bandwidth provisioning is currently
under development and is expected to replace traditional static provisioning
as a long-term solution.
Bandwidth provisioning should also include some extra bandwidth to cover
redundancy path restoration, call control, and future traffic growth.
Jitter buffer
The main purpose of a jitter buffer is to compensate for packet delay
variation, which would affect playout of deterministic packet sequence
such as real-time voice or video. The jitter buffer must be designed for an
expected traffic profile. That is, for dimensioning jitter buffer, the packet
interarrival delay must be known or be within predetermined bounds for
the jitter buffer depth to be adjusted within an expected range. The jitter
buffer wait time can be statically provisioned or adjusted dynamically as a
function of varying network operating conditions. To prevent packet loss
from jitter buffer overflow, the persistence of instantaneous arrival and the
average arrival rate must not exceed the available jitter buffer storage
space. Obviously, jitter buffer usage is a function of the stability of
the underlying network the traffic is traversing. To minimize jitter
buffer wait time, the network jitter should be minimized and/or eliminated
by ensuring that the average arrival rate does not exceed a certain
percentage (utilization/loading) of the outgoing link speed of each
multiplexing stage in the connection. For voice traffic with an assumed
uniform periodic profile at point of origin and a constant bandwidth
requirement, the percentage can be as high as 90%-95%, provided the
traffic is all voice. If shared with data, voice has absolute priority over data.
Under those circumstances, where voice packets originate, traverse, and
terminate over links in excess of 10 Mb/s, one can expect induced jitter
(resultant from the convolution of the behavior of the concatenated
multiplexing stages) to be a few milliseconds at most.
Consequently, a 10 ms jitter buffer should be sufficient.
Jitter buffers can and should be sized independently of packet size.
Where those circumstances do not prevail (that is, network loading is not
controlled or bounded), there is little that can be said about determining the
correct jitter buffer settings. Without any firm expectation of the
instantaneous and average traffic profile (that is, knowledge of the total
traffic admitted to the network and its load balance across the network),
then the probability of unbounded persistence and uncontrolled average
loading is increased, but also unbounded. No size of jitter buffer in any
router or receiving media gateway can be considered big enough. One
should always engineer a managed network for well-behaved normal
operation, with a sufficiency of controls and monitors, and a capacity
suited to demand. Jitter buffer wait time of a few milliseconds will have no
significant impact on voice quality, and fully absorbs the packet delay
variation of packet switching.
Measurements of operational IP networks reveal transient delay spikes
that exceed real-time voice delay targets (150-250 ms), as well as the
highly bursty nature of the packet loss distribution.
If an adaptive jitter buffer wait time is deployed, it must be able to adapt to
the wide range of jitter distributions that are typical in today’s IP networks.
Adaptation schemes may not perform well with all distributions of jitter.
Tuning of the adaptation algorithm may be necessary to match the delay
variation characteristic of a network. Tuning can be done by adjusting the
weighting used for the calculation of the moving averages and the
thresholds (sensitivity) to the occurrence of spikes in the delay variation. A
single setting for these parameters that works for all traces may not be
feasible. Note that the long-term average packet loss rate and jitter are in
many cases misleading, as they hide transient events that are only visible
on short time scales. Packet loss periods and delays span several orders of
magnitude—distributions of loss and delay bursts have heavy tails. Where
more than 40–60 ms of speech are lost, there is no longer sufficient
information to reconstruct the speech. This places a hard limitation on the
effectiveness of packet loss concealment techniques.
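The moving-average adaptation described above can be sketched as follows. This is a minimal illustration, not any particular product's algorithm: the weights (1/16) and the headroom multiplier (4) are common textbook choices for this style of estimator, and would be among the parameters tuned to a network's delay variation characteristic.

```python
class AdaptivePlayout:
    """Estimate a playout delay from observed one-way packet delays.

    Keeps exponentially weighted moving averages of the mean delay and of
    the delay variation, then sets the playout delay several deviations
    above the mean so that most packets arrive before their playout time.
    """

    def __init__(self, alpha=1 / 16, beta=1 / 16, k=4.0):
        self.alpha = alpha      # weight for the mean-delay average
        self.beta = beta        # weight for the delay-variation average
        self.k = k              # deviations of headroom above the mean
        self.mean = None
        self.var = 0.0

    def update(self, delay_ms):
        """Fold in one observed delay; return the new playout delay (ms)."""
        if self.mean is None:
            self.mean = delay_ms
        else:
            # Update the variation estimate against the old mean first.
            self.var = (1 - self.beta) * self.var + \
                       self.beta * abs(delay_ms - self.mean)
            self.mean = (1 - self.alpha) * self.mean + self.alpha * delay_ms
        return self.mean + self.k * self.var
```

With a steady stream of delays near 40 ms, the estimate hovers just above 40 ms; a single 120 ms spike inflates the variation term and pushes the playout delay up, which is exactly the sensitivity-to-spikes behavior the tuning discussion refers to.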
Section VI:
Examples
The material presented in previous chapters has been mostly theoretical or
hypothetical. In this section, we offer some concrete examples of network
architectures and configurations. Chapter 21 provides a look at a specific
large Enterprise network; namely, Nortel’s own corporate network, which
we use as a proving ground for our equipment. Chapters 22 and 23 provide
a contrast between the perspectives of the Public Carrier and Private
Enterprise view of real-time networking. Chapter 24 describes the
implementation of Real-Time Control applications in the video space.
While Enterprise networks and Carrier networks generally rely on the same
technologies and QoS protocols, their business environments are
completely different. A Carrier's network and the features it supports
constitute services to be sold. An Enterprise network is generally a
constrained resource and a business tool. These different perspectives
create different challenges and require different strategies to leverage and
implement network, communications, and application convergence.
When a Carrier implements a VoIP, multimedia, and/or converged network,
it works through a process of determining what level of service to provide,
the size of the target market, and which of the available technologies to
implement. Sophisticated simulation tools are used to determine network
requirements and to predict performance. The Carrier then implements a
carefully defined solution within a well-understood usage environment and
operating conditions, and monitors the loads the network carries to ensure
that they do not exceed what the network was designed for.
Typically, the Enterprise situation is the complete inverse. An Enterprise
usually starts with a network that was designed for data. The network is not
well-documented or well characterized. Enterprises consider voice and
real-time multimedia as simply additional data applications. And while
they assume that bandwidth solves all problems, they are also continually
looking for ways to reduce bandwidth and constrain their network growth.
These examples provide contrasting perspectives on how the principles set
out in the previous sections can be used to achieve various network
performance and user quality targets.
Disclosure Statement
The business case scenarios and examples used in the following chapters
are intended for illustrative purposes only. These case examples represent
potential results based on certain assumptions, which may not take into
account all factors potentially affecting results. If actual operating factors
differ from assumptions made, actual results may vary. Specific customer
operating factors such as deployment scenarios, actual growth rates, and
competition could cause actual results to vary compared to other
customers.
Chapter 21
VoIP and QoS in a Global Enterprise
Sandra Brown
Rob Miller
Shane Fernandes
Gwyneth Edwards
The following example reflects facts and figures obtained during a case
study performed by Nortel in the Spring of 2004. All information —
financial, people, technical—corresponds to Nortel’s environment at the
time of the case study. As applicable, facts will be denoted by “Case Study
Figures”.
This chapter describes a real-life implementation of Quality of Service
(QoS), in two parts:
“Voice over IP: Raising the need for Quality of Service”
“The Quality of Service (QoS) design”
The applications
Nortel runs one of the largest real-time Enterprise networks in the world,
equivalent in breadth and scope to a Tier 2 Service Provider. In a typical
month, more than 1,500 terabytes of routed data traffic runs across the
network, headed for one of the 2,700 computer servers. By comparison, the
books in the U.S. Library of Congress, the world’s largest library, contain
about 20 terabytes of text.
The IP network carries data from a variety of sources, grouped in the table
below by traffic category:
The network
The following network description is as described in the Nortel 2004 Case
Study. Nortel’s Enterprise network is based on a backbone architecture split
into four Border Gateway Protocol (BGP) regions—Europe, Americas,
Asia and India—each of which is assigned an Autonomous System (AS)
for Internet connectivity and transport. Routing is done hierarchically
through the core, distribution and access layers. Interregional routing is
based on OSPF routing principles and all regional traffic traverses the core.
Between major campuses, the Wide Area Network (WAN) runs over
Optical SONET technology, much of which uses Optical Ethernet. Some
small offices are also connected to the WAN through SONET although
many are connected through Asynchronous Transfer Mode (ATM), Frame
Relay (FR) and Virtual Private Network (VPN) links.
Over the past few years, Nortel collapsed hundreds of private virtual
lines, frame relay and ATM circuits, and public and private voice onto
the converged core, and moved much of the public voice onto the private
network. Upgrades to VoIP were done on the line side and are now
evolving to H.323 and SIP trunking.
Figure 21-1 illustrates an overview of the company’s real-time network
architecture.
Mobility at Nortel
Mobility in the work place has become a standard requirement; however,
Nortel has led the industry in the mobility of its employees. For more than
a decade, employees have been accessing the network remotely through
secure VPN solutions to perform their work. Therefore, leveraging current
investments and the installed base was a key driver in the deployment of
VoIP; the IS team wanted to enable the mobile worker to be more
productive and more connected than ever before.
1. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
Jitter in the IP layer can impair the voice channel. As noted in Chapter 3,
jitter is the variation in packet arrival times: moment-to-moment changes in
network traffic and loading affect the transit times of individual packets.
VoIP and other real-time applications cannot be queued without increasing
the end-to-end delay, which degrades the application performance.
Network engineering and network management need to keep jitter low to
maintain quality for delay-sensitive applications.
So jitter and latency values are very important, and in this example, they
were the key drivers for implementing QoS.
Note: Quality of Service (QoS), as defined in Chapter 2, refers to a
set of technologies—traffic management and QoS mechanisms—
that enable the network administrator to achieve the desired traffic
performance targets. We assume that, in this example, Quality of
Experience (QoE) for VoIP calls is equivalent to that of PSTN
service.
Traffic shaping
At the small office, two frame relay Data Link Connection Identifiers
(DLCI) are used to separate Voice over IP from all other data application
traffic across the Wide Area Network (WAN), with higher priority given to
the VoIP traffic. To direct the traffic to the appropriate DLCI, Forward Next
Hop (FNH) filters based on the DSCP/TOS bits within the IP header are
implemented on the Ethernet ports as the IP traffic ingresses the router
ports. BayRS Protocol Priority Queuing (PPQ) is implemented at the small
site’s router.
The VoIP DLCI is a “shaped” DLCI and the Data DLCI is “unshaped.” By
default, shaped DLCI traffic is prioritized over unshaped DLCI traffic. All
VoIP signaling and media path messages are assigned an Expedited
Forwarding (EF) DiffServ Code Point (DSCP) of 46.
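As a concrete illustration of the EF marking above: DSCP 46 occupies the upper six bits of the IP TOS/DS byte, so the byte value an application actually sets is 46 shifted left two bits, or 0xB8. The sketch below uses the standard POSIX `IP_TOS` socket option from Python; it is illustrative host-side marking only (the FNH filters described above are configured on the router, not on the host).

```python
import socket

EF_DSCP = 46              # Expedited Forwarding code point
TOS_BYTE = EF_DSCP << 2   # DSCP occupies bits 7..2 of the DS byte -> 0xB8

def make_voice_socket():
    """Create a UDP socket whose outgoing packets carry the EF marking."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_BYTE)
    return s
```

Any router filter keyed on the DSCP/TOS bits, such as the FNH filters described above, can then steer these packets to the voice DLCI.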
To prevent traffic congestion on one DLCI from causing packet drops on an
uncongested DLCI, clipping is enabled on the ATM interface card that sits
on the BN router. “Clipping enabled” implies that the BN ATM card will
drop all packets in excess of the frame relay Sustainable Cell Rates (SCR)
and Peak Cell Rate (PCR) values. The data circuit will typically drop
packets first because data is bursty in nature, thus ensuring that the VoIP
traffic is not dropped by the BN router.
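The "clipping" behavior described above is, in effect, rate policing: traffic in excess of the contracted rate is dropped rather than buffered. A simplified single-rate token-bucket policer conveys the idea (illustrative only; the BN ATM card enforces the frame relay SCR/PCR values in hardware, and real policers distinguish sustainable from peak rates):

```python
class TokenBucketPolicer:
    """Drop packets exceeding a committed rate (bytes/s) and burst size."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps        # token refill rate, bytes per second
        self.burst = burst_bytes    # bucket depth, bytes
        self.tokens = burst_bytes   # start with a full bucket
        self.last = 0.0

    def admit(self, size, now):
        """Return True if a packet of `size` bytes conforms at time `now`."""
        # Refill tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size     # conforming packet: spend tokens
            return True
        return False                # excess packet: "clipped" (dropped)
```

Because bursty data traffic exhausts its tokens first, the data circuit takes the drops, which is why the voice DLCI is protected in the arrangement described above.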
The existing Nortel core router architecture supports an eight queue QOS
design, three of which are currently used, as follows:
Voice is prioritized into the highest queue
Video and multicast into the next queue
All other traffic into a lower queue.
The optical network is fronted by a Passport 7480, which is used to convert
the traffic into ATM cells. Please see Figure 21-4 for an illustration of
the core QoS strategy.
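The three-queue mapping above can be sketched as a simple DSCP classifier. The queue numbers and the use of AF41 for video here are illustrative assumptions; the actual routers map Nortel Networks Service Classes to hardware queues through their own configuration.

```python
EF = 46      # Expedited Forwarding: voice
AF41 = 34    # an assured-forwarding class commonly used for video

def select_queue(dscp):
    """Map a packet's DSCP to one of three queues (0 = highest priority)."""
    if dscp == EF:
        return 0     # voice: highest-priority queue
    if dscp == AF41:
        return 1     # video and multicast: next queue
    return 2         # all other traffic: lower queue
```

The remaining five of the eight hardware queues stay unused, leaving headroom for future traffic categories.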
Lessons learned
For companies wanting to move to a real-time network, the following
lessons learned can assist in the implementation of Quality of Service
(QoS):
The QoS strategy should be driven by the needs of the applications
that run over the corporate network.
The applications must be categorized by traffic category to
determine the QoS requirements and priorities (let Quality of
Experience drive the categorization).
Consider not only current interactive applications but also future
applications that will run across the real-time network.
Even if QoS is implemented on a site-by-site basis, a complete QoS
strategy should be defined prior to deployment to ensure that the
network becomes real-time end-to-end.
To minimize costs, leverage current infrastructure and managed
services, especially in the small sites.
To simplify QoS implementation, take advantage of the DiffServ
Code Points (DSCP) mappings to Nortel Networks Service Classes
(NNSC) and the service categories of the various transport
technologies such as IP, ATM and frame relay.
Chapter 22
Real-Time Carrier Examples
Edited by Kathy Joyner
Centrex IP
An April 2003 InfoTech* report entitled “Enterprise Convergence: The
Race for IP Telephony Supremacy” shows that by the end of 2004, over
seventy percent of U.S. Enterprises will have implemented IP Telephony in
at least one site. This escalation in demand for IP Telephony is driven by
the desire to reduce costs while enhancing employee mobility and
increasing worker productivity.
It is not surprising then that Anycarrier.com has experienced a five percent
decline in its Centrex customer base over the last eighteen months. In fact,
research by IDC (2003) shows that service providers as a whole are seeing
their Centrex installed base eroding by three to twelve percent per year
(depending on segment and service provider).
Understanding that the erosion of its Centrex customer base is only going
to accelerate, Anycarrier.com initiated a study to find a solution that would
allow it to retain its current Centrex customers and also add high-demand
VoIP services as part of a comprehensive product offering. Based on the
results of that study, Anycarrier.com determined that Centrex IP is the only
solution that provides it with an evolutionary approach to VoIP, allowing it
to retain its existing Centrex revenues while providing a platform for new,
IP-based business services.
Technical challenge
With Centrex IP, Anycarrier.com can offer the reliability and rich feature
set of hosted Centrex in conjunction with the next-generation services of
VoIP, allowing it to retain and grow its Centrex base. Because Centrex IP
builds on the industry-leading business voice benefits of Centrex,
businesses can take advantage of the benefits of IP Telephony in a flexible,
cost-effective and low risk way. Key market segments are as follows:
Existing Centrex Base. Companies who are already seeing the
benefits of the full feature set and reliability of Anycarrier.com’s
Centrex service are the prime target for the move to Centrex IP.
Medium/Large Enterprises. Medium to large companies across
many industries also provide a great opportunity to introduce
Centrex IP. In many cases, these companies have already
considered implementing VoIP in some fashion and may have a
budget set aside for that step. In addition, these companies are also
under competitive pressure to increase employee productivity by
providing better communications, while simultaneously lowering
operating and IT costs.
Within the Medium/Large Enterprise segment, those companies with the
following characteristics are the primary targets for Centrex IP services:
New Branches. Companies that are opening a new branch or site
for their company. The branch will need to be set up with a cost-
effective extension of services to connect and communicate with
the main corporate site.
Major Renovations. Companies that are overhauling/rebuilding
their office space or telecommunications systems. This renovation
may provide the opportunity to upgrade their telephone and LAN
infrastructure.
Small Businesses. Small businesses across many different
industries are another potential opportunity for Centrex IP services.
These companies typically make changes more quickly and easily.
Solution
Network diagram
From a high level perspective, Centrex services can continue to be offered
from either existing switch platforms or from newer call server platforms.
The diagrams below illustrate how the migration to an IP-based Centrex
service can be accommodated.
Architecture overview
With Nortel’s Centrex IP solution, Anycarrier.com can offer full-featured
Centrex services over an IP infrastructure using two primary components:
Centrex IP Client Manager with a DMS*-100/5000 or a
Communication Server 2000
IP phones
Key elements
Centrex IP Client Manager. The Centrex IP Client Manager
(CICM) is a high-availability, NEBS-compliant platform that is
hosted from a DMS-100/500. The CICM is responsible for hosting
Local
As wireline revenues continue to decrease, local service providers are
looking for ways to reduce costs and converge networks so that voice, data,
and wireless can leverage the same network. They also want to lay the
foundation for new services that provide additional revenue opportunities.
For these reasons, major local exchange carriers are taking on the challenge
of converting their Class 5 circuit switches to packet switches.
Drivers for considering migrating to a packet network vary by service
provider. However, common requirements are as follows:
Meeting market demands for data services
Delivering solutions cost-effectively
Providing single, integrated, carrier-grade packet network for voice,
high-speed data, and special services with efficient network
management capabilities
Reducing capital and operating costs
Finding sources of new revenue
In the end, many service providers determine that it is more cost-effective
to migrate to new packet technology than to grow and maintain their
existing circuit switches.
Technical challenge
Service providers are closely reviewing their technology choices in order to
decide whether to continue to grow and maintain the circuit-switched
network or migrate to a packet network.
Most service providers have a wide range of circuit switching and back-
office equipment, requiring experienced craft personnel to maintain and
manage circuit switches from different vendors. From a network support
perspective, there are many different products to understand and manage
on a daily basis. In addition to the various and dated circuit switching
equipment in networks today, some switches do not support Local Number
Portability (LNP), a regulatory requirement that must be provided to all
subscribers. In addition, capital expenditure decisions are looming for
many service providers.
From a business perspective, packet networks can deliver operational cost
savings. For example, migration can reduce the number and different types
of back-office systems, simplifying network management. In addition, the
number of nodes in the network is decreased; in one real-world case, by
almost 75 percent. Several Class 5 switches can be collapsed into one
centrally located communication server that serves a much wider
geographic area. Craft personnel no longer have to know and manage
multiple types of back-office systems. In addition, they no longer have to
manage as many elements because separate layers for Tandems and
Remotes, as well as multiple networks, are eliminated.
Solution
Network diagram
From a high level perspective, convergence offers an opportunity to reduce
complexity and to reduce costs over traditional TDM Class 4 and Class 5
networks. As the diagram below illustrates, the transition from a TDM-
based network to a packet-based (IP or ATM) call server network
significantly reduces the number of trunks that have to be maintained, as
well as reducing the overall load on the network.
[Figure: TDM network, with End Offices (EO) and Tandems homing to multiple Interexchange Carriers (IXC), collapsing onto a single packet network.]
Solution Details
Nortel provides a Carrier Voice over IP Local Solution. Whether a service
provider is interested in ATM or IP, Nortel can provide the solution. With
decades of Class 5 experience, Nortel is uniquely qualified to provide
service providers a packet solution that can evolve their circuit networks.
This solution delivers full feature transparency, providing a complete set of
Class 4 and Class 5 capabilities with over 3,000 features in every software load.
The major components of the Nortel VoIP Local Solution include:
Communication Server 2000 superclass softswitches providing
comprehensive services, carrier-grade attributes, and regulatory
features
Media Gateway 9000, a line gateway supporting both broad- and
narrowband services
Media Gateway 4000, a trunking gateway used in ATM networks
(North America only)
Packet Voice Gateway 7000 or 15000, a trunking gateway used in
ATM or IP networks
Service providers may also use Nortel Multiservice Switches to provide
high-capacity, carrier-grade switching. While this is not a requirement
Long distance
Long distance providers and new carriers alike are faced with a paradox:
the total minutes of use for long distance services is expanding, but the
revenues per minute are decreasing. However, there is still tremendous
potential in this market, creating opportunities for some and challenges for
others. Ascendant carriers are staying ahead of the competition by reducing
operating costs, expanding capacity, and delivering reliable services.
Nortel Carrier VoIP Long Distance Solution is an ideal step for long
distance service providers. This low-risk solution helps lower transport/
transit and capital costs with the efficiencies of multivendor packet
telephony. Packet trunking is the economical engine that can help pay for
network transformation today as the service provider explores new revenue
opportunities enabled by new voice and multimedia services and easier
access into other markets.
Technical challenge
This solution delivers full-featured, carrier-grade telephony, data, and
multimedia services over multiservice packet networks. It uses open
standards packet technology for the packet backbone. Carriers can choose
either AAL2 protocol or IP transport to provide a full-featured packet
transit application.
Packet networks offer cost efficiency, open standards, and fast time-to-
market for new packet services, without compromising the values of
traditional telephony, including service richness, voice quality, reliability,
scalability, and manageability.
This solution is based on Nortel’s Packet Trunking application and allows
service providers to deploy their own differentiating telephony, data, and
multimedia services.
This solution also lays the foundation for delivery of local and transit
services for business and residential customers with the future addition of
line-side multiservice gateways. The service provider can also add cable or
wireless gateways to explore other market opportunities, or take advantage
of Enterprise network connectivity and SIP capabilities to deliver new
services.
Solution
Network diagram
From a high level perspective, the transition to an IP or ATM-based packet
network can result in large savings in long distance costs. The diagram
below shows a comparison of long distance trunking requirements for a
TDM-based trunk network and a packet-based network. The larger the
number of flows over a link or Trunk Group, the less difference between
statistical worst case and average. Because, in the IP case, the link is sized
for all of the traffic it carries, rather than being segmented into smaller
point-to-point flows as in the TDM case (that is, individual Trunk Groups,
each sized according to its own statistical worst case), the packet network
can be more efficient.
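The pooling gain described above can be quantified with the Erlang-B formula. The sketch below is standard teletraffic arithmetic, not a Nortel tool: it sizes one aggregated packet link against ten separately engineered TDM trunk groups carrying the same total load, both at a 1% blocking target.

```python
def erlang_b(traffic_erlangs, trunks):
    """Blocking probability for offered traffic on a group of trunks.

    Uses the numerically stable recursion
    B(A, m) = A*B(A, m-1) / (m + A*B(A, m-1)), with B(A, 0) = 1.
    """
    b = 1.0
    for m in range(1, trunks + 1):
        b = traffic_erlangs * b / (m + traffic_erlangs * b)
    return b

def trunks_needed(traffic_erlangs, target_blocking=0.01):
    """Smallest trunk count meeting the blocking target."""
    n = 1
    while erlang_b(traffic_erlangs, n) > target_blocking:
        n += 1
    return n

# 100 erlangs pooled on one link vs. ten 10-erlang trunk groups:
pooled = trunks_needed(100.0)          # one statistically shared link
segmented = 10 * trunks_needed(10.0)   # ten individually engineered groups
```

The pooled link needs substantially fewer circuits than the ten separate groups combined, which is precisely the efficiency the text attributes to sizing the whole link at once rather than each point-to-point flow for its own worst case.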
Solution Details
Instead of managing multiple overlay networks, the service provider can
deliver all types of services over a single infrastructure. This design allows
more choices in service deployment and vendor selection to help decrease
long-term capital costs. H.248-compliant multiservice gateways connect
existing trunks to the backbone, with no need to modify existing facilities
or their originating multivendor offices. The efficiencies of a packetized
backbone can reduce ongoing operating costs by twenty to forty percent as
proven by a Nortel business case¹. In contrast to today's individually
engineered fixed-bandwidth trunks, a packet network efficiently routes all
types of traffic by allocating and sharing network resources on demand.
The packet network also helps reduce cross-connects, multiplexers, IMT
facilities, and associated peripherals, reducing capital expenses by twenty
to thirty percent, again as shown in a Nortel business case¹.
1. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
This solution also offers reliability, security, and quality of service. The
service provider can transition a node-centric, hierarchical topology to a
simplified architecture where a converged network performs like a single,
unified switch. The streamlined network design offers greater service
capacity, variety, and speed-to-market, all with fewer nodes. And because
fault-tolerant elements are distributed across the network, single points of
failure are removed and superior survivability is realized, with no sacrifice
in voice quality, latency, or capacity.
The Carrier VoIP Long Distance solution offers a standards-based
switching and routing infrastructure that transports today's revenue
generating services while supporting competitive, next generation services
– all over a high-capacity ATM or IP backbone. The service provider can
deliver leading-edge long distance applications while reducing transport
costs and deferring or eliminating future capital expenses.
The following listing summarizes key network elements in this solution.
All are built to meet or exceed carrier-grade standards and protocols set by
ITU, Telcordia, ANSI, ETSI, IETF, ATMF, and other standards bodies.
Gateways. With an ATM AAL2 backbone or IP—the robust Packet
Voice Gateway 15000 connects standard TDM trunks to the service
provider's packet backbone. This trunk gateway appears as a
tandem/transit office termination to any vendor's circuit-switching
office.
Superclass Softswitch. The first vendor to deliver a superclass
softswitch, Nortel offers a choice of two platforms: the
Communication Server 2000 and Communication Server 2200
(previously the Communication Server 2000 Compact). A
superclass softswitch offers the critical attributes associated with
successful softswitch deployment, such as consolidated local, long
distance, wireless, and cable applications; comprehensive service
(3000+ features); regulatory capabilities; and carrier-grade
attributes. Both of these platforms are designed to control
multivendor gateways and provide the call control and other
network intelligence required to deliver revenue-enhancing
services.
Multiservice Switches (MSS). As a high-capacity ATM switch, the
Nortel Multiservice Switch 15000 supports ATM, IP, frame relay,
circuit emulation, and voice services. This system scales from 40
Gbps of redundant bearer capacity to terabits. High-capacity, fault-
tolerant Layer 2 switching/routing is provided by the Ethernet
Switch 8600, which aggregates local IP traffic providing an
interface to a high-speed optical backbone for IP voice and
signaling traffic.
Multimedia
In the past, communications were based on a single medium: voice. In the
21st century, communications require the integration of multiple media. To
ensure effective communication for both Enterprises and end users, next-
generation SIP services allow consumers to integrate voice, video, and data
into a conversation as simply and easily as they pick up the telephone and
make a simple voice call today.
Technical challenge
For the purpose of this example, we have chosen to focus on a service
provider who is solely deploying multimedia services for the residential
and small office/home office markets. Similar multimedia service offerings
are available for medium to large Enterprises through a carrier-hosted
model.
To take advantage of this market opportunity, 123com has initiated a
program to design and deploy unique consumer multimedia service
offerings that create opportunities for new revenue streams, increase
Solution
Network diagram
The network diagram below illustrates how five “Go to Market” services
can be offered. These services are as follows:
Voice and multimedia over a broadband connection to the Home
Office
Voice and multimedia from a soft client over a broadband
connection to the residence
Voice and bundled long distance over a broadband connected
Integrated Access Device (IAD) in the residence
Personal Agent web portal services to enhance existing 123com
residential voice subscribers
Voice and multimedia Remote Access over a broadband connection
to the Internet
Each of the service offers makes use of 123com’s or other carrier’s high-
speed data services, the public Internet, 123com’s IP Backbone, and
123com’s PSTN network.
Solution Details
For the residential consumer market, 123com considered four service
offerings, described below.
123com Multimedia Communications Center. Intended for the
installed base of broadband customers and telephone customers,
this service offers advanced multimedia features such as video
calling, picture ID, file transfers, and Web pushes using the
customer's personal computer. Unlimited on-net calls are provided
as part of the service, along with an optional outbound long
distance calling plan. Inbound calls from the traditional telephone
network will be allowed in the markets where 123com has primary
line service or selected states where this type of service is allowed
from a regulatory perspective.
123com Broadband Telephone Service. Intended for the installed
base of broadband customers who don’t use or have a personal
Cable
While data and video on the Internet aren't new, having voice included in
the mix is. Using Internet Protocol (IP) telephony or Voice over IP (VoIP)
technologies, phone conversations are converted into packages of data to be
sent over the Internet in ways similar to e-mail and web sites. Phone calls
anywhere in the world can be significantly less expensive with IP
telephony. And because voice, data and video are all using one network,
packaged as similar data packets, services can be bundled together from
one service provider with the flexibility for the user to receive the
information on any communications device, regardless of location—office,
home, or on the road.
According to market research on communications preferences completed
by Pollara Inc. for Nortel, consumers are growing impatient with various
communications devices that don't work together. The research found that
today's consumers expect instant communications but instead are plagued
by having to navigate cumbersome menus on each device they own when
they want to reach someone.
Traditional cable providers are aggressively moving into service areas
formerly dominated by traditional voice providers. The incumbent service
provider is faced with a two-fold challenge: find a way to generate new
revenues, and curb subscriber flight to satellite service.
The current opportunity for cable service providers implementing VoIP
technologies is to simplify communications while, at the same time,
creating a user friendly communications environment that seamlessly
adapts to the lifestyle and needs of each individual. The technology that is
being used with VoIP to give businesses and consumers a wider range of
services and the ability to fine tune the management of their
communications is called Session Initiation Protocol (SIP).
Technical challenge
When making the decision to enter the VoIP business, the service provider
should be aware that VoIP creates more than one business opportunity. A
good part of the reason lies in the underlying technology. The concept is to
convert an analog voice signal into a series of ones and zeros that can be
reconstructed into the original analog format without perceptible loss of
quality. Once any information is converted into this digital form, all the
services developed for data switching, routing, and storage become
available as tools to tailor voice product for the service provider's market.
One of these tools is IP, which is the underlying technology used to move
information on most data networks, including the public Internet.
VoIP opportunities for cable include primary line telephone service, long
distance, SIP-based broadband voice, and business services. This section
will concentrate on primary line service.
Primary line service is a one-for-one replacement of the incumbent
telephone company's service, because it is carrier-grade telephony. This
means it must be highly reliable, scalable, feature-rich, maintainable
without service outage, and include the ability to track and measure key
performance metrics. Most importantly, carrier-grade means a quality of
service (QoS) for end-to-end transmission that keeps voice quality within
the levels expected by consumers for commercial telephone service.
Solution
Network diagram
From a high level perspective, the same packet network can be used to
deliver multiple media to wireline customers including voice, video and
data. The diagram below illustrates the connection of a call server to a
cable-based customer for providing voice, data, and video services.
[Figure: Cable VoIP access — an Embedded MTA in the home connects over the HFC plant to a CMTS at the headend; an IP router carries traffic into the packet network, a Media Gateway links to the PSTN, and a Call Management Server controls the calls.]
Solution Details
There are several options for using VoIP as the underlying technology for
primary line technology. The diagram at the end of this section shows the
architecture detailed by the CableLabs PacketCable™ specifications for
end-to-end VoIP, which is being deployed today. In this scenario, a standard
subscriber telephone connects through existing phone wiring to a new
Embedded Multimedia Terminal Adapter (E-MTA) that may be located on
the side of a home or within the home, depending on the packaging. The E-
MTA does the analog-to-digital conversion and packetizing functions, and
its embedded cable modem communicates with a Cable Modem
Termination System (CMTS) at the headend.
The network side of the CMTS typically includes the ability to route
signaling and voice packets. Signaling packets are exchanged with
softswitches in the service provider's network to set up and supervise the
call. Voice packets are routed to the called party through a packet network
to a remote CMTS or the PSTN via a media gateway. In addition to
handling call setup and supervision, the Call Management Server (CMS),
which at Nortel we consider a communication server, is the source of
revenue-generating subscriber features.
Key takeaways
Nortel can help the service provider make the transition. Nortel has a rich
history in telephony and a proven record in VoIP for cable, as well as an
installed base of more than 140 DMS switches carrying 4.9 million cable
telephony lines globally.
On the VoIP front, Nortel has been active in PacketCable™
interoperability work at CableLabs. Its softswitch is PacketCable™
qualified, and it has had a visiting engineer on-site at CableLabs for years.
Several cable operators have deployed Nortel VoIP solutions, giving it real-
world experience.
In addition, Nortel recognizes that VoIP networks come in all sizes and
should support all service types, and offers two versions of its carrier-class
Communication Server (CS). The CS 2200 (formerly known as the
Communication Server 2000 Compact) softswitch occupies a smaller
footprint, consumes less power, and is built on a commercially available
Compact PCI platform with open base software architecture. The CS 2000
is built on the DMS XA-Core multiprocessor platform. Both of these
platforms provide cable operators with the same powerful ability to deliver
local, long distance, and tandem VoIP services on a single platform. Both
platforms deliver the same applications, protocols, and functionality by
fulfilling the softswitch promise: software functionality independent of
hardware platform.
Broadband
The broadband market in North America continues to be a dynamic sector
as the competitive landscape and consumer demand for new
communication services continue to evolve. Driven by the need to find new
sources of revenue, service providers are looking for ways to unleash the
potential of broadband networks.
Nortel understands that wireline service providers need to deliver value-
rich service bundles—services such as VoIP, Multimedia Communication
Services (integrated voice, video, and data), broadcast and IP-video
(television), and data services. Our next-generation broadband solutions
are ultra-broadband ready, meaning they have the high bandwidth and
Quality of Experience attributes needed to deliver the new “triple play”
service set (voice, data, and video).
Wireline carriers are losing customer ownership as their strategic position
slips. Cable competitors are targeting their customers with value-priced and
value-added alternatives to basic phone services. If the wireline carriers are
to survive and thrive, real service differentiation is needed.
Technical challenge
According to The Yankee Group*, significant capital spending will occur
in the broadband access market in the next four to five years—
approximately US$5 billion annually. This longer term spending trend is
being driven by a need for service providers to replace much of the existing
broadband equipment with a newer generation of infrastructure that is
capable of supporting a “triple play” business model.
Solution
Network diagram
The broadband market has a wide range of technologies, all of which can
leverage a packet-based network to provide voice or other services. The
following diagram illustrates some of these technologies: voice service
through a traditional copper loop from a central office, voice service
through a Digital Loop Carrier (DLC), voice and other services through
Digital Subscriber Loop (for example, ADSL), voice and other services
through Fiber to the Curb (FTTC), and voice and other services through
Fiber to the Home (FTTH).
Solution Details
Nortel has significantly expanded its Broadband Networks portfolio to
enable traditional wireline service providers to deliver a new set of value-
rich, revenue-generating services to consumers and small-to-medium
business customers over a high-bandwidth, ultra-broadband infrastructure.
Nortel Broadband Access Solutions couple best-in-class access products
from strategic alliances with a world-class portfolio of voice, data, and
transport products.
Strategic alliances provide a complete range of access products including
the following:
DSLAM, PON, and Mini-RAM products from ECI* Telecom
Multiservice Broadband Loop Carrier products for the North
American market from Calix*
Multiservice access products for the European or ETSI market from
KEYMILE*.
This powerful combination of new and existing products enables service
providers to deliver high-value, revenue generating services through a
reliable and scalable broadband infrastructure.
Nortel Broadband Fiber Solutions are based on leading Optical Access
technologies such as Fiber-to-the-Premise, Curb, or Business (FTTx)
utilizing PON technology. These powerful access products offer the
convergence of voice, video and data services over a single fiber
infrastructure, thereby delivering ubiquitous and seamless solutions and
eliminating the network bottleneck. Features include a future-proof, full
service set offering and reduced Capex for a full service set network.
Conclusion
Network convergence
The varied networks that exist today have evolved in parallel, and each
offers important attributes of its own. TDM networks are reliable, secure,
easy to use, and optimized for voice traffic. Data networks are efficient,
scalable, and optimized for packet traffic. Wireless networks are
ubiquitous, convenient, and optimized for mobility. The solution is to
transform these networks to maximize profit and market share without
sacrificing the valuable attributes of each. These dual goals can be achieved
by migration: transforming traditional networks into packet-based
networks that can offer all of the features and attributes of traditional
service in a simplified, cost-effective, service-enabling manner.
Nortel offers a broad and deep portfolio of services that fully leverage the
packet-based networks that service providers need to build. Service
categories are as follows:
Data networking services such as Virtual Private Networks
Mobility services such as voice over Wireless LAN
Integrated voice, video and data
Personalization of services to include content delivery and security
Next-generation residential and business services
Optical broadband services such as Storage Area Networks
(SANs), and optical Ethernet
Carrier-grade reliability
Nortel has a strong track record for service delivery that spans decades of
innovation while providing customer support with the implementation and
maintenance of these service-bearing networks. Carrier-grade service is
dependable, secure, and evolvable. We understand the importance of these
attributes, and support our customers in maintaining them.
Service intelligence
Nortel understands the capabilities and constraints in the transformed
network and how they need to be used, and can add new capabilities into
the network quickly, to give service providers the edge in offering new
services. This capability is called Service Intelligence, and it guarantees
network performance based on the end user's requests, network resources,
and service application needs. Service Intelligence allows IP networks to
move beyond “best effort” and dynamically adapt to customer
requirements.
Given convergence and higher performance levels, the next challenge to
value-rich service is resource allocation. The transformed network must
make intelligent use of resources through policy, authentication, billing and
QoS. When this is in place, the service provider can deliver with
confidence the services that users demand.
References
IDC, U.S. Hosted IP Voice: Market Analysis and Forecast 2002-2007.
Author: Thomas S. Valovic, IDC Study No. 28803, released January, 2003.
InfoTech, Enterprise Convergence: The Race for IP Telephony Supremacy,
report released April, 2003.
Chapter 23
Private Network Examples
Stéphane Duval
Tim Mendonca
Introduction
The purpose of this chapter is to demonstrate the ability to satisfy customer
requirements for real-time, converged networks with the technologies
discussed in this book using Nortel products as the example.
The focus is voice over a data infrastructure. It is not the intent of this
section to describe and explain routed and routing protocol standards.
Also, the example focuses on the headquarters, which incorporates all
aspects of deployment. By adjusting the scale of the deployment, the
solution can be adapted to organizations of all sizes.
This example is by no means the only approach that can achieve the needed
QoE results. Used as a model, it can be adapted to develop a custom
solution based on unique organizational needs. A large variety of
interchangeable products from Nortel create limitless deployment options
for converged infrastructures.
Starting with the definitions of the four types of convergence, Quality of
Experience (QoE), and Quality of Service (QoS), and with the
identification, definition, categorization, and characterization of a set of
Solution Design Attributes (SDA) addressing different aspects of a
solution's architecture, a common convergence vocabulary will be
established. The goal is to systematically gain a clear understanding of the
issues that arise in Enterprise data networks when deploying real-time
applications, and to learn analysis and design techniques that ensure a high
level of customer satisfaction and network performance.
Getting Started
Business success in moving to a converged network will rely on knowing
the underlying criteria affecting the current state of your infrastructure.
This knowledge assists in the development of processes to evolve your infrastructure.
Business continuity
During the transition to a converged network, you will need to consider
what measures you can take to ensure the continuity of external services for
customers and the availability of communication applications needed by
employees, suppliers, and partners. Plan disaster recovery and redundancy
for mission-critical operations that are essential for conducting business.
Organizational dynamics
Moving to a single network that carries voice, data, and video traffic will
necessitate a new IT paradigm for managing the unified infrastructure.
Consolidated management policies will be needed, as will a redefinition of
roles and responsibilities of network management personnel previously
aligned with either the voice or data side of the Enterprise. During the
transition period, you need to consider that there will be real costs
associated with personnel realignment and retraining activities. Assess the
impact that moving to converged applications will have on how employees
carry out their day-to-day tasks.
Figure 23-1: Example network — a corporate office, regional offices, and district offices connected by virtual and physical circuits through a service provider, with dual Internet connections (ISP1 and ISP2). The solution design attributes called out are scalability and efficiency, proven reliability, Internet performance and management, and solution redundancy and resiliency.
For the purpose of this design example1, we chose ATM and frame relay
circuits supplied by a local service provider. ATM is chosen for the core
network because of its proven capability to deliver QoS in a packet
environment. Frame relay has been chosen for the branch/district offices
based on its ability to deliver high bandwidth at low cost. While frame
relay can prove to be challenging when it comes to real-time, converged
networks, we have kept it in this example due to its widespread adoption
and low cost.
1. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
[Figure: headquarters WAN/Internet edge with redundant Contivity, Alteon Switched Firewall, and Passport devices connecting to the WAN service provider, the Internet, and a DMZ; annotated with the solution design attributes scalability and efficiency, proven reliability, effectiveness, ability and security, performance and management, and solution resiliency and redundancy]
2. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
[Figure: headquarters campus with redundant Passport 8600 switches, Alteon Switched Firewalls, Contivity devices, and Application Switch WSMs serving dual-homed servers (FTP and web servers, LDAP servers, Symposium Call Center, OTM, CallPilot, and MCS 5100), IP telephones, software telephones, and desktop PCs, managed by Optivity NMS, with connections to the WAN and Internet service providers, a DMZ, and the PSTN]
session and voice calls will not be terminated and user performance impact
will be kept to a minimum.
appropriate access control rules, and help identify and discover violations.
Risk and vulnerability assessment must be performed at all levels of the
network.
Without appropriate security features, VoIP networks are much more
vulnerable to eavesdropping, theft and denial of service than traditional
telephony networks. Logical security, where the system is contained to an
isolated intranet protected by firewalls, is not sufficient in today’s
environment. A system can be subjected to internal attacks from a
malicious user or from a pervasive worm that is transferred to a hard drive.
IP telephony systems are now connected to the corporate intranet or the
Internet. As such, security needs to be enhanced to address the new world
threats.
Security in an IP environment is based on the following components:
Physical and logical security of the infrastructure (such as end
points, switches, and routers)
Network Element (NE) security of all the system components
Equipment security of the servers and other hardware
Software security of the applications
Client security regarding access control and privileges to the
systems
Security of soft clients on multiuse personal computers
Demilitarized Zone (DMZ). The term DMZ was first used in complex
multiple-machine firewall setups, where a computer is placed outside the
firewall but is still available for use by the internal (protected) network.
The advantage of a DMZ computer is that it can send to and receive from
the entire Internet. The disadvantage is that it may be vulnerable to attack
from unknown parties.
Secure Voice Zones (SVZ). Securing telephony is an important step in a
comprehensive security strategy. A secure telephony solution framework,
as part of a unified security architecture, leverages both the resilience of
traditional switched telephony networks and a sustainable migration path
to a converged IP network. All levels of security call for a secure voice or
IP telephony zone, since all IP telephony servers are vulnerable to attack,
malicious or otherwise, from within the Enterprise as well as from outside.
A stateful firewall with SIP and H.323 protocol support is needed to
provide an SVZ with four levels of security (minimum, basic, enhanced,
and advanced) based on a unified security architecture, ensuring that these
critical call servers and application servers are highly available.
Security. SVZs for IP telephony devices must ensure accessibility without
compromising the confidentiality and integrity of other Enterprise network
resources.
[Figure: the headquarters design extended with SSL acceleration, Alteon Content Director, and Content Cache alongside the redundant Passport 8600, Alteon Switched Firewall, Contivity, and Application Switch WSM infrastructure connecting dual-homed servers to the WAN and Internet service providers and a DMZ]
Solutions Management
Several management services are added in order to control the HDC: a
Security Manager to manage the firewalls and Contivity devices, and
Optivity* NMS, OSM, and QoS Policy Manager to monitor network
devices and establish QoS policies and packet prioritization.
[Figure: the completed headquarters design with management additions — Wireless LAN Security Manager, WLAN 2250 security management, Optivity NMS, Network Manager, Security Manager, and Content Manager — overlaid on the redundant Passport 8600, Alteon Switched Firewall, Contivity, SSL, Alteon Content Director, and Content Cache infrastructure]
Wireless LAN access can also be added with a WLAN 2220 wireless
access point, with security managed by the WLAN Security Switch (WSS)
2250. An adaptive Wireless LAN solution is also available, with Access
Ports (WLAN 2XXX) and the WSS 2270 providing security and end-user
roaming capabilities.
Several underlying protocols and services are used to maximize the
manageability and performance of this solution. DHCP is used to provide
an IP address, network mask, and default gateway to each device.
Users are used to having phone service maintained during power outages
for emergency calls, including 911. This requires the consideration of
Power over Ethernet (POE) (802.3af), which is a new standard to provide
power to hard VoIP clients. There are a number of issues that have to be
addressed. Before picking a strategy, certain business and regulatory
aspects need to be considered including: 911 services, redundancy, heat
dissipation and power requirements.
Before POE, most VoIP phones got power from a power brick that was
plugged into a standard power outlet. In the case of a power failure, the
phone would be inoperable. If POE is implemented, it is important to make
sure that the power source for POE will continue to operate in the case of a
power outage.
When sizing POE requirements, a number of additional issues need to be
taken into consideration, including redundancy, survivability, power draw,
heat dissipation, and air conditioning. POE is usually implemented in the
wiring closet, where the last three items are often overlooked, which
creates other problems and additional cost.
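The sizing arithmetic above can be sketched as a simple power-budget check. This is an illustrative calculation, not a vendor tool: the 15.4 W figure is the 802.3af per-port maximum, while the per-phone draw, port count, and PSU capacity used in any example call are hypothetical inputs.

```python
# Hypothetical PoE power-budget sketch (names and figures are illustrative).
# 802.3af allows up to 15.4 W at the PSE port; a switch without per-port
# power measurement typically reserves the full class budget per port,
# even if the phone actually draws much less.

IEEE_802_3AF_PORT_W = 15.4  # max PSE output per port under 802.3af

def poe_budget(phones: int, draw_per_phone_w: float, psu_capacity_w: float,
               reserve_full_class: bool = True) -> dict:
    """Return required power and whether the PSU covers it.

    reserve_full_class=True models switches that allocate the full
    802.3af per-port budget rather than the measured draw.
    """
    per_port = IEEE_802_3AF_PORT_W if reserve_full_class else draw_per_phone_w
    required = phones * per_port
    return {
        "required_w": required,
        "fits": required <= psu_capacity_w,
        "headroom_w": psu_capacity_w - required,
    }
```

For example, 48 phones at full class budget require 48 × 15.4 = 739.2 W, so an 800 W supply fits with little headroom; remember that this supply must itself be on protected power to preserve dial tone during an outage.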
IP clients can be assigned IP addresses in basically three ways: statically,
partial DHCP, and full DHCP. A static IP strategy is the most secure, but
also the most costly and cumbersome to implement. A full DHCP strategy
is the least costly and easiest to use from the user perspective, but it does
introduce some security risk. VoIP and multimedia clients can use an
existing data DHCP server, or a separate DHCP server can be provisioned
for VoIP and multimedia applications; a separate server is preferable for
security and performance reasons.
Furthermore, the VoIP and multimedia DHCP server should be placed in a
Secure Voice zone with other VoIP and multimedia components such as
Call Servers and application servers to limit access.
3. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
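The recommendation to keep the voice scope separate from the data scope can be checked mechanically when planning addressing. A minimal sketch using the standard library (the subnets in the example are invented, not values from this design):

```python
# Check that a dedicated voice DHCP scope does not overlap the data
# scope, in line with placing VoIP components in a Secure Voice zone.
import ipaddress

def scopes_disjoint(data_scope: str, voice_scope: str) -> bool:
    """True if the voice and data DHCP scopes share no addresses."""
    data = ipaddress.ip_network(data_scope)
    voice = ipaddress.ip_network(voice_scope)
    return not data.overlaps(voice)

# Illustrative plan: data on 10.1.0.0/16, voice on 10.2.0.0/16.
print(scopes_disjoint("10.1.0.0/16", "10.2.0.0/16"))
```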
Support two existing major sites (New York and San Francisco) that have
a traditional PBX with 12,000 existing TDM users, of which 2,000 need to
be mobile.
Support a new campus in Los Angeles for 10,000 new users with
geographical redundancy.
The new campus in Los Angeles is the new corporate
headquarters. The customer wants to maintain the rich set of
telephony features they currently have in their voice network but
move to IP Telephony. They are looking for a fully distributed
solution with geographical redundancy, defined as the ability to
distribute redundant call servers in different locations to provide
site redundancy.
This site will be required to support 10,000 total users. The
customer has determined that 2,000 of the users will receive
sufficient service from digital sets that were freed up from the
existing PBX site.
Support a new regional office in Chicago with 800 users.
The new regional office in Chicago has a requirement for 800
users, and the customer wants the rich set of telephony features
they are already accustomed to, implemented on a pure IP
Telephony platform. They want network transparency and a
common dialing plan for their VoIP network.
Support a growing branch network of twenty to thirty users per
branch.
The customer's new branch office requirement is for much
smaller sites in a different division of the company. They are
looking for a centralized call-processing approach with the same
feature set as the rest of the division due to the mobility of the
users.
The customer's existing branch offices are supported by
traditional key systems. The customer wants to upgrade these
sites to VoIP-capable systems integrated into the overall
network, and to consolidate all voice and data requirements in
the branch into a single platform. This includes data and voice
services such as routing, VPN, VoIP, TDM set support, Call
Center, Unified Messaging, and web support.
In this case, we will assume the customer recently acquired a
company with an installed Norstar* key system that can easily
be upgraded to BCM, maintaining the majority of the
investment in technology, training, and support.
Call Server
The call server provides the basic telephony services traditionally found in
a PBX, along with the new services required to deal with an IP infrastructure.
[Figure: Communication Server 1000E campus deployment — firewall, network, gateway, call server, and signaling server supporting up to 15,000 predominantly IP users]
Regional office
The regional office similar to the new campus has the advantage of building
a new network that can accommodate the requirements and bandwidth of a
real-time network. However, special consideration has to be taken as it will
be considered a smaller site than the campus and, therefore, the tendency is
to take shortcuts and cost cutting measures when building a network for a
location like this.
While this may be acceptable for a pure data network, a real-time voice
network always has to be built to the highest specifications if quality and
connectivity are the goal. This is a design goal that is built into voice
networks and never questioned: in general, a voice network is built to
meet specific bandwidth requirements to carry a specific load of traffic.
Data networks, by contrast, were traditionally built to a constrained
budget, on the assumption that, given the difference between LAN and
WAN technologies, users would never have enough WAN bandwidth.
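The idea that a voice network is engineered to carry a specific load is classically expressed with the Erlang B formula, which gives the probability that a call is blocked when a given traffic load (in erlangs) is offered to a given number of trunks. A brief sketch of the standard recurrence (any traffic figures used with it are hypothetical examples, not values from this design):

```python
# Erlang B blocking probability via the numerically stable recurrence
#   B(0) = 1,  B(n) = A*B(n-1) / (n + A*B(n-1))
# where A is the offered load in erlangs and n the number of trunks.

def erlang_b(offered_erlangs: float, trunks: int) -> float:
    """Probability a call is blocked on `trunks` circuits at load A."""
    b = 1.0
    for n in range(1, trunks + 1):
        b = offered_erlangs * b / (n + offered_erlangs * b)
    return b
```

For instance, one erlang offered to a single trunk is blocked half the time (B = 0.5), while adding trunks drives blocking down rapidly; trunk groups are sized so that blocking stays below a target grade of service.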
[Figure: Communication Server 1000S regional office deployment supporting 100 to 1000 predominantly IP users]
IP Call Center
The CS 1000B is recommended for the distributed Symposium IP Call
Center sites. The CS 1000B is a uniquely configured branch solution that
allows the Symposium call center to be distributed to remote sites. The IP
sets on the CS 1000B are redirected to appear on the central site system, in
this network the CS 1000E.
While frame relay can carry VoIP, doing so is a challenge: frame relay
offers no real QoS, only congestion notification. Therefore, special
engineering and care are needed if you plan to implement VoIP over a
frame relay network.
The new branch network is shown in Figure 23-14. As can be seen, there
are links of various speeds depending on the site location. When dealing
with frame relay, you need to be concerned with a number of issues,
including the speed of the service, access rate, segmentation, shaping,
policing, pacing, and PVC allocation.
Branch Offices
In this case, the Nortel solution would be the Survivable Remote Gateway
(SRG). It is recommended that the frame relay network be built as a full
mesh with separate PVCs for VoIP to assure the best possible QoE. It is
further recommended that all remote sites' frame relay channels be put on
a full T-1 rather than a fractional T-1. Adhere to proper shaping and pacing
to keep the ingress side of the frame relay network from applying policing
actions that may mark offending packets Discard Eligible (DE) or discard
them outright.
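The shaping and policing terms above follow simple frame relay arithmetic: given a committed information rate (CIR) and a committed burst Bc, the carrier measures traffic over intervals of Tc = Bc / CIR, marks traffic between Bc and Bc + Be Discard Eligible, and drops anything beyond. A sketch with illustrative numbers:

```python
# Frame relay shaping/policing arithmetic (all figures illustrative).
# CIR is in bits per second; Bc (committed burst) and Be (excess burst)
# are in bits per measurement interval Tc.

def shaping_interval_s(bc_bits: int, cir_bps: int) -> float:
    """Tc, the committed-rate measurement interval, in seconds."""
    return bc_bits / cir_bps

def classify_burst(bits_in_tc: int, bc_bits: int, be_bits: int) -> str:
    """Carrier policing outcome for traffic offered within one Tc."""
    if bits_in_tc <= bc_bits:
        return "conforming"
    if bits_in_tc <= bc_bits + be_bits:
        return "discard-eligible"   # marked DE, dropped first on congestion
    return "dropped"
```

With CIR = 64 kbps and Bc = 8,000 bits, Tc is 125 ms; shaping the VoIP PVC so that it never exceeds Bc per Tc keeps voice packets out of the DE category, which is exactly what the pacing recommendation above is protecting.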
Gateways
Gateways basically provide some type of protocol conversion function.
Gateways in IP Telephony provide the same functionality to legacy
environments and to different connection protocols. There are proprietary
and third party gateways. Some of the basic gateway functions that may be
required to be performed are as follows:
VoIP to TDM phone
VoIP to PSTN Facilities
VoIP to Legacy Applications
VoIP to Legacy (TDM) terminals
SIP to H.323
H.323 to SIP
Currently, there are two major and competing connection protocols:
Session Initiation Protocol (SIP), developed by the Internet Engineering
Task Force (IETF), and H.323, developed by the International
Telecommunication Union (ITU). Calls can be connected from an H.323-
based system to a SIP-based system through an H.323-to-SIP gateway;
however, only a small set of basic call features is supported.
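The limited feature set of cross-protocol calls follows from how little of each protocol maps cleanly onto the other. The sketch below is a simplified, illustrative mapping of basic call-setup messages only; a real gateway also handles capability negotiation (SDP and H.245), mid-call signaling, and failure cases, and the function name here is invented for illustration.

```python
# Simplified, illustrative mapping of basic SIP call-setup messages to
# their rough H.225 equivalents. Anything outside basic call setup and
# teardown has no direct counterpart, which is why only a small feature
# set survives an H.323-to-SIP gateway.
SIP_TO_H225 = {
    "INVITE": "Setup",
    "180 Ringing": "Alerting",
    "200 OK": "Connect",
    "BYE": "Release Complete",
}

def translate_sip(message: str) -> str:
    """Return the roughly equivalent H.225 message, or raise for
    SIP features with no basic-call equivalent."""
    try:
        return SIP_TO_H225[message]
    except KeyError:
        raise ValueError(f"no basic-call mapping for {message!r}")
```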
The following table lists the Nortel gateway options.
Clients
The customer has a requirement for a number of different clients and
services for all their sites. They prefer a ubiquitous service offering; that
is, one that is seamless across the network. This implies support for many
types of clients and facilities, including wired IP, soft IP, TDM, wireless
IP, and PDAs. It also includes TDM trunking and SIP and H.323 IP
trunking.
Figure 23-15: Support clients and facilities — wired IP, soft IP, wireless IP, PDA, and TDM clients, plus TDM trunking and SIP & H.323 IP trunking: any place, any time, any device
The decision between standards-based SIP or H.323 phones and
proprietary solutions that interoperate with SIP and H.323 is hotly
debated. Some believe that standards-based clients are the only solution,
but these solutions, while reducing cost, generally lack the rich feature set
that proprietary clients provide.
IP clients comprise a number of devices, including hard clients (IP
phones), soft clients (PC clients), wireless clients (PDAs), and multimedia
clients that may support VoIP, video, and Instant Messaging.
Deployment issues for these devices depend on a number of factors, but
generally should be governed by a well-thought-out QoS and security
policy. The QoS strategy should include tagging at either Layer 2 (802.1p)
or Layer 3 (DiffServ). This not only provides QoS on the LAN, but also
allows you to map these priorities to core technologies in the backbone.
Additionally, the voice or multimedia flows can be separated onto
separate physical subnets or separate VLANs.
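At Layer 3, an endpoint can mark its own packets by setting the IP TOS byte so that voice traffic carries the DSCP value for Expedited Forwarding (46). The socket option below is a standard OS facility, but the marking only helps if switches and routers are configured to honor it; the function name is illustrative.

```python
# Sketch of Layer 3 DiffServ marking from an endpoint. The DSCP sits
# in the top six bits of the legacy TOS byte, so EF (46) becomes
# 46 << 2 = 0xB8 on the wire.
import socket

DSCP_EF = 46                 # Expedited Forwarding, commonly used for voice
TOS_BYTE = DSCP_EF << 2      # DSCP occupies the upper 6 bits of the TOS byte

def make_voice_socket() -> socket.socket:
    """UDP socket whose outgoing packets are marked DSCP EF."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_BYTE)
    return s
```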
Clients
There are no best options here, as Nortel supports a plethora of clients,
including analog, digital, IP, wireless, PDA, and third-party devices on all
call server systems. The hard clients get their software load from the call
server they log into and can therefore be used on different platforms with
no changes.
Applications
Applications cover both traditional applications found in telephony, such as
conferencing, unified messaging and call center, along with new
applications found in the multimedia revolution, which include instant
messaging, desktop video, collaboration, follow-me services and
customized personal control (Personal Agent).
Applications can be implemented in both centralized and distributed
architectures. The choice between the two is a complex decision based on
a number of cost, performance, and scaling issues. For instance, in the
case of unified messaging, you need to evaluate the overall requirement
for individual sites and for all sites together to first determine whether a
single system could be used. If a single application platform can support
the overall requirement and scale to the projected growth of the network,
the next step is to determine the network bandwidth required to support a
distributed environment, along with the cost and performance.
If a distributed application approach is required, there are a number of
additional issues. The first is to measure the cost of duplicated services as
opposed to a centralized approach. The second issue is to determine the
complexity of networking multiple application servers over the network to
provide the same level of service and performance as a single server.
Network bandwidth will have to be evaluated even though it should not be
near the load of a centralized solution.
Sometimes the requirements, properly evaluated, make the decision
easier. For instance, where two major sites require a total of two servers to
service the load and each site can be serviced by a single server, the
technically obvious choice is a distributed environment. Note, however,
that this does not take into account the cost of managing and maintaining
the system, which should also be considered.
The basic nature of application servers in a VoIP and multimedia
environment naturally challenges the ability to deliver a high level of
QoE. Most application servers, due to their requirements or architecture,
break down the end-to-end nature the Internet was built on. A Unified
Messaging system, for example, has to store and forward messages; in the
case of VoIP, this can cause double transcoding, which in turn increases
the demand for the network to be loss-free and error-free with minimal
delay and jitter.
Both centralized and distributed approaches have their benefits, depending
on the requirements. However, a long standing computer paradigm has
been to use bandwidth instead of computing cycles whenever possible to
minimize the complexity of the system.
Unified Messaging
When designing a network for a centralized unified messaging platform,
much care should be taken into account on the architecture of the network
and of the unified messaging platform. Most VoIP networks today are
financially based on implementing the G.729 codec (CELP) for bandwidth
savings. This codec is known to be near toll quality performance, but that is
based on a single transcoding.
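The bandwidth savings can be seen with back-of-envelope arithmetic. Assuming 20 ms packetization and 40 bytes of IP/UDP/RTP headers (and ignoring Layer 2 overhead and silence suppression), G.711 at 64 kbps costs about 80 kbps per call at the IP layer, while G.729 at 8 kbps costs about 24 kbps:

```python
# Per-call IP-layer bandwidth for a voice codec, assuming 20 ms
# packetization and 40 bytes of IP+UDP+RTP headers per packet.
IP_UDP_RTP_HEADER_BYTES = 40
PACKET_INTERVAL_S = 0.020

def call_bandwidth_kbps(codec_rate_kbps: float) -> float:
    """IP-layer bandwidth of one voice stream for a codec payload rate."""
    payload_bytes = codec_rate_kbps * 1000 / 8 * PACKET_INTERVAL_S
    packet_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES
    return packet_bytes * 8 / PACKET_INTERVAL_S / 1000
```

The header overhead dominates the compressed codec: G.729 carries only 20 payload bytes per 60-byte packet, which is why the threefold-plus saving over G.711 is attractive enough to tolerate the transcoding issues discussed here.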
Many store-and-forward application servers, such as a unified messaging
platform, may introduce an anomaly called multiple transcoding. This is
where the message is transmitted across the network in G.729, transcoded
back to G.711 when it hits the application platform, and then stored in
another compression mode. When the message is picked up, it may be
compressed back to G.729 for transmission on playback. Depending on
the quality of the network, voice quality may degrade substantially.
In this case, you will want to design the network to be able to either record
or play back at G.711, thereby eliminating a transcoding stage. Another
solution is to store the message in the original algorithm or as an RTP
stream; however, this is in the domain of product enhancements, and such
solutions still have a number of limitations and are generally not well
implemented at the time of this writing. The purpose of this discussion is
to shed light on some implementation issues and potential solutions for
unified messaging architectures, not to discuss the relative advantages and
disadvantages of those architectures.
Unified Messaging
CallPilot is a unified messaging tool that utilizes speech recognition and
TCP/IP digital networking to give complete access and total control of fax,
e-mail and voice messages. Using simple voice commands, like “play” or
“print,” a user can remotely manage their multimedia communications over
the telephone. The user can print faxes, store or delete voice messages and
more just by speaking.
Call Center
As discussed before, call centers are generally customer facing and key to a
company’s success. Therefore, IP Call Centers need to be implemented
with the highest priority for these VoIP flows. Additionally, they should be
made as robust and resilient as possible assuring that there is sufficient
bandwidth along with load sharing appliances to assure maximum
performance. A Call Center is a high candidate for a separate IP network
for the application to assure maximum quality.
It should be realized that once an IP network becomes congested, routers
begin discarding packets and TCP congestion-control algorithms back off
to preserve the network. This is very disruptive to real-time protocols,
which cannot retransmit lost media. One way to protect against this is to
put critical VoIP network requirements on their own network.
Call Center
Symposium Call Center is recommended as the call center solution.
Symposium is an industry leading solution that traditionally is considered a
centralized solution associated with a single Call Server. The unique configuration of the CS 1000B allows Symposium to be implemented as a distributed IP Call Center application.
Multimedia
It should be noted that while multimedia applications are considered real-time applications similar to VoIP, there are differences. These were originally identified in the ATM AAL service-class analysis of applications, which defines the distinguishing attributes as the timing relationship required between source and destination, whether the bit rate is constant or variable, and whether the connection mode is connection-oriented or connectionless.
Multimedia requirements are diverse in their bandwidth and timing
relationships; however, for the most part VoIP is the most demanding when
it comes to the timing relationship. Therefore, if voice is expected to be
near toll quality, it will require being given the highest priority in the
network over all other applications.
For instance, in a true multimedia call between users, where VoIP, video, instant messaging, application collaboration, and FTP may all be happening simultaneously, VoIP is the only service whose degradation puts the entire session in jeopardy. Minor degradation in the quality of the other services is generally a mild distraction, and the video can often be turned off to free bandwidth for other applications and turned on only when applicable.
The MCS5100 Multimedia Call Server
Our VoIP offering is enhanced by multimedia services, which remove the
barriers of distance and location with applications including video
conferencing, instant messaging, collaborative whiteboarding, and
dynamic call handling. These applications are delivered by the MCS5100.
The MCS5100 is designed as a Multimedia Call Server that can provide multimedia services as an overlay to any customer's existing voice network, or a total voice and multimedia solution for new customer requirements. The MCS5100 can act as a standalone system providing a basic set of telephony features available with the SIP protocol and can, additionally, overlay existing PBX solutions with an option called converged desktop.
[Figure: the converged desktop option, with an existing PBX and its current phones linked through a SIP/PRI gateway to the multimedia PC client]
send them an instant message that may appear on the user’s phone, PC,
pager, or whatever the user’s preference is currently set to.
The end user should have the capability of connecting to the conference bridge from a PC soft client, mobile phone, or office phone. The last stage of
a call usually involves discussing or sharing a document. Instead of
collecting FAX numbers and e-mail addresses, the conference chairperson
can send files, push web pages, and share a whiteboard application with
other conference participants.
A highly mobile worker deals with coworkers in several different regions
globally. A worker should be able to communicate with others when they
are in the office using their phone and PC, or when they are at home using
their PC on a cable modem or DSL line, even from a hotel with Internet
access or a web terminal. Communications over the public network should
be protected utilizing secured, encrypted VPN technology.
If this were a Nortel solution, the MCS 5100 would provide Instant
Messaging, conference bridging, whiteboarding and follow-me voice
services.
Meet Me Conferencing
In addition to the base service's ad hoc conferencing capability, an optional "Meet Me" Media Conferencing application is available. Meet Me Media
Conferencing should require no reservations and utilize soft DSP
technology, which reduces the cost and footprint when compared to TDM
based in-house conferencing. Users access the service with a dial-in
number and passcode just like many used today with TDM-based
outsourced conferencing.
The service should support both the G.711 and the G.729 codec to
accommodate lower throughput networks or DSL access. For Enterprises
currently outsourcing their conferencing, Meet-Me Media conferencing
can produce immediate, significant savings.
If this were a Nortel solution, the MCS 5100 provides a highly scalable audio conferencing solution with visual notifications to the chairperson of conference activity and, from an ROI perspective, offers significant savings to an Enterprise that may today be obtaining services through a third-party provider. Unlike most conferencing services, the chairperson is notified of participants entering and leaving the call, so there is no more asking who joined in the middle of the conversation. The solution will support point-to-multipoint video conferencing by late 2004.
Personalized management
A personal agent is a web-based portal for accessing all of the advanced
features listed here. Customizable settings include all of your contact
Figure 23-19: Personal agent (showing the Personal Agent portal and its flexible access options)
Presence is a key feature of doing mobility well. Presence is the concept of
a system treating you as one user with multiple devices. Instead of multiple
phone numbers, addresses, and separate services, the user has a single set
of services, coordinated across their multiple devices.
The Automatic Presence feature can be enabled so that if you do not touch the keyboard or mouse for a selectable period of time, your presence status changes to offline; everyone who has you as a "friend" will then see that you are offline.
For example, if a manager has a little phone icon next to their status, you know they are on the phone and your call will go to voicemail. If you need a quick answer to an easy question, you can instead send an IM and get the answer right back. This allows questions to be resolved while you are still on the phone with the person who asked, reducing multiline phone interruptions and the number of voicemail messages that need to be returned.
MCS 5100 Call Manager allows preferences to be set on how to handle calls. Though an administrator can set defaults and step in when needed, this is designed to be set up by the end user, and it is very intuitive and easy to use. Different profiles can be set up based on whether you are in or out of the office, allowing calls to be routed differently. For instance, calls from your family can be treated differently so that they reach you wherever you are, or perhaps go automatically straight to voice mail.
Utilizing the presence concept, the MCS 5100 provides you a visual
indication of the status of your close contacts (that is, “Friends”). Time can
be saved by looking at their status and availability, helping to decide the
best way to communicate with them at any given time.
Custom applications
Custom applications are the ability to take a multimedia system and leverage the advantage of a standards-based solution, including "killer apps" (applications) that may be developed by a third party. Custom applications are about taking a feature set like that of the MCS 5100 and customizing it for specific purposes.
Conclusion
A major advantage of the Nortel solution is that an existing Nortel
customer does not have to implement a “forklift” strategy to take advantage
of new offerings and capabilities offered in the new IP space. Additionally,
new non–Nortel customers are not necessarily forced into a full VoIP
solution if it is not required. As compelling a story as VoIP and multimedia are, it is becoming clear that not all customer voice requirements call for the flexibility and mobility that VoIP delivers. While
VoIP is probably the ultimate solution, it requires a substantial investment
in the data network to provide and guarantee the same quality and
bandwidth that a traditional TDM system provides. Until data networks can
be built without constrained bandwidth and all devices are QoS aware,
many customers may find it easier to implement traditional voice systems
for many of their voice applications.
Chapter 24
IP Television Example
Ed Koehler
Chris Busch
[Figure: layered view of real-time media transport, from the application perspective to the network: audio, video, and voice codecs over RTP with RTCP; session and media-related control via SIP, H.323, RTSP, and H.248/MGCP/NCS at the session gateway; QoS and resiliency provided by MPLS, ATM (AAL1/2, AAL5), Frame Relay, Ethernet, and Cable/DOCSIS, over SONET/TDM]
Introduction
In today’s carrier, service-provider, and Enterprise networks, the use of
advanced IP-based networks has become more commonly accepted as an
alternative to the more traditional modes of media transport.
The use of this technology for IP-based television has been gradually
increasing without fanfare in the industry at large. For some time, it has
enjoyed acceptance outside of North America, with early beachhead
deployments in both Europe and the Asia-Pacific Rim. Recently, however,
it has begun to generate increased interest from Local Exchange Carriers,
Regional Bell Operating Companies, and Internet Service Providers (ISPs),
as well as Metro Area Transport providers that are beginning to implement
Ethernet-to-the-User (ETTU) networks in residential high-rise
applications.
Enterprises are also finding that properly leveraging high-speed advanced
IP networks allows them to offset much of the cost of implementing a
traditional television headend and fiber/coax distribution system.
IP-based television places very stringent demands on the IP network. Not only is there the requirement for IP multicast capabilities in the network, but also the need for a robust and stable deployment, which has only recently arrived in the market.
In order to understand the improvements that had to be made, it is
important to review IP multicast from a generic perspective. This document
reviews the basics of multicast and then compares that generic working
architecture against the requirements of an IP-based television head end.
By comparing a generic IP multicast deployment against the application
requirements of the reference model, areas where optimization had to be
made can be highlighted.
was not. However, achieving it in a scalable and stable manner, with the
necessary flexibility, proved to be more difficult.
Because routers do not forward multicast or broadcast traffic, a method was needed to provide Layer 3 links across the routed boundary. All multicast routing protocols are methods of addressing this requirement.
DVMRP is a good all-around multicast technology. While it scores comparatively high in network overhead (a result of the routing-table update requirement inherent in vector-based routing protocols), it also scales relatively well and is well adapted to dense-mode networks. This is particularly true when DVMRP is implemented with the right routing policies and features. The newer PIM-SSM technology, however, raises expectations by showing promise of scale beyond that of any other protocol.
There is another consideration that is equally important: IP-based
television is a single-source multicast model. The premise of the
application is to make a single stream (the television channel, which is the
equivalent of an IP multicast group) available to multiple viewers. Both
DVMRP and PIM-SSM are source-driven or reverse-path implementations
of multicast. For this reason, the source of the multicast activity is always
the root of the network tree, unlike a shared-trees approach such as PIM-
SM, which starts the build of the multicast tree from an independent root
known as a Rendezvous Point (RP). For these reasons, both DVMRP and
PIM-SSM are logical choices as Layer 3 routing protocols for IP television
headend implementations.
roughly one half second. The LMQI is the amount of time between the
group-specific queries, and the robustness is a factor of expected data loss
on the network (high values mean high data loss, and, therefore, more
queries will be sent).
By this coordination of the two protocol environments, IGMP and
multicast routing (DVMRP and PIM-SSM), the state of the multicast event
is maintained. Figure 24-2 illustrates several stations on a common Layer 2
segment. One station is leaving multicast group 228.1.1.1, but there is
another client on the segment who is sourcing a membership report. This
signals to the edge router that there is still solicited interest in the channel,
and the edge router does nothing but continues to serve the stream for that
group. There is another client that is leaving 224.1.90.5. Because the router
does not see any other IGMP reports for that group after the standard
interval, it is pruned off of the segment. A new station request, such as the one for 224.1.1.1, will require a reverse-path Shortest Path Tree (SPT) to be set up as a join to the multicast group. Building the tree join incurs some latency: the farther the client is from the sending source, the longer the setup latency will be.
edge, where the IGMP control is accomplished along with the routing
functionality at the same interface.
1. Although some implementations (specifically those of ADSL providers) are looking to reduce this
requirement, others (like some cable applications) use more bandwidth for better image quality.
2. It should be noted that speeds as high as 6 or 8 Mbps are also used in Standard Definition Video.
3. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
4. The requesting client, and build the extension of the tree out to the new viewer.
5. To ease the configuration task of such tables, a centralized management system usually allows bulk
configuration for hundreds of switches with such entries in the Nortel Layer 2/3 core switch
implementations.
6. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
viewing the same channel and one of them changes to another channel,
both STBs will lose the channel. The second STB will source an IGMP
report for the channel and the edge router will join the port back onto the
event so the loss of video will be intermittent, but it could be for up to one-
half second. This is enough to cause issues. As a result, a feature known as
adjustable LMQI was developed, which allows for multiple STBs on a
single VLAN model. This allows for the fine-tuning of the LMQI value,
and its association with the IGMPv2 leave improves the handling of this
process.
In this scenario, when the first STB sends a leave, the Layer 2 switch
removes the STB port from the group. But the edge router still keeps the
stream active. At this point, the Last Member Query Interval (LMQI) timer
is started at the edge router. During this time, the edge router listens for any
report activity for the multicast group, following the IGMPv2 leave
process. In the case of the above example, STB2 will answer the group-specific query by sourcing an IGMP report within the interval. When the switch sees the
report, it continues to serve the stream and STB 2 does not experience a
service interrupt. If the LMQI were to expire and no reports were received,
the stream would be deactivated. Figure 24-6 illustrates these features.
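The timer arithmetic behind this behavior can be sketched as follows. The defaults shown (a 1000 ms LMQI and a robustness value of 2) follow the usual IGMPv2 defaults, and treating leave latency as a simple product is a worst-case simplification.

```python
def leave_latency_ms(lmqi_ms=1000, robustness=2):
    """Worst-case time the router keeps serving a group after a leave:
    it sends `robustness` group-specific queries spaced LMQI apart and
    prunes the stream only when none of them is answered."""
    return lmqi_ms * robustness

print(leave_latency_ms())             # IGMPv2 defaults: 2000 ms
print(leave_latency_ms(lmqi_ms=250))  # a tuned LMQI cuts this to 500 ms
```

Lowering the LMQI shortens the interval during which stale streams are served, at the cost of more query traffic on the segment.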
7. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
case. There are methods to use the RTSP port connection as noted below.)
The figure below illustrates these dialogs.
the case even though the service provider has already allocated 10.5 Gb/s to provision for 50% of the subscriber base. Clearly, there needs to be another option for VoD deployment.
All of this is symptomatic of centralized serving of the video-on-demand service: centralized server approaches are inherently prone to these issues. Distributing the server process out to the edge provides a much more scalable method for video-on-demand services, for two reasons. First, by distributing servers closer to the viewing base, the bandwidth aggregation factor is reduced in proportion to the reduction in the size of the subscriber base each server serves.
As an example, if the 7,000-subscriber network were served by ten servers instead of one, the subscriber base per server would be 700, yielding a bandwidth aggregation factor of 2.1 Gb/s at 100% provisioning. By provisioning for 50%, the bandwidth factor becomes a manageable 1.05 Gb/s, and only 350 subscribers are thrown to the wolves with the possibility of service refusal during the average program run. Now multiply the number of servers by four, so that the 7,000-subscriber network is served by forty VoD servers. Each server will then provide streams to 175 subscribers, for a bandwidth aggregation factor of a quite manageable 525 Mb/s. In this scenario, the service provider can easily provision for 100% of the viewing base, resulting in a true real-time video-on-demand service offering.
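The arithmetic of this example can be reproduced with a short sketch. The 3 Mb/s per-stream rate is an assumption chosen to match the chapter's figures; it also reproduces the 10.5 Gb/s quoted earlier for 50% of a 7,000-subscriber base on a single server.

```python
STREAM_MBPS = 3.0  # assumed per-subscriber VoD stream rate

def aggregate_gbps(subscribers, servers, take_rate=1.0):
    """Peak bandwidth each VoD server must source, in Gb/s, when
    `take_rate` of its share of subscribers watches simultaneously."""
    per_server = subscribers / servers
    return per_server * take_rate * STREAM_MBPS / 1000

print(aggregate_gbps(7000, 1, 0.5))   # centralized: 10.5 Gb/s
print(aggregate_gbps(7000, 10))       # ten servers: 2.1 Gb/s each
print(aggregate_gbps(7000, 10, 0.5))  # ten servers at 50%: 1.05 Gb/s
print(aggregate_gbps(7000, 40))       # forty servers: 0.525 Gb/s each
```

The per-server burden falls linearly with the number of servers, which is exactly the distribution effect the text describes.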
The second aspect to consider is that by distributing VoD servers out towards the edge, the RTSP control-flow loop is likewise shortened. This improves the user experience and reduces the burden on each server because of the smaller subscriber base it supports. The figure below shows a network topology in which the VoD servers are placed at the aggregation point closest to the subscriber edge. As shown, the viewing population served by each server is reduced, and the video-on-demand traffic burden is largely lifted off the core of the network, freeing it up for multicast video as well as other real-time services such as VoIP.
Any other requests for that piece of content would be served from the edge.
After a given period of time, provided no one else requests the content, the
suffix would be purged from the edge server, whereas the prefix would
remain cached at the edge for the next request.
The distribution of content is not in itself sufficient to address the complete demands of a client's request for media. There needs to be a way to route the client's request to the appropriate video server. When a video asset is published into the video server system, metadata is typically created. This metadata may provide the content description and duration, as well as the stream speed and the appropriate player call. Another item created is a URL for the content request. By providing an RTSP redirection to the local edge server for each VoD request, the content distribution model described previously can be leveraged.
Appendix F contains further details on the Web Streaming methods and
practices.
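A minimal sketch of the RTSP redirection described above might look like the following. The host and asset names are invented for illustration, and a real request router would choose the edge server from its topology and content-placement data.

```python
# Hypothetical request router: answer a client's RTSP request with a
# 302 redirect pointing at the nearest edge VoD server. The host name
# and content path below are invented for illustration.
def redirect_response(cseq, edge_host, content_path):
    return (
        "RTSP/1.0 302 Moved Temporarily\r\n"
        f"CSeq: {cseq}\r\n"
        f"Location: rtsp://{edge_host}/{content_path}\r\n"
        "\r\n"
    )

print(redirect_response(2, "edge-vod-01.example.net", "assets/movie42.mpg"))
```

The client then re-issues its DESCRIBE/SETUP sequence against the edge server named in the Location header, keeping the unicast stream off the network core.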
VoD bandwidth
To understand bandwidth in a Cable MSO network, we will describe how bandwidth is determined in the last mile and, therefore, how services are constrained for VoD.
A 6 MHz North American cable channel modulated using 256 QAM yields 38 Mb/s of derived bandwidth. Therefore, at MPEG-2 encoding qualities, it is reasonable to assume each channel carries ten VoD services, each equal to 3.8 Mb/s.
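That channel arithmetic can be checked with a one-line calculation, using the rates assumed above:

```python
def vod_services_per_channel(channel_mbps=38.0, service_mbps=3.8):
    """Whole VoD streams fitting in one 6 MHz 256-QAM channel."""
    return int(channel_mbps // service_mbps)

print(vod_services_per_channel())  # 10
```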
VoD narrowcasting
A Cable network, as with any network, requires segmentation in order to
scale to the subscriber base and offer additional bandwidth efficiency.
Cable networks accomplish segmentation via RF combiners in the Hubs
serving subscribers. If 5-550 MHz of frequency is always present on one
coax feed, you can “insert” higher frequency ranges at an individual hub
for “narrowcasted services.”
Example:
Broadcast and Inband set-top services are always present and carried to all
hubs as the “flat” 5-550 MHz network. In the Hubs, we choose to add VoD
into the 550-750 MHz range.
We do so by contributing the services to the proper RF channels, then combining them with the "flat" 5-550 MHz network, thereby delivering 5-750 MHz with the 550-750 MHz range unique to this hub alone. This method of micro-segmentation of the frequency plant is known as spatial reuse; the term is appropriate because we reuse sections of the 550-750 MHz space for every narrowcast segment created.
Summary
This chapter has covered the major aspects of unidirectional video transport utilizing IP networking technology. First, the multicast delivery of video
content was covered with a review of industry prevalent methods and
directions for multicast technology. Many optimizations need to be made to
the standard Internet Service Model (ISM) for multicast. Among these are
DVMRP and PIM multicast static routes, IGMP snooping and timing
optimization, as well as access control and management. The reader should
be able to discuss these enhancements and how they provide drastic
increases in the performance profile of the multicast service model for the
support of IP-based television services in a noncable network topology
environment. The reader should also be able to discuss dense versus sparse
mode multicast models, as well as single source modifications to sparse
mode multicast.
Aspects of unicast video were also discussed. First, standards-based Video
on Demand services that utilize RTSP/RTP transport were covered. The
mechanics of each protocol were discussed and how they relate to the
actual service from the user, client, or set-top box perspective. Traffic engineering concerns were also discussed, along with the different bandwidth demands that Video on Demand services place on the IP network compared with multicast video delivery. Centralized versus distributed
Video on Demand architectures were discussed with a comparison of
bandwidth demands for each.
Finally, VoD service offerings within the cable provider environment were
covered. Cable networking communication paths were discussed with
particular emphasis on Video on Demand. The reader should be able to
explain how IP-based VoD services are ‘overlaid’ onto the QAM transport
that is used in the CATV network. The reader should also be able to
describe the initialization of the set-top box and how it is brought up on line
to the cable service offering.
Appendix A
Additional Details about TDM
Networking
SONET/SDH hierarchy
Knowing how many voice channels fit into each level of the hierarchy allows the total payload of each system to be calculated. Moving up the chart in Figure A-1, for example, we see:
DS1: 24 channels
DS3: 24 channels X 28 DS1/DS3 = 672 channels
OC-3: 24 channels X 28 DS1/DS3 X 3 DS3/OC-3 = 2,016 channels
OC-12: 24 channels X 28 DS1/DS3 X 12 DS3/OC-12 = 8,064 channels
OC-192: 24 channels X 28 DS1/DS3 X 192 DS3/OC-192 = 129,024 channels
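The multipliers above can be captured in a short helper; it assumes the SONET convention that an OC-n carries n DS3 payload equivalents.

```python
# Voice-channel capacity of the digital hierarchy, per Figure A-1.
DS0_PER_DS1 = 24  # voice channels per DS1
DS1_PER_DS3 = 28  # DS1s per DS3

def channels(oc_n):
    """DS0 voice channels carried by an OC-n (n DS3 equivalents)."""
    return DS0_PER_DS1 * DS1_PER_DS3 * oc_n

for n in (3, 12, 192):
    print(f"OC-{n}: {channels(n):,} channels")
# OC-3: 2,016  OC-12: 8,064  OC-192: 129,024
```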
Appendix B
RTP Protocol Structure
The structure of the RTP packet is shown in Figure B-1. The following
paragraphs give a detailed description of each field.
[Figure B-1: RTP packet structure, carried within a UDP packet. First 32-bit word: V(2), P, X, CSRC count, M, payload type, and sequence number; second word: timestamp; followed by the source identifiers (CSRC list), with any padding counted in the final octet]
length of the variable elements depends on the packet type, but it must end
on a 32-bit boundary. The alignment requirement and a length field in the
fixed part of each packet are included to make RTCP packets "stackable". This means that multiple RTCP packets can be concatenated to form a
compound packet to be sent as a single packet of the lower layer protocol,
such as UDP. No separators are needed. An example of a compound RTCP
packet as produced by a mixer is shown in Figure B-2.
[Figure: a compound RTCP packet produced by a mixer: an SR with sender and receiver report blocks, an SDES packet carrying CNAME, PHONE, and LOC items for the SSRCs of site 1 and site 2, and a BYE packet with a reason, all stacked within a single UDP packet]
Figure B-2: Example of an RTCP compound packet
Appendix C
Additional Information on Voice
Performance Engineering
This appendix provides additional details and discussion of jitter and the
jitter buffer.
A dedicated voice packet network in a steady state (that is, carrying continuous voice calls with no silence suppression, no data load, and no congestion) will have a quasi-static flow pattern. Thus, there will be no packet jitter. There will be a distribution of delay across the various calls, but the delay will be invariant for any particular call. In a changing flow pattern, where voice calls are being set up or cleared down, or where silence suppression is in use, the changing instantaneous load versus the output link speed gives rise to changing contention for the output link, resulting in homogeneous jitter. When data is added (with associated forwarding classes to prioritize voice and data traffic), the relative traffic load of each forwarding class versus the output link speed gives rise to changing contention for the output link, resulting in heterogeneous jitter.
Figure C-1 shows the relationship between the loading (utilization) of a
link and the amount of jitter experienced on the delay. Note that lower
speed links have generally higher jitter at all values of utilization, and that
lower speed links also show inflated jitter at lower loading than do high-
speed links. This reflects statistical smoothing of the traffic for high-speed
links.
Network jitter
Network jitter refers to the jitter in a core network, which generally uses high-speed links (>= 10 Mb/s). Jitter is no longer significant above 10 Mb/s, provided the post-90% loading asymptote is not reached. In situations where the statistical multiplexer output link loading is less than 90%, jitter is bounded to a few milliseconds and has negligible or no impact on voice quality. Where statistical multiplexer output link loading is not controlled (that is, there is no admission control or under-provisioning), loading is unbounded (>90% to 100%) and, therefore, jitter is unbounded; the delay
can rise asymptotically. Voice quality becomes unpredictable and unstable, especially as packets are being dropped. For every 10 ms of additional jitter, voice quality degrades by 0.5R at a delay of 150 ms, 1R at 200 ms, and 1.3R at 250 ms (delay here is the one-way mouth-to-ear delay).
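The quoted degradation slopes can be turned into a rough calculator. Holding the slope constant between the stated delay points is an assumption; the text gives only the three anchor values.

```python
# R-factor loss per 10 ms of jitter at each quoted mouth-to-ear delay.
SLOPES = [(150, 0.5), (200, 1.0), (250, 1.3)]  # (delay ms, R per 10 ms)

def r_degradation(jitter_ms, delay_ms):
    """Approximate R lost to `jitter_ms` of jitter at a given one-way
    delay, using the slope for the highest quoted delay point reached."""
    slope = SLOPES[0][1]
    for d, s in SLOPES:
        if delay_ms >= d:
            slope = s
    return (jitter_ms / 10.0) * slope

print(r_degradation(20, 150))  # 1.0 R
print(r_degradation(20, 250))  # 2.6 R
```

The same jitter is thus twice as costly on a 250 ms path as on a 150 ms path, which is why jitter budgets must tighten as end-to-end delay grows.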
The potentially deleterious effects of homogeneous jitter and packet loss are only adequately resolved by bounding the % load, using network
Access/source jitter
Access jitter refers to the jitter in the access network that generally uses
low-speed links (< 10 Mb/s). As the data loading increases relative to the
voice, the probability that a data packet is in the process of transmission
increases, and the voice jitter increases, even with strict priority of voice
over data. For a given relative voice/data loading, as link speed drops, long
data packets take more serialization time, scaling the voice jitter. Jitter in
all low-speed packet access networks (cable, Enterprise, xDSL) can dwarf
network jitter in high-speed networks by several orders of magnitude.
Depending on the access link speed, the potentially deleterious effects of
heterogeneous jitter may be bounded by limiting data load, and segmenting
and/or preempting long data packets.
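The serialization effect described here is easy to quantify. The packet and link sizes below are illustrative; 256 kb/s matches the DSL simulation conditions shown in the figures that follow.

```python
def serialization_ms(packet_bytes, link_kbps):
    """Milliseconds a packet occupies the link. Even with strict voice
    priority, a voice packet can wait this long behind one data packet
    that is already being transmitted."""
    return packet_bytes * 8 / link_kbps

print(serialization_ms(1500, 256))     # ~47 ms: full-size frame on 256 kb/s DSL
print(serialization_ms(1500, 10_000))  # 1.2 ms on a 10 Mb/s link
```

This is why segmenting or preempting long data packets matters on low-speed access links but not on high-speed core links.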
[Figure: jitter distribution as a function of changing % voice load; voice traffic through a single-queue statistical multiplexer (E/D/1), 10 ms G.729 voice packets with silence suppression. Instantaneous source jitter delay reaches 94 ms. Simulation conditions: 256K DSL links, 24 calls on AAL2, CU = 1 ms]
Figure C-2: Jitter delay distribution on a congested link (> 90% average
voice load)
[Figure: the same simulation (256K DSL links, 24 calls on AAL2, CU = 1 ms, G.729 10 ms with silence suppression) with a congestion control mechanism in place; jitter drops to 3 ms]
[Figure: top/down approach. In the QoE space, define the service QoE performance metrics and targets; in the network architecture (QoS) space, identify the network-level contributing factors and dependencies affecting the QoE metrics, then determine the QoS-enabled network architecture requirements and configuration. Iterate until the QoE targets are met, at which point the service QoE requirements are validated as provided by the QoS-enabled solution]
Figure C-5: Process for determining user level (QoE) and network level
(QoS) requirements
Performance metrics and targets defining QoE for the different telecom
services are in different states of development. The requirements for
interactive voice services are more or less completely understood, while the
requirements for browsing and remote applications remain undetermined.
Voice service users are interested in experiencing clear, noise-free, and
echo-free conversations. All the parameters contributing to the QoE of an
ordinary voice call have been combined in an industry standard model
(ITU-G.107, the E-Model). In order to provide an estimate of voice quality
based upon The E-Model, a set of sixteen input parameters is required to
generate its output factor—the transmission rating (R). Some of these
parameters depend on underlying packet network behavior, and various
methods assist in deriving estimates of these parameters using analytic or
simulation tools such as OPNET* Modeler.
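As a sketch of how R can be estimated when most inputs are left at their defaults, the following uses a widely cited simplification of the E-Model (R = 93.2 - Id - Ie_eff, with Id a piecewise-linear function of delay). Treat it as illustrative only, not as a substitute for the full sixteen-parameter model.

```python
def delay_impairment(d_ms):
    """Id: impairment from one-way mouth-to-ear delay (ms), using the
    common piecewise-linear simplification of ITU-T G.107."""
    extra = 0.11 * (d_ms - 177.3) if d_ms > 177.3 else 0.0
    return 0.024 * d_ms + extra

def r_factor(delay_ms, ie_eff=0.0):
    """Transmission rating R with all other E-Model inputs at defaults."""
    return 93.2 - delay_impairment(delay_ms) - ie_eff

print(round(r_factor(100), 1))             # ~90.8: short-delay, low-impairment call
print(round(r_factor(250, ie_eff=11), 1))  # ~68.2: long-delay path, low-rate codec
```

The knee at 177.3 ms is what makes delay above roughly 150-200 ms so costly: beyond it, every additional millisecond erodes R more than four times faster.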
Table C-1 shows the voice quality performance targets derived from known user experience of PSTN calls. These targets are what we should aim for to provide quality equivalent to the PSTN network. Targets are
presented for both the A-side and B-side listener. Note that local and
regional calls have no ECAN. ECANs are active in National/International
and mobile calls. It has been determined that a difference of 3R is not
noticeable by typical users and, therefore, packet networks could be
engineered within this margin in order to provide an equivalent
replacement technology. A difference of 3-7R might be noticeable but
most likely acceptable. Larger R degradations (greater than 7R) are more
likely to be noticeable.
Table C-1: Voice quality performance targets based upon known user experience. Note that PBXs have a better loss plan and echo control, resulting in a better R. Also, local and regional calls have no ECAN active, contrary to national and international calls.
[1] Should be engineered to meet E2E delay, 10ms is recommended - based upon Succession Voice Quality & Bearer Interworking Accreditation v3.3 (Blouin & Bruckheimer),
[2] sufficiently low that loss occurs only randomly, as packet loss concealment algorithms are not efficient on bursty losses
[3] Succession_UA-AAL-1_VQ_ECAN_Planning_Report (F. Blouin, M Armstrong, R. Britt)
[4] J. G. Gruber and L. Strawczynski, "Subjective effects of variable delay and speech clipping in dynamically managed voice systems," IEEE Transactions on Communications, vol. COM-33, pp. 801--808,
[5] Performance of Analogue Fax over IP (Brueckheimer)
[6] Based upon a 20,000 km international call, 100 ms was allocated for propagation delay
[7] 20ms includes 10ms for propagation and 10ms for message processing. This budget allows for 4-6 messages @ 20ms to complete a transaction
[8] IEEE INFOCOM 2002 1 Perceived Quality of Packet Audio under Bursty Losses Wenyu Jiang, Henning Schulzrinne
[9] Impact of Network Outages on Voice Quality (F Blouin. L.Thorpe)
[10] PMO: Present mode of operation, that is the existing infrastructure, most likely TDM. FMO: Future Mode of Operation, that is, the new packet/replacement solution
[11] assumes most of the core delay will be due to propagation delay - should be budgeted to accommodate international calls of 15,000-20,000 km
[12] depends on the signalling protocol - H.323, SIP…
[13] not specified in the standard, implementation based, default is set to 1.6 sec
[14] 10^-5 was obtained from the 10^-6 max BER in G.1010 and the packet size (20 ms)
Table C-2: PSTN wireline conversational voice, voice band data and call control
performance
1. “Context” refers to things like the order in which the test cases are presented in the experiment, the
range of quality between the worst and best test cases used in the experiment, and whether the
subjects are asked to do a task before making a rating. If an experiment is repeated exactly (with different subjects), similar scores will be obtained within a known margin of error. This is not the
case from one experiment to another. Consistency from test to test is found in the pattern of scores,
not in the absolute value of the scores. For example, the MOS-LQS for G.711 may be 4.1 in one study,
3.9 in another, and 4.3 in a third, but whatever the value obtained, we expect to see a higher score for
G.711 than G.729, and G.729 and G.726 (32 kb/s) to be about equal.
[Figure: Objective measurement of listening quality. The original signal and the output of the system under test are compared by a perceptual difference model whose result is interpreted by a cognitive model.]
2. Many MOS-LQOs have been defined; aside from PESQ, the best-known are PSQM (Perceptual Speech
Quality Measure), and PAMS (Perceptual Analysis Measurement System). As the standard, PESQ
should be used in preference to the older measures.
Figure C-7:The normal operating range of R, along with regions under the
curve shown with their interpretations for network planning.
Generally, relative comparisons of R are used to show changes
expected in a shift from an existing to a new network, or
differences between one proposed network and another. The
descriptions given here can be useful in the interpretation of
absolute values of R where that is necessary.
Appendix D
Additional Information about IPv6
Concepts covered
Routing in IPv6
Additional Details of Network Control in IPv6
Application Programming Interfaces in IPv6
Detailed Descriptions of Tunnelling Mechanisms
More on Interworking between IPv4 and IPv6
Introduction
This appendix extends the chapter on IPv6 technology in the main body of
the book, covering concepts and functionality that are generally not related
to the performance or design of networks for real-time applications. The
information is presented here because it is necessary to fully understand
how an IPv6-enabled network works.
Link-local FE80::/10
IPv6 introduced the concept of address scopes in order to avoid the ad hoc
mechanisms based on the Private Addressing Scheme (address 10.x.x.x
etc) [RFC1918] and using IPv4 Network Address Translators to
circumvent the shortage of IPv4 globally unique addresses.
Addresses with a particular scope are not valid outside that scope and
should not be propagated outside that scope. The link-local scope
(addresses only valid on the wire to which the interface is connected) and
global scope are clearly defined and have been fully accepted. However,
unicast site-local scope addresses have been extremely contentious for a
number of reasons, including the following:
the difficulties of defining the bounds of a site;
ensuring that any addresses which leak outside the site do not result
in ambiguous routing or loops; and
supporting the merging of sites.
There are a few changes in terminology for the parts of IPv6 addresses. In
IPv4, the address is logically split into a ‘network part’ and a ‘host part’.
Originally, the boundary was at a multiple of eight bits depending on the
class of the address (‘classful addressing’). Classless Inter-Domain
Routing (CIDR) modified this so the boundary could be at any bit position,
removing the concept of address classes.
‘Host part’ is really a misnomer because the address applies to an interface
rather than the whole host; this has been remedied in IPv6. All the
interfaces sharing the same network part make up an IPv4 ‘subnet’. In IPv4
networks, each interface can only have a single IP address. Consequently,
the subnet is usually identified with the physical link to which the
interfaces are connected – in some cases techniques are used to bridge
several physical links into a single virtual link and the subnet then spans the
whole virtual link.
The possibility that interfaces can have more than one IP address is a major
difference between IPv4 and IPv6. As a result, the identification between
subnets and links disappears from IPv6 and each link can support several
subnets.
In IPv6, each interface has an Interface Identifier (IID) replacing the ‘host
part’. The rest of an IPv6 address is the Subnet Identifier corresponding to
the IPv4 ‘network part’. As in IPv4, a contiguous set of bits starting from
the left-hand end of the address is known as an (Address) Prefix. The count
of significant bits in the Prefix is the Prefix Length. The same notation used
in IPv4 CIDR is used to specify prefixes.
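The split between Subnet Identifier and IID, and the shared CIDR prefix notation, can be illustrated with Python's standard ipaddress module; the address below is an arbitrary documentation-range example, not one from the text:

```python
import ipaddress

# A /64 prefix: 64 significant bits (the Subnet Identifier) followed by
# a 64-bit Interface Identifier (IID), written in the same CIDR notation
# that IPv4 uses.
iface = ipaddress.IPv6Interface("2001:db8:1234:5678:0211:22ff:fe33:4455/64")

print(iface.network)            # the subnet: 2001:db8:1234:5678::/64
print(iface.network.prefixlen)  # the prefix length: 64

# The low 64 bits of the address are the Interface Identifier.
iid = int(iface.ip) & ((1 << 64) - 1)
print(hex(iid))                 # 0x21122fffe334455
```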
Although the term ‘subnet’ is extensively used in IP networks, it is actually
not defined at all for IPv6 and is not very well defined in IPv4. The
interfaces belonging to an IPv4 subnet share a host part size and the value
of the network part. The logic for forwarding packets in IPv4 assumes that
addresses with the same network part are connected to the same link as the
sending interface and so can be sent directly without involving a router. A
subnet in IPv6 is a set of interfaces that share a Subnet Identifier. Typically,
they will all be attached to a single link but, as in IPv4, it is possible for
them to be distributed on several links. Forwarding of packets directed to
multilink subnets is more complicated than for single link subnets and they
should probably be avoided, if possible. However, they can be used for very
simple networks that want to avoid setting up any routing at all but, for
example, need to use several Ethernet segments.
An IPv6 node cannot determine whether the destination interface for a
packet is connected to the same link just by inspecting the Subnet
Identifier. Instead, the node normally relies on information from the routers
on the link to determine which Subnet Identifiers are ‘on link’ (see
“Neighbor Discovery and Stateless Auto-Configuration” for further
details).
Each interface has a link local address that is created and used as part of the
node startup process. The link local prefix can be thought of as identifying
a ‘link subnet’ that encompasses all the interfaces connected to the link –
other subnets derived from other sorts of addresses may cover only a subset
of those interfaces. The link local subnet identifier is reused on each link,
but this doesn’t cause problems because packets using these addresses are
never forwarded beyond the local link.
The original proposal for site-local addresses allocated a single prefix to be
used on all sites in a similar way to IPv4 ‘private addresses’, which are
extensively used in Enterprises with NATs. This proposal has been very
contentious. The same site-local addresses would have been used in many
sites, which makes it very difficult to merge addressing schemes when
companies are reorganized and can lead to routing ambiguities if routes to
these addresses leak out to the global routing system.
Site-locals have now been replaced by globally unique ‘local use’
addresses [I-D.ietf-ipv6-unique-local-addr]. This scheme allows for a set of
essentially unique site prefixes to be created either by acquiring one from a
central source or generating a random prefix locally using cryptographic
hashing techniques. Such prefixes have a very low probability of clashing
with any other local use prefix. The prefixes would not normally be used
outside the sites owning them or associated sites that agree to co-operate,
but would cause fewer problems if they did ‘escape’. They will
significantly reduce the problems of merging two sites, as it would be
extremely unlikely that the sites had the same local use prefix.
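As an illustration of the locally generated option, the following sketch derives a pseudo-random local-use prefix by hashing the current time with an interface's EUI-64, along the lines of the algorithm in [I-D.ietf-ipv6-unique-local-addr]; the exact field layout here is a simplification:

```python
import hashlib
import time

def generate_local_prefix(eui64: bytes) -> str:
    """Return a pseudo-random fdxx:xxxx:xxxx::/48 local-use site prefix."""
    assert len(eui64) == 8
    # Hash the current time together with an interface EUI-64 and keep
    # the least-significant 40 bits of the digest as the Global ID.
    key = int(time.time() * 2**32).to_bytes(8, "big") + eui64
    global_id = int.from_bytes(hashlib.sha1(key).digest()[-5:], "big")
    # 8-bit prefix 0xfd plus the 40-bit Global ID gives a 48-bit prefix.
    prefix = (0xFD << 40) | global_id
    return "{:04x}:{:04x}:{:04x}::/48".format(
        prefix >> 32, (prefix >> 16) & 0xFFFF, prefix & 0xFFFF)
```

Two sites generating prefixes this way have a vanishingly small chance of choosing the same one, which is what makes later mergers painless.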
The aggregatable global unicast addresses are usually constructed by
adding a 64 bit prefix to a 64 bit IID. Often the IID will be a globally
unique number that can be combined with any address prefix to make an
address. The IID can be derived from the MAC address of the interface,
manually configured or generated cryptographically. At present, the
address prefix would be delegated from a provider that would also provide
routing of IPv6 traffic to and from nodes using this address prefix. This
‘provider addressing’ (PA) is designed to support the policy of ‘strict
aggregation’ described in “Unicast Routing and Addressing” .
The IPv6 address architecture allows for two types of addresses:
Unique stable IPv6 addresses: Assigned though manual
configuration, a DHCP server or auto-configuration using the IID
derived from a MAC address.
Temporary transient IPv6 addresses: Assigned using a random
number for the IID.
Transient addresses can be generated cryptographically and altered from
time to time to address security and privacy concerns as described in
[RFC3041]. Cryptographically generated addresses can also be used to
help secure the process of neighbor discovery (see “Neighbor Discovery
and Stateless Auto-Configuration” ).
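A minimal sketch of how such a transient IID can be derived, in the spirit of [RFC3041]: hash a stored history value together with the stable IID and split the digest. The function name is hypothetical and details are simplified from the RFC:

```python
import hashlib

def next_temporary_iid(history: bytes, stable_iid: bytes) -> tuple:
    """Return (temporary IID, next history value), each 8 bytes."""
    digest = hashlib.md5(history + stable_iid).digest()
    iid = bytearray(digest[:8])
    iid[0] &= ~0x02  # clear the universal/local bit: locally administered
    # The other half of the digest seeds the next iteration.
    return bytes(iid), digest[8:]
```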
it would effectively have limited the number of top level Internet providers
to 8192.
The basis of ‘strict aggregation’ is that a network should acquire address
space for its network from a provider – Provider Addressing (PA). The
provider delegates authority for this part of the address space to the
customer and, in turn, will provide the default route for traffic to and from
the customer using these addresses. The addresses come from a larger
block delegated to the provider by a ‘larger’ provider who in turn will route
traffic to and from all of their customer providers. Address delegation is
repeated up to the point where the highest level providers get address space
from the regional Internet Routing Registries such as APNIC, ARIN*,
LACNIC and RIPE. These registries, in turn, have address space delegated
to them by ICANN*/IANA.
The essence of the Internet is global connectivity, so a provider needs to be
able to route traffic from customers to any other destination. Providers at
the top level generally do this by connecting to all the other top level
providers through Internet exchange points and private peering
connections. All the others use a default connection through their parent
provider to route traffic not addressed to customers of peer networks with
which they connect directly.
Providers can build as many connections as they find economically and
technically expedient between their networks and other providers at any
level in the address delegation tree to carry traffic between their customers.
If this scheme is correctly implemented, the number of routes that a
provider has to deal with will be limited by the number of providers that it
peers with, rather than the number of customers who want to advertise
individual routes, as now happens in IPv4. Strict aggregation seeks to
prevent traffic for third parties being carried across links between provider
peers: providers will normally filter incoming and outgoing traffic to
eliminate packets that are not going to or from their customers.
It is now left up to the registries to define policies on what size of address
block they will delegate to providers with particular sizes of customer base,
and to provide recommendations on how this space should be further
divided up by the customers.
A typical small Enterprise or home network might expect to get a ‘/48’
address prefix, whereas a large Enterprise or a medium-sized provider
might get a /32 or /35 address prefix. This gives a lot more scope to
network managers to produce creative addressing plans as compared with
the very restrictive allocations now being given out for IPv4. For example,
the ‘sparse addressing’ plan can be used; instead of packing allocations as
tightly as possible and with a minimal allowance for growth, as has usually
been done with limited IPv4 allocations, it may be desirable to allocate
each new subnet from the middle of the available spare space. This may
seem wasteful, but it will give maximum scope for growth without the
nuisance of having to renumber. Even with a /48 allocation an Enterprise could configure more than 65,000 sub-networks with /64 prefixes using 64-bit IIDs, and there will still be something like 1,500 addresses per square meter of the Earth’s surface.
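The arithmetic behind these allocation sizes is straightforward:

```python
# A site with a /48 prefix and 64-bit IIDs has 64 - 48 = 16 bits of
# subnet identifier to play with.
subnet_bits = 64 - 48
print(2 ** subnet_bits)   # 65536 possible /64 subnets within a /48

# A /32 provider allocation can in turn delegate 2**(48 - 32) /48 sites.
print(2 ** (48 - 32))     # 65536 /48 delegations within a /32
```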
Anycast
Anycast is a new facility in IPv6, intended to ease the implementation of
services that should be available from any node but may be implemented
on more than one server. In many cases, this is useful just because it is
more efficient and more scalable to access the ‘nearest’ server rather than
centralizing the service; but, it is also convenient if the information
delivered should be different depending on where the request is made from.
At the application level, this might be because the information relates to the
geographic or topological location from which the request is made. One
example might be the network information delivered by ‘two-faced’ DNS
implementations, which restrict the publicly accessible information to
protect privacy but provide full information to insiders.
An anycast address is indistinguishable from an ordinary unicast address—
the distinction lies in what the destination does. At present, anycast is only
implemented on routers. The router is responsible for knowing the correct
server corresponding to an anycast address and forwarding the message
accordingly.
One application that has been suggested for anycast is locating a DNS
server without having to have the address configured or supplied by DHCP.
Multihoming in IPv6
The strict aggregation rules for Provider Addressing provide a major
problem for IPv6 traffic management that has not yet been resolved at the
time of writing. Traffic using a destination address delegated from a
provider normally has to be routed through that provider. Similarly,
outgoing traffic using a delegated address as source address has to be
routed through the delegating provider. This makes it extremely difficult to
provide redundant connectivity for an IPv6 network through the
multihoming techniques developed for IPv4. Connections between peers in
the delegation tree do not get around this because they only handle traffic
between customers of the connected peers.
In IPv4, a ‘more specific’ (that is, longer) prefix using the address space
provided by the network’s main IP service provider or as part of a ‘provider
independent’ allocation could be advertised via the BGP routing protocol
through an agreeable alternative provider. In this case, if the main provider
suffered a breakdown and had to withdraw its main route, the BGP routing
system would automatically reroute traffic through the alternate provider.
The cost to the IP routing system is the large number of long prefixes that
need to be managed by routers in the core of the network. This is one of the
causes of the explosion in the number of routes processed by BGP in the
core of the network – now more than 100,000 and still growing.
IPv6 wishes to avoid this problem but must still provide a multi-homing
solution to meet customer expectations for resilience and robustness. A
number of solutions have been discussed at length in the IPv6 and IPv6
Multihoming Working Groups in IETF; but, as of yet, there is (in mid
2004) no real solution in sight. This is a major problem for the deployment
of production IPv6 networks. It is not clear what problems the eventual
solution will pose for real-time networks beyond the problems of having
Multicast in IPv6
IP Multicast has a much more fundamental role in IPv6 than it did in IPv4.
This is partly because the support protocols that are used, for example, to resolve the linkage between Layer 3 IP addresses and Layer 2 MAC or link addresses were not part of the IP protocol suite. Many of these
protocols, such as the Ethernet Address Resolution Protocol (ARP), relied
on link layer broadcast capabilities to determine the interface associated
with an IP address.
The functions of the various link layer specific support protocols have been
integrated into a uniform framework at the IP layer in IPv6 (see “ICMPv6
and IPv6 Network Configuration” ). Many of these functions rely on the
ability to send messages to groups of interfaces on a link according to their
roles (for example, to all nodes or to all routers) even before the sending
interface knows what other nodes are attached to the link.
As a result, IPv6 is heavily dependent on multicast packet delivery at least
at the level of one link. Well-known link scope multicast addresses are used
for a number of purposes during node startup.
IPv6 does not use broadcast mechanisms at the IP layer at all and the
broadcast address available on each subnet in IPv4 has no analog in IPv6.
In order to optimize the delivery of multicast, nodes have to inform routers
about the multicast groups to which they are listening (that is, the multicast
addresses for which they will accept packets). Routers then only need to
propagate multicast packets onto links where there is at least one listener.
The information needed is maintained by exchanges using the Multicast
Listener Discovery protocol (MLD v1 [RFC2710] or v2 [RFC3810]),
which replaces the IGMP used for the same purpose in IPv4.
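Two of the well-known multicast mappings used at link scope can be sketched in Python. The formats themselves (the solicited-node address built from the low 24 bits of a unicast address, and the 33:33 Ethernet MAC mapping built from the low 32 bits of a multicast address) are standard; the example addresses are arbitrary:

```python
import ipaddress

def solicited_node(addr: str) -> str:
    """Solicited-node multicast address: ff02::1:ff + low 24 bits."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    return str(ipaddress.IPv6Address((0xFF02 << 112) | (0x01FF << 24) | low24))

def multicast_mac(group: str) -> str:
    """Ethernet MAC for an IPv6 multicast group: 33:33 + low 32 bits."""
    low32 = int(ipaddress.IPv6Address(group)) & 0xFFFFFFFF
    return "33:33:" + ":".join(f"{(low32 >> s) & 0xFF:02x}" for s in (24, 16, 8, 0))

print(solicited_node("2001:db8::0211:22ff:fe33:4455"))  # ff02::1:ff33:4455
print(multicast_mac("ff02::1:ff33:4455"))               # 33:33:ff:33:44:55
```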
Multicast Routing
After a long period of development, the IETF has settled on a small number
of protocols for routing multicast IP packets. The latest generation of
multicast routing protocols is relatively independent of the underlying
addressing and unicast routing infrastructure, and the protocols are
specified for both IPv4 and IPv6.
Multicasting with link scope is needed for the correct operation of all IPv6
networks and does not need any dynamic routing protocols, but networks
that use multicast at larger scopes require one or more of the multicast
routing protocols. The multicast routing protocols are still mostly under
development. Currently, an IPv6 network might provide the following:
[Figure: IPv6 address auto-configuration flow. The node generates a link-local address for the interface from its Interface Identifier (IID) and runs Duplicate Address Detection (DAD). If the link-local address is not unique, the fallback is manual configuration if the IID did not come from a unique link layer address; otherwise the interface is disabled. Once the link-local address is unique, the node performs Router Discovery: it issues a Router Solicitation and receives Router Advertisement(s), then either uses the router information to generate an address from each advertised prefix or obtains address prefixes from DHCPv6 and generates an address from each, taking other configuration from DHCPv6 in either case. DAD is then run on each generated address; non-unique addresses are discarded and the unique addresses are assigned to the interface.]
48 bit MAC address of the interface as shown in Figure D-2, but there does
not have to be a relationship between the MAC address and the IID.
[Figure D-2: Construction of the IID from a 48-bit IEEE 802 (Ethernet) MAC address, showing the Individual(0)/Group(1) and Universal(0)/Local(1) bits of the MAC address and the inversion of the Universal/Local bit to give the Local(0)/Global(1) bit of the IID.]
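A sketch of the IID construction shown in Figure D-2, assuming the standard insertion of FF FE between the two halves of the MAC address; the example MAC is arbitrary:

```python
def mac_to_iid(mac: str) -> bytes:
    """Build a modified EUI-64 IID from a colon-separated 48-bit MAC."""
    octets = bytes(int(b, 16) for b in mac.split(":"))
    assert len(octets) == 6
    # Split the MAC in half and insert FF:FE in the middle.
    eui64 = octets[:3] + b"\xff\xfe" + octets[3:]
    # Invert the Universal/Local bit of the first octet.
    return bytes([eui64[0] ^ 0x02]) + eui64[1:]

iid = mac_to_iid("00:11:22:33:44:55")
print(iid.hex())   # 021122fffe334455
```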
‘A flag’: If set, indicates that this prefix can be used for stateless
(autonomous) address auto-configuration by nodes on the link.
‘L flag’: If set, indicates that addresses using this prefix are
‘on-link’ so that packets can be sent direct rather than through a
router (it is possible that some of the addresses belonging to a
prefix might be on-link and others off-link—for example,
addresses used for Mobile IPv6 nodes would not be on-link
unless the node was ‘at home’. Packets sent to a mobile node
from another node connected to its home link need to go to the
router to be redirected if the node is not at home. Consequently,
the values of the A and L flags are completely independent.)
Lifetime, during which an on-link prefix should be considered
valid when determining if an address is on-link
Lifetime, during which an address created from an
‘autonomous’ prefix should be considered a preferred address.
Once a host has received Router Advertisements, the stateless and stateful
auto-configuration processes diverge. If a Router Advertisement has the
Managed (M) flag set, the host can use another (stateful) means to obtain
addresses usable for communication beyond the local link (see “Stateful
Auto-Configuration and DHCPv6” ). Otherwise, the host expects to find
one or more prefixes in the Router Advertisement with the ‘A flag’ set that
it can use as subnet identifiers to build global or local use addresses by
adding its IID. These prefixes have to be the exact length of the subnet
identifier—no padding or overlap is possible. In most cases, both IID and
subnet identifier are 64 bits long, but other values are possible for some
address values (see “Address Types in IPv6” ).
This process is known as ‘Stateless Auto-configuration’ because there is no
need for the router that advertises a subnet identifier to maintain any state
about which interfaces are using the subnet. The Router Advertisements
may contain additional prefixes that are marked as ‘on-link’, but are not
intended to be used for auto-configuration. Destinations with addresses that
match on-link prefixes can be sent directly without needing to be sent to a
router as the next hop (see “Packet Transmission, Address Lifetimes and
Deprecation” ).
The host should check that the addresses it creates are unique by using
DAD again; but if the previous DAD checks made for the link local address
using the same IID were successful, the new addresses can be assumed to
be unique, classed as preferred addresses and used immediately for
communications.
A host can use a combination of stateful and stateless auto-configuration if
the network administrator finds this convenient. The most recent updates to
the auto-configuration standards stress the role of administrative policy in
determining what combination of mechanisms is used for each node.
might want to join; for example, many different wireless LAN networks on
an ad hoc basis.
The Secure Neighbor Discovery (SEND) process [I-D.ietf-send-cga],
[I-D.ietf-send-ndopt] generates an IID linked to the subnet identifier, a
random modifier and a locally generated public/private key pair through a
cryptographic hashing process. Additional fields, including a digital
signature based on the key pair, are added to the ICMPv6 neighbor
discovery messages. This signature allows recipients to verify that the IPv6
address was generated for the subnet identifier in use by the originating
node and that the link layer address supplied is correctly associated with
this IPv6 address. Conversely, hosts receiving router advertisements can
verify, on the basis of the configured certificate(s), that they come from a
trusted source. In combination with a suitable node authentication
mechanism, this allows the nodes on a network to establish that packets are
coming from trusted sources and are being routed by a trusted router.
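The flavour of this can be conveyed by a heavily simplified sketch: hash a random modifier, the subnet identifier and the public key into the IID, so that a recipient holding the same inputs can verify the binding. The real CGA procedure in [I-D.ietf-send-cga] adds a sec parameter, hash extension and collision counting, all omitted here, and the function names are hypothetical:

```python
import hashlib

def cga_iid(modifier: bytes, subnet_prefix: bytes, public_key: bytes) -> bytes:
    """Derive a 64-bit IID bound to a subnet prefix and a public key."""
    digest = hashlib.sha1(modifier + subnet_prefix + public_key).digest()
    iid = bytearray(digest[:8])
    iid[0] &= ~0x03  # clear the universal/local and individual/group bits
    return bytes(iid)

def verify(iid: bytes, modifier: bytes, subnet_prefix: bytes,
           public_key: bytes) -> bool:
    """A recipient recomputes the hash to check the address binding."""
    return iid == cga_iid(modifier, subnet_prefix, public_key)
```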
[Figure: Next-hop determination for packet transmission. If the destination address is in the Destination Cache, the next hop address is taken from it and checked in the Neighbor Cache: a reachable entry is used, a stale entry triggers Neighbor Unreachability Detection, and an unreachable next hop leads to selection of an alternative router, or to dropping the packet when no more routers remain. If the destination is absent from the Destination Cache, the Prefix List is consulted: for an on-link destination address, Address Resolution is performed on the destination address; for an off-link address, Router Selection chooses the first router (dropping the packet if the router list is empty), and the router address is then checked in the Neighbor Cache, with Address Resolution or Neighbor Unreachability Detection applied to it as needed.]
Packet Transmission
[Figure: Address states and lifetimes over time. An address passes through the Tentative, Preferred, Deprecated and Invalid states; the Preferred Lifetime covers the Preferred state, and the Valid Lifetime extends through the Deprecated state.]
State Description
Preferred: The address has been verified as unique. A node can send and receive unicast traffic to and from a preferred address. The Router Advertisement message specifies the period of time that an address can remain in this state.
Valid: A node can send and receive unicast traffic to and from a valid address. This state covers both the preferred and deprecated states. The Router Advertisement message specifies the period of time that an address can remain in this state. The valid lifetime must be greater than or equal to the preferred lifetime.
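The lifetime rules above can be modelled with a small hypothetical helper (ages and lifetimes in seconds, measured from when the address passed DAD):

```python
def address_state(age: float, preferred_lifetime: float,
                  valid_lifetime: float) -> str:
    """Classify an address that has already passed DAD."""
    assert preferred_lifetime <= valid_lifetime
    if age < preferred_lifetime:
        return "preferred"   # usable for new and existing communication
    if age < valid_lifetime:
        return "deprecated"  # existing sessions continue; avoid for new ones
    return "invalid"         # may no longer send or receive

print(address_state(100, 600, 1800))   # preferred
print(address_state(900, 600, 1800))   # deprecated
print(address_state(3600, 600, 1800))  # invalid
```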
If an address in the Neighbor Cache is not used for a time, the cache entry
will become ‘stale’ and the node will need to refresh the information by
repeating the Address Resolution exchange. Router addresses can also
become stale, but will normally be refreshed by routers multicasting
unsolicited Router Advertisements. If a host misses an unsolicited Router
Advertisement, it can also solicit the router explicitly.
If a link layer address changes (for example, because of a change of IID for
privacy reasons or a hardware changeover) a node can multicast an
unsolicited Neighbor Advertisement to notify other nodes on the link of the
change: nodes receiving the advertisement update their Neighbor Cache to
reflect the change.
Finally, Neighbor Unreachability Detection (NUD) tracks interface failures
and nodes that leave the network. The corresponding entries in Neighbor
Caches will be removed when they become stale and there is no response to
Neighbor Solicitations.
avoid the ad hoc schemes that some operating systems provided for IPv4
(for example the use of ioctl in BSD and Linux systems). Some further
extensions exist to support particular aspects of IPv6, such as Multicast
Listener Discovery v2 [RFC3678], and more are planned, such as
interfaces to support Mobile IPv6.
Security APIs
These APIs are not fully standardized, but most implementations tend to
follow the APIs developed for the KAME project, one of the leading
developers of IPv6 software for the open source community.
The operating system for a node that supports IPsec needs to provide a
secure database of Security Associations (SADB) that can be accessed by
the IP layer during packet transmission. The usage of IPsec for individual
communication sessions is constrained by an overall security policy set by
the system administrator. The policy can typically be set to implement
varying degrees of security ranging from mandating the use of IPsec for all
incoming and outgoing connections through to allowing individual
applications a choice of whether to use IPsec on outgoing packets and
accepting both secured and unsecured incoming packets.
Typically, the SADB is implemented alongside the IP protocol stack within
the kernel of the operating system for performance and security reasons.
The operating system then has to provide APIs and libraries which:
Allow the creation and manipulation of SA data structures (KAME
provides the ipsec_set_policy library)
Allow suitably privileged processes (such as a key management
protocol daemon) access to the SADB (KAME uses socket based
communication with a specialized PF_KEY protocol family to
exchange messages between user processes and the SADB)
[RFC2367]
Allow individual applications to set the security requirements for
each communication socket that it uses within the constraints of the
overall security policy (KAME provides additional options for
setsockopt and getsockopt that allow the application to
modify the default security policy).
More information about a typical implementation can be found in the
NetBSD implementation [NetBSD-IPsec] and the KAME project webpage
has more details of ongoing work [KAME].
IPv6 Manually Configured Tunnel
Advantages: Stable and secure links for regular communication. DNS with support for IPv6 not required. Connection to 6bone¹.
Disadvantages: Tunnel between two points only. Large management overhead. No independently managed NAT².

IPv6 over IPv4 GRE Tunnel
Advantages: Stable and secure links for regular communication. Well-known standard tunnel technique. Only tunnelling method that will allow IS-IS to work through tunnels.
Disadvantages: Tunnel between two points only. Management overhead. No independently managed NAT. Cannot use to connect to 6bone.
Types of Tunnels
There need not be any IPv6 routers on the site; isolated IPv6-capable hosts can interwork without needing routers.
Teredo Tunnels
Teredo is a proprietary specification designed by Microsoft and described
in [I-D.huitema-v6ops-teredo]. It has the distinction that it can operate
through and independently of NATs and firewalls because it uses UDP
encapsulation. However, it relies on a specialized server at one end of the
tunnel and needs to use a particular address format.
Silkroad Tunnels
Silkroad is an alternative solution to the Teredo proposal for tunnels that
have to traverse NATs and firewalls, which is described in
[I-D.liumin-v6ops-silkroad]. It also requires a server to assist in
determining what kind of NAT may be present on the tunnel path and
requires some modifications to access routers which terminate the Silkroad
tunnels. Unlike Teredo, Silkroad does not need specialized addresses.
documents both the dependencies and the protocol fixes that have been
provided or, in some cases, will be needed.
endpoint, and the tunnel endpoint decapsulation boxes with IPv4 capability
on one side and IPv6 capability on the other.
References
[I-D.bound-dstm-exp] Bound, J., “Dual Stack Transition Mechanism,”
draft-bound-dstm-exp-01 (work in progress), April 2004.
[I-D.huitema-v6ops-teredo] Huitema, C., “Teredo: Tunnelling IPv6 over
UDP through NATs,” draft-huitema-v6ops-teredo-02 (work in progress),
June 2004.
[I-D.ietf-bgmp-spec] Thaler, D., “Border Gateway Multicast Protocol
(BGMP): Protocol Specification,” draft-ietf-bgmp-spec-06 (work in
progress), January 2004.
[I-D.ietf-idmr-dvmrp-v3] Pusateri, T., “Distance Vector Multicast Routing
Protocol,” draft-ietf-idmr-dvmrp-v3-11 (work in progress), December
2003.
[I-D.ietf-ipv6-unique-local-addr] Hinden, R. and B. Haberman, “Unique
Local IPv6 Unicast Addresses,” draft-ietf-ipv6-unique-local-addr-05 (work
in progress), June 2004.
[I-D.ietf-isis-ipv6] Hopps, C., “Routing IPv6 with IS-IS,” draft-ietf-isis-
ipv6-05 (work in progress), January 2003.
[I-D.ietf-msec-arch] Hardjono, T. and B. Weis, “The Multicast Security
Architecture,” draft-ietf-msec-arch-05 (work in progress), January 2004.
[I-D.ietf-ngtrans-isatap] Templin, F., Gleeson, T., Talwar, M. and D.
Thaler, “Intra-Site Automatic Tunnel Addressing Protocol (ISATAP),”
draft-ietf-ngtrans-isatap-22 (work in progress), May 2004.
[I-D.ietf-ngtrans-mech-v2] Nordmark, E. and R. Gilligan, “Transition
Mechanisms for IPv6 Hosts and Routers,” draft-ietf-ngtrans-mech-v2-00
(work in progress), July 2002.
[I-D.ietf-pim-dm-new-v2] Adams, A., Nicholas, J. and W. Siadak,
“Protocol Independent Multicast - Dense Mode (PIM-DM): Protocol
Specification (Revised),” draft-ietf-pim-dm-new-v2-05 (work in progress),
June 2004.
[I-D.ietf-pim-sm-v2-new] Fenner, B., Handley, M., Holbrook, H. and I.
Kouvelas, “Protocol Independent Multicast - Sparse Mode (PIM-SM):
Protocol Specification (Revised),” draft-ietf-pim-sm-v2-new-09 (work in
progress), February 2004.
[I-D.ietf-send-cga] Aura, T., “Cryptographically Generated Addresses
(CGA),” draft-ietf-send-cga-06 (work in progress), April 2004.
[I-D.ietf-send-ndopt] Arkko, J., Kempf, J., Sommerfeld, B., Zill, B. and P.
Nikander, “SEcure Neighbor Discovery (SEND),” draft-ietf-send-ndopt-05
(work in progress), April 2004.
[I-D.ietf-ssm-arch] Holbrook, H. and B. Cain, “Source-Specific Multicast
for IP,” draft-ietf-ssm-arch-04 (work in progress), October 2003.
Appendix E
Virtual Private Networks: Extending
the Corporate Network
[Figure: Technologies discussed in this appendix: NAT, IPsec, MPLS and L2TP in relation to IP, IPv4 and IPv6.]
Introduction
This appendix covers Virtual Private Networks (VPNs), one of the group of technologies introduced into the original IP network to increase its capabilities.
In general terms, VPNs have relatively little interaction with use of the
network for real-time applications. As with standard IP networks, effective
Quality of Service capabilities will be needed to ensure that the traffic is
delivered efficiently in the face of network congestion and that applications deliver the quality of experience that users have come to expect from
traditional TDM networks. However, VPNs typically use the same type of
mechanisms to provide QoS capabilities and the protocols used for real-
time applications run unchanged across VPNs.
Virtual Private Networks are already widely deployed in today’s Internet.
They provide a means for Enterprises to extend their internal networks
across multiple sites using leased or public infrastructure, as well as
allowing ‘road warriors’ such as salesmen to link into the corporate
network without exposing their assets to public access. A number of
different technologies are used to implement VPNs, both at Layer 2 and
Layer 3. Layer 2 solutions create virtual 'private wires' between 'points of
presence' (PoPs); increasingly, the private wires are emulated by MPLS
paths rather than using actual Layer 2 transports in the network core.
Layer 3 technologies typically involve ‘tunnelling’ the private network
packets through the public infrastructure by encapsulating the whole
1. PSTN connections typically use the Point-to-Point Protocol (PPP) [RFC1661] to carry IP packets across
the telephone network, together with the Layer 2 Tunneling Protocol (L2TP) [RFC2661],
[I-D.ietf-l2tpext-l2tp-base] across the links between the PSTN exchange and the ISP's network. Since
PSTN connections are rapidly becoming obsolete, they are not discussed any further here.
The IPsec tunnel then behaves like an extra link connected to the corporate
home network. Packets to and from the Road Warrior’s virtual address are
routed through the tunnel, and the VPN gateway and the Extranet Client
encrypt and decrypt the traffic according to the parameters in the security
association so that it cannot be subverted or intercepted. The Road
Warrior’s computer has become a logical extension of the corporate
network and the Road Warrior can now do anything that normally could be
done on the computer in the office, although it might be a long walk to the
office printer.
The Extranet Client may also prevent traffic from entering or leaving the
Road Warrior’s computer other than through the IPsec tunnel by forbidding
the use of ‘split tunnels’. A split tunnel would potentially allow packets
exchanged between the address allocated to the Road Warrior’s computer
when it connects to the Internet and addresses outside the corporate
network to reach the Road Warrior’s computer directly from the Internet
without passing through the corporate firewall. This opens a security
loophole that could be exploited to attack the corporate network by routing
packets from the Internet directly into the Road Warrior’s end of the
Extranet tunnel. If split tunnels are not allowed, the Extranet Client restricts
communication with the Internet to packets that are carrying the IPsec
tunnel traffic.
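The ‘no split tunnel’ policy can be sketched as a simple interface filter: only the IPsec tunnel traffic itself may use the physical interface, and everything else is routed into the tunnel. This is an illustrative sketch, and the gateway address below is invented.

```python
import ipaddress

# Hypothetical corporate VPN gateway address, for illustration only.
VPN_GATEWAY = ipaddress.ip_address("203.0.113.10")

def packet_allowed_on_raw_interface(dst, is_ipsec):
    """With split tunnels forbidden, the only traffic allowed directly
    on the physical interface is the IPsec tunnel traffic itself."""
    return is_ipsec and ipaddress.ip_address(dst) == VPN_GATEWAY

def route_via_tunnel(dst):
    """Everything else, corporate destination or not, must go through
    the tunnel (and hence through the corporate firewall)."""
    return not packet_allowed_on_raw_interface(dst, is_ipsec=False)
```

A packet arriving from the Internet addressed straight to the Road Warrior's end of the tunnel would fail the first test and be discarded, which closes the loophole described above.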
In addition to using Extranet Clients, Road Warriors may also be able to
use a more restricted form of VPN to access a limited set of corporate
applications. Suitably adapted applications can be accessed securely
through a web browser using the Secure Socket Layer (SSL) [SSL3]. This
kind of SSL VPN has the advantage that enabled applications can be
accessed from almost any web browser on any computer with Internet
access rather than requiring a special application on a corporate laptop. The
downside is that only the adapted applications can be accessed. One very
useful example is the web access to e-mail, which is frequently offered by
ISPs and some corporate networks.
between the PoPs, as is frequently the case, a new site must establish a
connection with each existing site—an O(n²) problem. The problem is not
so acute if a ‘hub and spoke’ model can be adopted; therefore, this might be
appropriate where traffic mostly flows between a central office and a set of
branch offices—the small amount of inter-branch traffic can be handled by
routing it via the central site.
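The scaling difference is easy to quantify: a full mesh of n sites needs n(n−1)/2 connections, while a hub-and-spoke layout needs only n−1.

```python
def full_mesh_tunnels(n):
    """Every site pairs with every other site: n*(n-1)/2 tunnels."""
    return n * (n - 1) // 2

def hub_and_spoke_tunnels(n):
    """Each branch connects only to the central site: n-1 tunnels."""
    return n - 1

# Adding a 21st site to a 20-site VPN costs 20 new tunnels in a full
# mesh, but only 1 extra tunnel in a hub-and-spoke design.
```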
A wide-area IP network infrastructure, possibly with the addition of MPLS
technology, can be used as the basis for a number of different types of VPN
that offer various advantages over the virtual circuit VPN for linking
permanent PoPs into a single company Intranet or providing business-to-
business connectivity with partner companies in an Extranet.
Current VPN offerings exploiting an IP infrastructure can be divided into:
Layer 2 VPNs, which tunnel Layer 2 frames across ‘pseudo-wires’
between PoPs, and
Layer 3 VPNs, which route Layer 3 packets across a virtual IP
network overlaid on the physical IP infrastructure.
Each category can be further divided into solutions where customer
equipment at the edge of the customer’s network does the work needed to
establish the VPN and the management burden falls on the customer
(Customer Edge – CE-based solutions); and solutions where a service
provider offers a managed service to which the customer simply connects
(Provider Edge – PE-based solutions).
[Figure: a VPN interconnecting Sites A through E]
Layer 2 VPNs
Standardization work for Layer 2 VPNs using IP infrastructure is still in
progress at the time of writing (mid 2004). Two prestandard schemes
(named for their chief advocates) have been deployed. These schemes
allow service providers to offer Layer 2 VPN services over IP/MPLS
infrastructure. They use a common encapsulation scheme, defined in
[I-D.ietf-pwe3-arch] to embed ATM, frame relay, Ethernet and PPP/HDLC
frames into MPLS packets before sending them across the Label Switched
Path (LSP) that has been preestablished between the PoPs emulating a
'private wire'. The schemes differ in the way in which the addressing
information for the VPNs is signaled between PE sites.
‘Draft-Martini’ uses the MPLS Label Distribution Protocol (LDP)
to distribute Virtual Circuit (VC) labels between PE nodes. It
requires significant manual provisioning of both ends of each VC
and, hence, retains some of the disadvantages of the basic virtual
circuit VPN schemes. It is best suited to simple point-to-point
connections and small VPNs and is frequently described as a
pseudo-wire scheme. Efforts are under way at the IETF to reduce
the provisioning load, possibly by using BGP to perform
autodiscovery of the end-points while continuing to use LDP for
LSP setup.
‘Draft-Kompella’ uses a BGP session between the PE nodes to
distribute information about the CEs connected to the PE node, and
to allow autoconfiguration and provisioning of the LSPs between
the PEs.
These prestandard proposals are now being refined into standardized
offerings for edge-to-edge pseudo-wire services emulating the traditional
Virtual Private Wire Services (VPWS) and Virtual Private LAN Services
(VPLS), which emulate a bridged LAN extended over a wide area IP/
MPLS infrastructure. The proposals are described in [I-D.ietf-l2vpn-vpls-
ldp] and [I-D.ietf-l2vpn-vpls-bgp].
On traditional IP infrastructure without MPLS, a Layer 2 VPN can also be
constructed using the Layer 2 Tunnelling Protocol (now in its third version,
L2TPv3 [I-D.ietf-l2tpext-l2tp-base]). L2TP VPNs offer similar capabilities
to ‘Draft-Martini’ but can be constructed by customers using standard IP
connectivity to support tunnels between PE nodes without special services
from the provider.
All Layer 2 VPNs offer a number of advantages:
Tunnel-based VPNs are, from the customer’s point of view,
indistinguishable from ‘traditional’ Layer 2 VPNs using physical
connections or virtual circuits. Migration from one to the other
raises few issues.
The service provider does not participate in the customer’s Layer 3
routing, which, therefore, remains totally private to the customer.
The provider does not have to do anything special to keep
individual customers’ routes separated from each other and routes
in the Internet infrastructure; there is no need to manage per-VPN
routing tables in the PE nodes.
The customers can run whatever Layer 3 protocols they choose
across the Layer 2 VPN.
Layer 3 VPNs
If the VPN traffic is exclusively IP packets, the optimum solution may be a
Layer 3 VPN, especially if customer sites are connected to the service
provider with a variety of Layer 2 technologies. The IETF provides a
document [RFC2764] which sets out a framework for all the different kinds
of Layer 3 IP-based VPNs.
If the customer wishes to manage the VPN rather than buying a service
from the provider (CE-based solution), the CE nodes can be configured to
provide IP tunnels to the remote CEs across the provider IP network. At the
simplest level, these tunnels could be IP in IP tunnels where the IP packets
originating in the corporate network, possibly using private IP addresses,
are encapsulated with an outer IP header at the tunnel ingress CE, using
globally routable addresses and routed to the egress CE across the Internet.
Of course, this offers almost no security or privacy, and so most customer-
managed VPNs use IPsec tunnel gateways as their CEs. In this case, the
corporate IP packet is encrypted and authentication data added before
encapsulation with the outer IP header.
The ability to share physical links or Layer 2 virtual circuits between many
tunnels makes the provisioning and management of a customer-managed
Layer 3 VPN slightly simpler than for a traditional VPN; but, it is still an
O(n2) problem if the VPN sites have fully-meshed connectivity and
additional equipment may be needed to support the tunnel endpoints and
IPsec encapsulation.
network to carry traffic between PEs. Some proprietary solutions use ATM
VCs; but, increasingly, MPLS is being used to provide the core VCs.
All the Layer 3 VPN solutions are variants on a theme—the provider
network implements a virtual overlay network for each VPN linking all the
PEs with attached CEs in the VPN. At each PE in the VPN, the PE has a
master routing table for the physical provider network and a virtual routing
table for each VPN that uses the PE. The master routing table is built by the
provider’s IGP running on the physical network. The virtual routing tables
and associated forwarding tables (VRFs) are built in various different ways
depending on the VPN solution. The virtual router has (real) interfaces to
the attached CEs at the PE and (virtual) interfaces linking to the other PEs
via the virtual overlay network.
Advantages of using a Layer 3 VPN include:
The customer can attach to the VPN using any Layer 2 technology
supported by the provider, and the technology used need not be
uniform across all the attachments. Layer 2 VPNs can overcome
this limitation only at the cost of losing Layer 3 independence and
being able to transport (typically) only IP packets.
A Layer 3 VPN can often handle more CEs per VPN than a Layer 2
VPN. For Layer 2 VPNs, the number is limited by how many
circuits are supported by the Layer 2 technology on each link. For
example, frame relay using two octet DLCIs would only allow a CE
to interconnect at most about a thousand other CEs in a VPN.
Providers can offer routing services as a value-added service on a
Layer 3 VPN. This can be a considerable advantage for a customer
where the network managers have limited routing expertise. For a
Layer 2 VPN, each CE router has to exchange routing
information with all the other CE routers to which it is connected
by the VPN and building the routing scheme is entirely the
customer’s problem. For a provider provisioned Layer 3 VPN, each
CE router needs only a default route to the PE router—the provider
handles the routing between the PE routers in the connected PoPs.
Because the PE routers have visibility of the IP packets, the
provider can offer classification and CoS routing as value added
services.
Service providers can also provide multicast routing, forwarding
and packet replication in PE routers. In a Layer 2 VPN, multicast
issues have to be handled by the CE routers, which may have to
replicate packets resulting in duplication of traffic passing along the
access links between CE and PE. These access links are frequently
a bottleneck and using a Layer 3 VPN would allow best use to be
made of the available bandwidth.
To make a simple VPN, one RT is used for all the routes associated with
the VPN—each site where the VPN has a PoP also has a VRF for the VPN
in the PE router, and this VRF installs all the routes using the VPN’s RT.
Using multiple RTs for a single VPN allows more complicated structures
such as 'hub and spoke' arrangements. Routes advertised by spokes (for
example, branch offices) use one Route Target and routes advertised by
hubs (for example, main offices) use a different one. The VRFs associated
with hubs only import and install routes with the spoke Route Target
attribute and vice versa; consequently, each spoke site only needs tunnels to
the hubs rather than to every other spoke site. There is a great deal of flexibility
in the system, but considerable management effort is needed in the provider
network to maintain the Route Distinguishers and Route Targets.
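The hub-and-spoke Route Target arrangement amounts to a filter applied when routes are imported into a VRF. The sketch below models this with invented RT and prefix values:

```python
# Each route advertisement carries a Route Target (RT) attribute;
# each VRF imports only routes whose RT is on its import list.
HUB_RT, SPOKE_RT = "target:65000:1", "target:65000:2"

def build_vrf(import_rts):
    return {"imports": set(import_rts), "routes": []}

def install(vrf, prefix, rt):
    """Install a route into the VRF only if its RT is imported."""
    if rt in vrf["imports"]:
        vrf["routes"].append(prefix)

# Hubs import routes advertised by spokes, and vice versa.
hub_vrf = build_vrf([SPOKE_RT])
spoke_vrf = build_vrf([HUB_RT])

install(hub_vrf, "10.1.0.0/16", SPOKE_RT)    # branch route: installed at hub
install(spoke_vrf, "10.1.0.0/16", SPOKE_RT)  # ignored: spokes do not import spoke routes
install(spoke_vrf, "10.0.0.0/16", HUB_RT)    # hub route: installed at spoke
```

Because the spoke VRF never learns other spokes' routes, inter-branch traffic is necessarily forwarded via a hub, exactly as described above.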
In the data plane, packets arriving at a PE either:
come from a CE over an 'attachment circuit', or
come from another PE over an MPLS tunnel.
In either case, if the packet is a VPN packet, it will be associated with one
of the VRFs in the PE. The VRF for an attachment circuit is configured
into the PE—any packets arriving over the attachment circuit can be
forwarded by looking up the destination address in this VRF. Packets
coming from other PEs are following a route that was advertised from this
PE. Before the advertisement is sent out, the advertising PE creates the
VPN Route Label. This is a local MPLS label that can be used to identify
packets using the route to reach the advertising PE and associate them with
the correct VRF. Since this label is only interpreted by the PE that creates
it, it need not be different from VPN Route Labels created by other PEs.
Also, unlike a standard MPLS label, the VPN Route Label does not label
an MPLS path. The VPN Route Label is carried in the advertisement as an
extra BGP attribute and is recorded in the appropriate VRFs when the route
is installed.
When a PE has to forward a VPN data packet to another PE, it identifies
the route to use from the correct VRF and then adds an MPLS header to the
packet. Two labels will be pushed onto the label stack of the packet: first
the VPN Route Label for the route, and then the label of the label switched
path (MPLS tunnel) towards the destination PE. The packet is then
dispatched down the tunnel and is switched across the backbone to the
destination PE as with any other MPLS packet. The VPN Route Label
remains at the bottom of the label stack and is not inspected until the packet
reaches the destination PE.
The VPN Route Label identifies the VRF in which the destination IP
address should be looked up, and the packet can then be dispatched either
to a local attachment circuit or to a remote PE depending on the results of
the lookup.
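The two-label forwarding just described can be modeled as simple stack operations. This is a schematic sketch with invented label values, not a packet-format implementation:

```python
def forward_to_remote_pe(ip_packet, vpn_route_label, tunnel_label):
    """Ingress PE: push the VPN Route Label first (bottom of stack),
    then the LSP tunnel label on top."""
    return [tunnel_label, vpn_route_label, ip_packet]

def receive_at_egress_pe(labeled_packet, vrf_by_route_label):
    """Egress PE: the backbone has already switched on (and removed)
    the tunnel label; the VPN Route Label selects the VRF in which
    the inner destination address is looked up."""
    route_label, ip_packet = labeled_packet[0], labeled_packet[1]
    return vrf_by_route_label[route_label], ip_packet

pkt = forward_to_remote_pe("ip:10.1.2.3", vpn_route_label=42, tunnel_label=17)
# The backbone switches on label 17 and pops it before delivery:
vrf, inner = receive_at_egress_pe(pkt[1:], {42: "vrf-customerA"})
```

Note that label 42 is meaningful only to the PE that allocated it, which is why different PEs may reuse the same VPN Route Label values independently.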
of sites per VPN to the number of VCs supported by the Layer 2—typically
in the low thousands.
For the Virtual Router Layer 3 VPN scheme, a large number of tunnels
between the VRFs on each PE has to be constructed and the number of
MPLS labels available may become a limitation.
All of the architectures discussed here can reduce the number of tunnels
between PEs by multiplexing the traffic for all the VPNs with CEs
connected to both terminal PEs onto a single tunnel. Isolation of the traffic
is maintained by suitable encapsulation, but at some cost in additional
processing at each end and additional encapsulation overhead in packets.
RFC2547 VPNs provide this optimization as part of the basic architecture;
however, other schemes need to provide it outside the basic architecture of
the scheme. In the case of Layer 2 VPNs, carriers already have extensive
experience creating and managing the VCs needed and minimizing the
overhead in the backbone network. Arguably, the scheme used in RFC2547
slightly reduces the security of the isolation between VPNs because there is
no actual tunnel associated with the VPN Route Label; it is possible that
customer routes could leak out to the wider Internet. To ensure that VPN
isolation is not compromised, the backbone P and PE routers need to
ensure that inappropriately labeled packets are not accepted from outside
the backbone; otherwise, a malicious device could insert packets that
would be routed into a VPN although not originated by the VPN.
Management and configuration complexity, coupled with control traffic
and processing overheads, are perhaps the major issues limiting the
scalability of provider provisioned Layer 3 VPNs compared with Layer 2
VPNs. Each PE has to have resources to run software, build the VRF and
maintain state for each VPN with attached CEs. In all cases, the inter-PE
virtual links have to be created and attached to the correct VRFs. For VR
VPNs, the provider may be offering a choice of routing protocol support;
but, in any case, has to configure the protocol with the correct virtual links
to other PEs and co-operate with the VPN customer to set up the virtual
router correctly either by allowing the customer access to the configuration
with all the associated security concerns or performing the configuration on
the customer's behalf. For RFC2547 VPNs, the customer choices may be
more limited—the provider may require the CE to run BGP so that the
customer has to work out the routes to export at the CE and configure BGP
accordingly. If the CE-PE routing uses an IGP, the PE has to run an
instance of the IGP as well as BGP and the provider has to configure the
extra routing protocol instance and the information exchanges between the
IGP and BGP. RFC2547 VPNs may also require additional configuration
of BGP route attributes if the customer wishes to run a single partitioned
AS across sites.
The additional configuration and management overhead of Layer 3
provider provisioned VPNs is a significant barrier to large-scale deployment.
Summary
Virtual Private Networks are a class of extensions to IP networks. VPNs
are generally a means to provide a secure extension of a common corporate
environment to geographically separated sites using common public or
leased infrastructure (road warriors and intranets) or to provide a common
environment to partners in a business or industry (extranets). As a side
effect, VPNs can allow networks that use IPv4 private addresses to be
extended across the public Internet without risking address clashes with
the global address space or with other VPNs. A number of different
techniques were discussed, suitable either for connecting individual mobile
'road warriors' or fixed sites with multiple nodes. The possibilities for fixed
sites covered both customer-provisioned and provider-provisioned VPNs
using both Layer 2 and Layer 3 connectivity. The advantages and
disadvantages of the various techniques were discussed with some
emphasis on the Layer 3 PPVPN techniques, which include the Virtual
Router (VR) and MPLS-BGP (RFC2547) schemes. The use of MPLS as a
common infrastructure for implementing both Layer 2 and Layer 3
PPVPNs was noted.
References
[I-D.ietf-l2tpext-l2tp-base] Lau, J., Townsley, M. and I. Goyret, “Layer
Two Tunnelling Protocol (Version 3),” draft-ietf-l2tpext-l2tp-base-14
(work in progress), June 2004.
[I-D.ietf-l2vpn-vpls-bgp] Kompella, K., “Virtual Private LAN Service,”
draft-ietf-l2vpn-vpls-bgp-02 (work in progress), May 2004.
[I-D.ietf-l2vpn-vpls-ldp] Lasserre, M. and V. Kompella, “Virtual Private
LAN Services over MPLS,” draft-ietf-l2vpn-vpls-ldp-03 (work in
progress), April 2004.
[I-D.ietf-l3vpn-bgp-ipv6] Clercq, J., Ooms, D., Carugi, M. and F.
Faucheur, “BGP-MPLS VPN extension for IPv6 VPN,” draft-ietf-l3vpn-
bgp-ipv6-03 (work in progress), June 2004.
[I-D.ietf-l3vpn-gre-ip-2547] Rekhter, Y. and E. Rosen, “Use of PE-PE
GRE or IP in BGP/MPLS IP VPNs,” draft-ietf-l3vpn-gre-ip-2547-02
(work in progress), April 2004.
[I-D.ietf-l3vpn-ipsec-2547] Rosen, E., Clercq, J. and C. Sargor, “Use of
PE-PE IPsec in RFC 2547 VPNs,” draft-ietf-l3vpn-ipsec-2547-02 (work in
progress), March 2004.
[I-D.ietf-l3vpn-rfc2547bis] Rosen, E., “BGP/MPLS IP VPNs,” draft-ietf-
l3vpn-rfc2547bis-01 (work in progress), September 2003.
[I-D.ietf-l3vpn-vpn-vr] Knight, P., Ould-Brahim, H. and B. Gleeson,
“Network based IP VPN Architecture using Virtual Routers,” draft-ietf-
l3vpn-vpn-vr-02 (work in progress), April 2004.
[I-D.ietf-pwe3-arch] Bryant, S. and P. Pate, “PWE3 Architecture,” draft-
ietf-pwe3-arch-07 (work in progress), March 2004.
[RFC1661] Simpson, W., “The Point-to-Point Protocol (PPP),” STD 51,
IETF, July 1994.
[RFC2547] Rosen, E. and Y. Rekhter, “BGP/MPLS VPNs,” IETF, March
1999.
[RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, G. and
B. Palter, “Layer Two Tunnelling Protocol (L2TP),” IETF, August 1999.
[RFC2764] Gleeson, B., Heinanen, J., Lin, A., Armitage, G. and A. Malis,
“A Framework for IP Based Virtual Private Networks,” IETF, February
2000.
[SSL3] A. Frier, P. Karlton, and P. Kocher, The SSL 3.0 Protocol,
Netscape Communications Corp., Nov 18, 1996.
Appendix F
IP Multicast1
IP television head-end system
Figure F-1 illustrates a block diagram of the logical component subsystems
in the IP television head-end. There are three main components in the head-
end:
the media subsystem
the application server subsystem
the web server subsystem
These three components are shown in Figure F-1 with some references to
horizontal integrated dialogs between the systems. The focus is on the
media subsystem. The other modules allow for the correlation of user
accounts and billing. These systems also provide the maintenance of
subscribers and of the actual system itself.
1. Please refer to the Disclosure Notice given in the Section VI Introduction (page 498).
Sidebar: PIM-SSM
Originally, a PIM shortest-path tree (SPT) was established as a result of
a policy-based event within the PIM-SM rendezvous point (RP) for the
given multicast group. At that point, the edge PIM router executes a
shortest-path tree build, since it now knows the unicast IP address of
the sending source from the packets it has received via the RP
shared tree path. Once the SPT is built, the edge router initiates a prune
to the RP shared tree. Further logic dictates that if the unicast IP address
of the source were known a priori, there would be no need for the RP
shared tree phase of the tree build. A PIM router could immediately
reference its unicast routing table and perform a build to the SPT. This is
the essence of PIM-SSM.
In order to co-exist with a PIM-SM/IGMPv2 deployment, it became
necessary to establish an addressing range for the single-source mode of
behavior. The range selected was the 232/8 addressing range, or Class D
address range 232.0.0.0 through to 232.255.255.255. All IGMP requests
for channels within this range are to be source-specific. The RP is not
invoked and will ignore all activity in this address range. This means that
any request for an IP multicast group within the SSM range must be
accompanied by the unicast IP address of the sending source, that is, the
(S,G) format, where S is the unicast IP address of the sending source and G
is the multicast IP address of the channel. All non–source-specific
requests within the 232/8 range that use the (*,G) format will
be ignored by a PIM-SSM router.
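A router's SSM handling of joins can be sketched as a check on the group address and join type. This is a simplified illustration with invented example addresses:

```python
import ipaddress

SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")

def accept_join(group, source=None):
    """Inside 232/8 an IGMPv3 (S,G) join is required; (*,G) joins
    there are ignored rather than handed to the RP."""
    group = ipaddress.ip_address(group)
    if group in SSM_RANGE:
        return source is not None   # source-specific only
    return True                     # outside SSM: (*,G) via the RP is fine

accept_join("232.128.192.10", source="198.51.100.7")  # SSM (S,G): accepted
accept_join("232.128.192.10")                         # (*,G) in 232/8: ignored
accept_join("239.1.1.1")                              # non-SSM group: RP handles it
```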
All of this means two things: first, the requesting client must support
IGMPv3; and second, the sending source must support source-
specific service advertisements or have a method for making the unicast
address known to the client. Also implied is that in order to use PIM-
SSM, the provider may very well have to re-address its existing IP
multicast deployments. It was correctly perceived that these
requirements and limitations would hinder the introduction of PIM-
SSM. As a consequence, a number of features were introduced to ease
the adoption of the technology.
2. Residual patterns come from the last channel that the subscriber was surfing that the switch has not yet
disconnected from the subscriber.
While Source-Specific Multicast (SSM) provides protection from
a rogue source hijacking the channel (because of its source-specific
nature), it does not provide content protection capabilities for the channels
being multicast. In theory, a non–paying individual could sniff out the
information required to join the group and then use a generic decoder to
actually view the content. Specific IGMP-based filters can efficiently
provide protection from this form of piracy.
Another aspect to consider is the need to direct certain multicast flows to
certain portions of the network. The reason for this may be demographic or,
in the case of a hybrid provider (a provider who has different access
networks within the subscriber base), to assure that higher-speed channels
do not get propagated out to portions of the network that cannot handle the
traffic load.
In Figure F-7, a simple hybrid provider network is shown. One leg of the
network supports DSL access speeds with the bandwidth limitations that
were discussed earlier in this article. The other portion of the network
provides direct ETTU. In this portion of the network, the channel speed can
effectively be doubled. This obviously provides an enhanced viewing
experience for this portion of the subscriber base, but it also adds the
complexity of assuring that the high-speed channels are not forwarded over
to the DSL access portion of the network. Multicast routing policies can
effectively deal with this scenario.
true with IP-based delivery of broadcast television. The paradigm that must
be met is prevention of access to content that the customer has not paid for,
not whether they choose to record it.
The issues of recording content are complex, both from a technical and a
legal perspective. The entertainment industry has been working on this
issue for quite some time with little headway. From a technical perspective,
watermarking content is a feasible approach, but that would involve the
support of the encoder and the decoder or TV. Furthermore, watermarking
content would prevent any recording of the content, not just the content that
someone intends to resell. It would, in essence, render VCRs useless if all
content were protected in this manner. This would create a consumer outcry
that the entertainment industry would not find beneficial to their cause.
Needless to say, the issues as they stand are best not addressed by any
network-based technology. It is more appropriate for them to be addressed
by content creation, transmission and receiving technologies as well as the
entertainment and legal communities.
Given this, the role that the network plays, as stated earlier, is to assure that
the viewing audience is the paying audience. This can effectively be
provided by IGMP-based filters with the join filters, which are
implemented at the edge ingress of the network. Many providers choose to
provide channel bundles to ease the administration of this aspect of the
service. By arranging the multicast addresses to correspond to these
bundles, it becomes easy to allow or deny access based on them.
As an example, the provider may have a premium offering where all of the
channels are grouped into a common representation. In the previous
example of channel speed grouping, the third octet could be used for this
purpose. By this means, all of the channels contained within that super-
group would be allowed or denied based on the policy. So if the premium
service were represented by a .192 value in the third octet range, the
provider would have two super-groups: one for high-speed channel access
232.128.192.x, and one for low-speed channel access, 232.192.192.x. As
shown earlier, the route policies to direct the different channel speed
groups are already in place. Now it becomes a simple matter of
transmitting an allow or deny message to the 232.128.192.x range in the
case of the high-speed offering. If an aggregate of several channels (which
is the bundle) were within the addressing range, then all the channels
would be allowed or denied based on the filter policy.
Additionally, the provider could be channel-specific as well—perhaps offer
a single channel as a limited-duration promotional offer for the whole
bundle. In this instance, the super-group 232.128.192.x would be denied,
while 232.128.192.10 might be allowed for a period of thirty days.
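The super-group filtering described above can be sketched as ordered prefix matching on the multicast group address, using the example values from the text. The rule format here is invented for illustration; real deployments express this as IGMP access filters on the edge router:

```python
import ipaddress

ALLOW, DENY = [], []   # specific allows are checked before broader denies

def add_rule(rules, spec):
    rules.append(ipaddress.ip_network(spec))

def allowed(channel):
    addr = ipaddress.ip_address(channel)
    for net in ALLOW:              # e.g. the 30-day promotional channel
        if addr in net:
            return True
    for net in DENY:               # e.g. the whole premium super-group
        if addr in net:
            return False
    return False                   # default: no access

# Deny the high-speed premium super-group 232.128.192.x ...
add_rule(DENY, "232.128.192.0/24")
# ... but allow the single promotional channel within it.
add_rule(ALLOW, "232.128.192.10/32")
```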
Once the browser has downloaded the metafile, it is handed off to the
viewer’s RealPlayer client. The RealPlayer client then reads the data in the
metafile and requests the presentation from the RealServer. In later
versions of the RealPlayer client, this is accomplished via an RTSP dialog.
Other methods are available: PNA is a streaming dialog used by
earlier client versions, and HTTP provides generic embedded access via port
80.
When the RealPlayer client requests a URL that begins with rtsp://, it
sends its request to the RealServer’s port 554. Requests to pnm:// indicate a
PNA request from an older client and are directed to port 7070, whereas
http:// requests are directed to port 80 or 8080 as appropriate and are the
less efficient (nonstreaming) method of delivery. Further details can be
found in the RealNetworks Administrator guide.
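The scheme-to-port dispatch just described can be summarized as a small lookup table (the URL below is hypothetical):

```python
# URL scheme -> (protocol, default server ports), per the text above.
SCHEME_PORTS = {
    "rtsp": ("RTSP", [554]),
    "pnm":  ("PNA",  [7070]),       # older clients
    "http": ("HTTP", [80, 8080]),   # nonstreaming fallback
}

def dispatch(url):
    scheme = url.split("://", 1)[0]
    return SCHEME_PORTS[scheme]

dispatch("rtsp://media.example.com/show.rm")   # ("RTSP", [554])
```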
As a result of the request, the server (or an intermediate proxy service) will
then serve out the streaming media. RealNetworks provides two methods
for serving streaming content: the first is RealNetworks’ proprietary Real
Data Transport (RDT); the second is the industry-standard RTP/
RTCP. Note that RTP delivery requires the G2 RTP-based RealServer
and client.
Because of the tight integration between the client and the server via the
ram file, the RealNetworks system is tightly licensed. Even intermediate
network proxy functions must comply with this model and, hence, are
covered by the licensing rights. RealNetworks provides licensed proxy
services by software called RealProxy* that can be used in the network
path to provide caching delivery features at the edge. Many CDN cache
products and solutions directly license the RealProxy service as an add-on
feature. This allows for the intermediate proxy of the ram file and any
corresponding contents for the event according to the rules defined on the
proxy agent.
port 1755 for media stream control and is somewhat analogous to RTSP.
The actual media ports for audio and video are UDP and are dynamically
created. MMS media can also be handled by an intermediate proxy service.
Appendix G
QoE Engineering
In this appendix, the focus will be on real-time data applications’
performance targets and their impairments. It should be pointed out that not
all data applications require real-time treatment; but, a subset does require
real-time or quasi real-time response times to achieve the desired
interactivity level. Applications such as gaming, telnet or remote login require
response times in the millisecond range, and careful attention to the
underlying network architecture and selection of QoS mechanisms.
[Figure: maximum queuing delay (ms, logarithmic scale 0.1–1000) against link utilization (10%–90%), showing the QoE target and margin; QoE dependencies and factors include buffer size, loss rate, number of flows/users, and link size]
Figure G-5: Rate limiting characteristics of policing and shaping, showing
the allowed burst, peak rate and average rate
Figure G-6 shows an example of a two-color policing implementation
using a token bucket architecture. The token bucket accumulates tokens at
the “Committed Rate” up to the burst level; once the bucket is full, further
tokens are discarded. When the incoming packet aggregate conforms to the
“Committed Information Rate” (CIR) with bursts in line with the
“Committed Burst Size” (CBS), packets are marked as in-
profile. Otherwise, when the burst size exceeds the “Excess Burst Size” (EBS)
limit, packets are marked as out-of-profile. After packet classification, in-
profile packets will be discarded only after all out-of-profile traffic has been
dropped (differentiated dropping probabilities). Excess traffic is tagged and
may be discarded under congestion.
[Figure G-6: Token bucket policer. Tokens (1 token = credit for 1 byte) arrive at the CIR (Committed Information Rate); an arriving packet with enough credits conforms, otherwise it exceeds. CBS = Committed Burst Size, EBS = Excess Burst Size]
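The conformance test in Figure G-6 can be sketched as follows. This is a minimal two-color policer that checks only the CIR/CBS stage, ignoring the EBS refinement for brevity:

```python
class TokenBucket:
    """Two-color policer: tokens accrue at the committed rate (CIR,
    bytes/sec) up to the committed burst size (CBS, bytes); a packet
    is in-profile if the bucket holds enough credit for its length."""

    def __init__(self, cir, cbs):
        self.cir, self.cbs = cir, cbs
        self.tokens = cbs          # start with a full bucket
        self.last = 0.0

    def police(self, now, length):
        # Accumulate tokens for the elapsed time, capped at CBS.
        self.tokens = min(self.cbs, self.tokens + (now - self.last) * self.cir)
        self.last = now
        if length <= self.tokens:
            self.tokens -= length
            return "in-profile"
        return "out-of-profile"    # tagged; dropped first under congestion

tb = TokenBucket(cir=1000, cbs=1500)   # 1000 B/s, one 1500-byte burst
tb.police(0.0, 1500)   # "in-profile": the bucket starts full
tb.police(0.1, 1500)   # "out-of-profile": only ~100 bytes of credit accrued
```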
Queue management
Queue management is the function that stores packets before
transmission on a link or interface. The simplest technique for queue
management is called “tail drop”. Tail drop is a passive queue
management technique whereby a maximum length (in packets) is set for
each queue; packets are accepted until that maximum is reached, and
subsequent arrivals are rejected (dropped) until the queue decreases
because a queued packet has been transmitted. The other class of queue
management is called Active Queue Management (AQM). The basic idea
behind active queue management schemes such as WRED/RED (random early
detection) is to detect incipient congestion early enough to convey
implicit congestion notification to the end-systems, allowing them to
reduce their transmission rates before queues in the network overflow
and packets are dropped. WRED/RED detects congestion by monitoring the
queue size and starts dropping packets randomly when a queue threshold
is reached. WRED/RED offers a proactive response to congestion:
preventing synchronization of TCP timeouts and restarts
providing early feedback on congestion in the network
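The RED drop decision can be sketched as follows (a minimal illustration, not a vendor implementation; the threshold values are arbitrary). Below a minimum threshold every packet is enqueued; above a maximum threshold every packet is dropped; in between, the drop probability ramps linearly up to a configured maximum.

```python
import random

def red_drop_probability(avg_qlen, min_th, max_th, max_p=0.1):
    """Classic RED drop probability as a function of average queue length."""
    if avg_qlen < min_th:
        return 0.0                       # no congestion: always enqueue
    if avg_qlen >= max_th:
        return 1.0                       # severe congestion: always drop
    # Linear ramp between the two thresholds, capped at max_p.
    return max_p * (avg_qlen - min_th) / (max_th - min_th)

def red_enqueue(avg_qlen, min_th=20, max_th=60, max_p=0.1):
    """Randomly decide whether an arriving packet is enqueued (True) or dropped."""
    return random.random() >= red_drop_probability(avg_qlen, min_th, max_th, max_p)
```

WRED extends this by keeping separate (min threshold, max threshold, max probability) triples per drop precedence or traffic class, so higher-precedence traffic is dropped later and less aggressively.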
[Figure G-7 diagram: dropping probability (0%–100%) versus queue size, with three regions — packet enqueued, packet randomly dropped, packet dropped]
[Figure G-8 plot: 90th percentile response time (0–8) versus number of users (0–2400), with a recommended QoE operating zone marked]
Figure G-8: WRED QoE performance (90th percentile response time) against
tail drop (best effort) as a function of the number of users. This
graph compares multiple WRED configurations (minimum threshold,
maximum threshold, drop rate) against tail drop. The traffic offered
load is identical in both the best-effort and WRED-enabled solutions;
hence both QoS mechanisms are compared under the same loading
conditions. The instability of WRED is reflected in response time
variation as load changes. No optimal setting works for a wide range
of operating conditions.
Appendix H
PPP Header Overview
The Multiclass Extensions to Multilink PPP provide for Service Classes
to be specified in the Multilink PPP header. Two PPP multiclass formats
are defined: the short sequence number format provides four classes of
service, and the long sequence number format provides sixteen. The PPP
class fields are circled in Figure H-1 for both the Long and Short
Sequence Number formats.
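The two formats can be illustrated with a small parser. This is a sketch based on the bit layout defined in RFC 2686 (the function names are ours, and the header is assumed to start at the fragment's first byte): the short sequence format carries a 2-bit class and a 12-bit sequence number, the long format a 4-bit class and a 24-bit sequence number.

```python
def parse_mc_mlppp_short(hdr: bytes):
    """Parse a multiclass multilink PPP short-sequence fragment header.

    Layout (2 bytes, per RFC 2686): B | E | 2-bit class | 12-bit sequence,
    giving four classes of service.
    """
    b = (hdr[0] >> 7) & 1                  # beginning-of-fragment bit
    e = (hdr[0] >> 6) & 1                  # end-of-fragment bit
    cls = (hdr[0] >> 4) & 0x3              # 2-bit class: 4 service classes
    seq = ((hdr[0] & 0x0F) << 8) | hdr[1]  # 12-bit sequence number
    return b, e, cls, seq

def parse_mc_mlppp_long(hdr: bytes):
    """Long-sequence format (4 bytes): B | E | 00 | 4-bit class | 24-bit
    sequence, giving sixteen classes of service."""
    b = (hdr[0] >> 7) & 1
    e = (hdr[0] >> 6) & 1
    cls = hdr[0] & 0x0F                    # 4-bit class: 16 service classes
    seq = (hdr[1] << 16) | (hdr[2] << 8) | hdr[3]
    return b, e, cls, seq
```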
Glossary
1xEV-DO 1.25 MHz Evolution, Data Only
1xEV-DV 1.25 MHz Evolution, Data & Voice
1xRTT Single carrier (1x) Radio Transmission Technology - third
generation wireless technology for CDMA, also called
CDMA2000
2.5G Enhanced Second Generation wireless technology - adds some
data networking functionality to 2G systems. An example is
GSM GPRS.
2G Second Generation wireless technology - 2G wireless uses
digital voice transmission across the radio channel.
3G Third Generation wireless technology - 3G uses redefined
channels to allow transport of digital voice and data services.
5-tuple Combination of values for fields of IP packet header used to
specify filters: IP source and destination addresses, protocol
number, source and destination transport identifiers (port
numbers)
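As an illustration (the type and function names are ours, not from the source), a 5-tuple filter can be modelled as an exact-match classifier with optional wildcards:

```python
from typing import NamedTuple, Optional

class FiveTuple(NamedTuple):
    """The five header fields that identify an IP flow. In a filter,
    None in any position acts as a wildcard."""
    src_ip: Optional[str]
    dst_ip: Optional[str]
    protocol: Optional[int]    # IP protocol number: 6 = TCP, 17 = UDP
    src_port: Optional[int]
    dst_port: Optional[int]

def matches(packet: FiveTuple, flt: FiveTuple) -> bool:
    """True if every non-wildcard filter field equals the packet field."""
    return all(f is None or f == p for p, f in zip(packet, flt))
```

For example, the filter FiveTuple(None, None, 6, None, 80) selects all TCP traffic destined to port 80, regardless of addresses or source port.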
6bone An experimental IPv6 network now being wound down
6over4 An obsolete tunnelling technique for carrying IPv6 across IPv4
6to4 An automatic tunnelling technique for carrying IPv6 across IPv4
802.1p A 3-bit field in the IEEE 802.1Q extension to the Ethernet header
used to identify classes of service over Ethernet
802.1Q IEEE standard that defines the operation of VLAN Bridges that
permit the definition, operation, and administration of Virtual
LAN topologies within a Bridged LAN infrastructure. It defines
four additional bytes to the Ethernet frame header providing a
VLAN ID and 802.1p user priority field used for QoS.
A6 Experimental DNS record used to return IPv6 addresses.
AAAA Quadruple A DNS record used to return IPv6 addresses
AAL ATM Adaptation Layer
ABR Available Bit Rate
Automatic Protection Switching (APS) Generally associated with OSI
Layer 1 (the SONET layer). The protection port is pre-planned; upon a
failure indication, the switch immediately activates the spare port and
switches traffic to it.
Autonomous System A collection of IP addresses/networks that is under
the control of a single entity. Each router within an autonomous system
contains a full copy of the routing table.
Autonomous System Boundary Router (ASBR) An ASBR is attached at the
edge of an OSPF Autonomous System and runs an interdomain routing
protocol such as BGP.
B-Frame A bidirectional differential frame which is one type of
compressed video frame. The B-Frame contains motion vectors
and the residual information needed to reconstruct parts of the
image uncovered by the displacement of a moving object.
Backhaul Route traffic “out of its way” to reach the destination. Done to
reach special equipment (such as a satellite ground station), to
reduce cost, or to avoid congested routes.
Back-to-back user agent Logical entity that receives a request and
processes it as a user agent server (UAS). To determine how the request
should be answered, it acts as a user agent client (UAC) and generates
requests. Unlike a proxy server, it maintains dialog state and must
participate in all requests sent on the dialogs it has established.
Since it is a concatenation of a UAC and UAS, no explicit definitions
are needed for its behavior.
Bandwidth (1) For analog signals, the difference between upper and lower
frequency limits; said of a signal, a channel, or a filter. (2) The
amount of data that can be put through a given channel in a given
time. The term is used to refer to either the maximum data rate
that a channel can carry or the minimum rate required for a
particular signal. This use of the term is derived from the
relationship between the frequency bandwidth of an analog
carrier and the maximum rate that the carrier can be modulated to
signal one bit of information. The broader the bandwidth, the
faster the maximum modulation rate, and so the more bits can be
sent per unit time.
BGMP Border Gateway Multicast (routing) Protocol
BGP Border Gateway Protocol
BGP/MPLS VPN Layer 3 VPN implemented using MPLS Label Switched Paths
(LSPs) to carry traffic between PoPs and an extension of BGP to
route traffic. Originally documented in RFC 2547.
Call Admission Control (CAC) A mechanism that ensures the network has
sufficient capacity to provide service to a user before admitting the
session. Admission criteria usually require that the user being
admitted will receive adequate performance and the added session will
not cause other users to experience quality degradation.
Call Routing Table A table maintained in the switch that provides an
ordered and conditional list of all possible next hop routings to reach
a given telephone number from that switch (the order is based on the
conditions).
Call Volume The integration of number of calls and the duration of each call.
Capex Capital Expenses
CBR Constant Bit Rate - an ATM service category
CBS Committed Burst Size - the size up to which packets will be
delivered while meeting the service class performance.
CCITT Former name of the ITU-T
CCS Centum Call Seconds (Hundred Call Seconds)
CDMA Code Division Multiple Access - used for digital cellular access.
CDMA splits each “bit” into a binary sequence of smaller units
called chips. A particular pattern of chips (the code) is assigned
to each user. The receiver uses the same code to extract its
intended signal; signals based on other codes appear to that
receiver as noise. CDMA is the basis of the US IS-95 2G system
as well as 3G UMTS and CDMA2000. Compare FDMA,
TDMA.
CDV Cell Delay Variation - in ATM, the variation in time-of-arrival of
cells at the receive end, analogous to IP packet delay variation
(jitter).
CDVT CDV Tolerance - in ATM, the upper limit on the Cell Delay
Variation. Specified for all ATM service categories.
CE Customer Edge
Central Office The term for the local telephone switch where customers’ lines
connect to the phone company. Also called the CO.
CEPT European Conference of Postal and Telecommunications
Administrations. The ECC (Electronic Communications
Committee) under the CEPT handles radio and
telecommunications matters.
CER Cell Error Ratio - in ATM, a measure of the ratio of cells with
errors to the total number of transmitted cells.
CES Circuit Emulator Service
Channel coding Error protection encoding for a signal transmitted over a wireless
or cellular radio link that is subject to Rayleigh fading and other
interference. Protection might include a checksum, forward error
correction, interleaving, and/or redundant information. Channel
coding is done on the bit stream output of the source codec
(compare source coding).
Chrominance The intensity of color in a television signal relative to a standard
color. Adding white reduces the color intensity.
CIDR Classless Inter-Domain Routing
CIR Committed Information Rate - the rate up to which packets will
be delivered while meeting the service class performance.
Class Selector PHB group A DiffServ PHB designed to support legacy
routers that only support the older form of IP QoS called IP
Precedence. The Class Selector PHB can support either eight priority
classes similar to IP Precedence or can be configured to inherit the
EF, AF and DF PHBs.
CLP Cell Loss Priority - in ATM, a traffic management parameter that
specifies whether cells may be discarded if the network is
congested.
CLR Cell Loss Ratio - in ATM, the ratio of cells that are lost
compared to the number of cells originally sent; a required
parameter for some ATM service categories.
CMR Cell Misinsertion Rate - in ATM, the ratio of cells received at an
endpoint that were not originally transmitted by a given source
compared to the total number of cells transmitted from the
source.
CNG, comfort noise generation A DSP device that replaces background
noise in a signal where the background noise has been removed by a
voice switch, the non-linear processor in an echo canceller, or by a
DTX feature. A simple CNG will fill in with white or filtered Gaussian
noise. More sophisticated CNG designs may try to model the noise. In
some cases, information from the sending end is used to reconstruct the
noise to better match any noise present in the speech.
Co-channel interference Radio frequency interference broadcast on, and
intended for, the same channel as is being received.
End Office A term for the local telephone switch where customer lines
connect to the phone company, also called the Central Office
(CO).
Endpoint In H.323, a terminal, Gateway, or MCU. An endpoint can call
and be called. It generates and/or terminates information streams.
Equal Cost Multipath Protects against link failure and is best used on
the links between high availability routers where load sharing and
quick recovery from a failure are required. ECMP allows a router
running OSPF to distribute traffic across multiple, equal-cost routed
paths.
Erlang A unit of voice traffic volume. One Erlang is a call volume
sufficient, if all segments were concatenated, to occupy one 64
kb/s trunk for one hour.
Erlang Tables Probability tables that yield the number of trunks required
between two switches to provide a specified level of call
blocking given a calling volume between those switches.
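Such tables are generated from the Erlang B formula. A sketch using the standard numerically stable recursion follows (function names are illustrative): B(0) = 1 and B(n) = A·B(n−1) / (n + A·B(n−1)), where A is the offered traffic in Erlangs and n the number of trunks.

```python
def erlang_b(traffic_erlangs: float, trunks: int) -> float:
    """Blocking probability from the Erlang B formula, computed with the
    recursion B(0) = 1; B(n) = A*B(n-1) / (n + A*B(n-1))."""
    b = 1.0
    for n in range(1, trunks + 1):
        b = traffic_erlangs * b / (n + traffic_erlangs * b)
    return b

def trunks_required(traffic_erlangs: float, blocking_target: float) -> int:
    """Smallest trunk count whose blocking probability meets the target."""
    n = 1
    while erlang_b(traffic_erlangs, n) > blocking_target:
        n += 1
    return n
```

For example, offering 10 Erlangs with a 1% blocking target requires 18 trunks, matching standard Erlang B tables.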
ESP Encrypted Security Payload
Ethernet Virtual Private LAN A logical “broadcast” domain, containing
traffic to certain group members at Layer 2.
Excess Information Rate (EIR) Sometimes known as the burstable
component; the amount of data accepted by the network but marked
Discard Eligible (DE). Note: sometimes EIR is referred to as Extended
Information Rate. In either case EIR = Be/T.
Expedited Forwarding PHB A DiffServ PHB best suited for low latency,
low loss, and low jitter real-time services.
Explicit Congestion Notification A two-bit field in the DiffServ field
used by routers to indicate to neighboring routers that they are
experiencing congestion.
Explicit Route The path taken by an LSP is explicitly specified. This
means that the route is established by a means other than normal IP
routing.
Fast Reroute Techniques used to repair LSP tunnels locally when a node or
link along the LSP path fails
FastStart A term related to H.323 setup procedures with an abbreviated
sequence that allows call setup and connection setup to occur in
one round trip.
FCH Fundamental Channel
FCOT Fiber Control Office Terminal - a generic term for a fiber
terminal, which can be configured as fully digital, fully analog, or
mixed.
FDMA Frequency Division Multiple Access - a wireless access
technique in which the wireless RF band is shared by dividing it
into narrower bands, each of which carries data for a separate
channel. Compare CDMA, TDMA.
FEC In MPLS, Forwarding Equivalence Class indicating Route +
Class of Service. In digital communications, Forward Error
Correction, a method of detecting and correcting data errors
without the need for retransmission.
FERF Far End Receive Failure - a signal to indicate to the transmit site
that a failure has occurred at the receive site.
Field In video, information representing one of the two scan patterns
that make up a frame in an interlaced television system. Two
fields constitute one frame.
File format A standard method of parsing digital data and any information
needed to read and display it.
FoIP Fax over IP
Foreign agent PDSN serving the local Base Station Controller where the user is
currently connected
FOTS Fiber Optic Transmission System, such as SONET or SDH.
FRAD Frame Relay Access Device
Frame Speech/audio: a segment of speech operated on by a frame-based
codec. Video: a single image from a series displayed sequentially
to simulate motion. Data: a segment of data to be parsed
according to a rule specified by the data format.
Frame roll An effect caused by the receiver and camera not being
synchronized; it has the appearance of sequential pictures moving
vertically through the screen.
FRF Frame Relay Forum, now the MPLS and Frame Relay Alliance
FrNNI Frame Relay Network to Network Interface
FrUNI Frame Relay User to Network Interface
FT1 Fractional T1 is nothing more than a T1 with only some of the
DS0s being used.
FTP File Transfer Protocol
Home agent In a 2G wireless system, the Packet Data Serving Node where the
user maintains a full time presence and has a gateway to other
networks
HTML HyperText Markup Language
HTTP HyperText Transfer Protocol
HTTPS Secure version of HTTP
Hue The position of a particular colour within the visible spectrum of
colours.
HyperText Transfer Protocol (HTTP) An application-level protocol with
the lightness and speed necessary for distributed, collaborative,
hypermedia information systems.
I Frame Intra Frame - one of the frame types used in MPEG which
contains complete information for one complete frame.
IANA Internet Assigned Numbers Authority
ICANN Internet Corporation for Assigned Names and Numbers
ICMP Internet Control Message Protocol
IETF Internet Engineering Task Force - a standards body governing
Internet operation
IGMP Internet Group Management Protocol - IGMP is the edge session
protocol for IP multicast. It is supported in both L3 (router side)
and L2 (client side). Router side implementations of IGMP work
in tandem with the L3 multicast routing protocol (DVMRP or
PIM). In many instances the IGMP router side process is part of
the multicast routing process meaning that IGMP does not have
to be enabled as a separate process. On the client side it is
embedded into the Operating System of the device. There are
three versions of IGMP. IGMPv1 and 2 are similar in both
protocol primitives and group representation. IGMPv2 mainly
introduces an explicit ‘leave’ message to enhance
the edge router performance profile. IGMPv3 uses newer
primitives and embeds group membership into the IGMP
message in such a way as to break traditional methods of IGMP
L2 snooping at the edge. This issue is being investigated.
IGP Interior Gateway Protocol
IID Interface IDentifier
iIS-IS Integrated IS-IS
IKE Internet Key Exchange
Proxy server An intermediary entity that acts as both a server and a client for
the purpose of making requests on behalf of other clients. A
proxy server primarily plays the role of routing, which means its
job is to ensure that a request is sent to another entity “closer” to
the targeted user. Proxies are also useful for enforcing policy (for
example, making sure a user is allowed to make a call). A proxy
interprets, and, if necessary, rewrites specific parts of a request
message before forwarding it.
PS Packet data Services
PSTN Public Switched Telephone Network
PTI Payload Type Indicator
PVC Permanent Virtual Circuit - point-to-point circuit that maintains
connection even when not in use
QAM Quadrature Amplitude Modulation - QAM is a relatively simple
technique for carrying digital information from the television
operator's broadcast center to the cable subscriber. This form of
modulation modifies the amplitude and phase of a signal to
transmit the MPEG2 transport stream. QAM is the preferred
modulation method for the cable provider companies because it
can achieve high transfer rates of up to 40 Mb/s.
QDU, Quantization Distortion Unit One QDU is the quantization
distortion associated with encoding into G.711 at 64 kb/s; the coding
impairment of other codecs is sometimes characterized by the number of
QDUs associated with the encoding/transcoding. The QDU concept assumes
an additive model, so that the quality impairment from successive
encodings can be estimated by adding together the QDUs for each
encoding. This model tends to break down when applied to speech
compression codecs and other non-linear devices.
QoE Quality Of Experience - the user's perception of service or
application quality.
QoS Quality of Service - a set of mechanisms and protocols intended
to ensure efficient use of the network resources.
QPSK Quadrature Phase Shift Keying - QPSK is more immune to noise
than QAM and consequently is typically used as the preferred
modulation for the satellite environment or on the return signaling
path for a CATV network. QPSK works on the principle of shifting the
digital signal so that it is out of phase with the incoming carrier
signal. QPSK improves the robustness of a network; however, this
modulation scheme has practical limits of around 10 Mb/s.
Signal Transfer Points (STP) This equipment forwards messages from one
signal switching point on to another signal switching point. Just like
a router in an IP network provides packet forwarding, the STP provides
message forwarding.
Signaling Point (SP) A signaling point is any node that originates or
terminates signaling messages.
SIIT Stateless IP/ICMP Translation Algorithm (used in NAT-PT)
Silence suppression Transmission of the voice path data only when
speech is present. On a voice channel, this can reduce the long-term
average data volume by about 40%. The benefit of silence suppression is
achieved in high capacity links, since the peak data rate for an
individual channel remains the same. The more channels we combine, the
less variability in the overall average data rate, allowing link
capacity to be trimmed close to the predicted average. Silence
suppression is also referred to as DTX or VAD. Implementations usually
include comfort noise generation (see CNG).
Silkroad IPv6 tunnelling mechanism for traversing NATs
SIP Session Initiation Protocol - a peer level call control protocol,
developed as an open standard by IETF, and is a direct
competitor of H.323. In contrast to H.323, it is based on Web
principles and has a simple, modular design that is easily
extensible beyond telephony applications. SIP is enjoying rapid
momentum in the industry at both the system and device level.
SIP-based call control has excellent potential for smart phone
applications, with some devices already appearing on the market.
SIP is also appropriate as the peer-level interface for call servers
and either standalone or decomposed gateways.
SIP-T SIP for Telephones is SIP with tunnelled ISUP messages
SLA Service Level Agreement
Slip An overflow (deletion) or underflow (repetition) of one frame of
a signal in a receiving buffer.
SNMP Simple Network Management Protocol
SOCKS Apparently not an acronym: a proxying technology extended to
provide IPv4-to-IPv6 translation
SOCKS64 The SOCKS IPv4-IPv6 translation proxy
SOHO Small Office – Home Office
T (time interval) The time interval used in calculating frame relay CIR and EIR
T1 Trunk Level 1 - a North American TDM transmission link
carrying 24 voice channels; equivalent to DS1.
T1X1 Subcommittee A committee within the ECSA that specifies SONET
optical interface rates and formats.
Tandem encoding Encoding and decoding of a signal two or more times through a
codec. Tandem encoding can cause significant degradation (see
asynchronous tandeming, synchronous tandeming).
TCLw Weighted Terminal Coupling Loss - a measure of the amount of
signal that a telephone end device allows to cross over from the
receive path to the send path. The weighting factor adjusts the
contribution of the various audio frequencies to the perceived
loudness.
TCP Transmission Control Protocol - defined by IETF RFC 793
TDM Time Division Multiplexing
TDMA Time Division Multiple Access - a wireless access technique in
which an RF frequency band is shared by dividing it into short
time slots, each of which carries data for a separate channel.
GSM and North American IS-54 are TDMA-based technologies.
Compare CDMA, FDMA.
TE Traffic Engineering - proactive traffic management. Two TE
protocols are available:
RSVP-TE: Resource Reservation Protocol TE
OSPF-TE: Open Shortest Path First TE
TELR Talker Echo Loudness Rating - a measure of the level of echo
present on an interactive voice call
Teredo IPv6 tunnelling mechanism for traversing NATs (Teredo is the
name of the shipworm – a marine organism that bores holes through
timber underwater and caused the demise of many wooden ships)
Terminal An H.323 Terminal is an endpoint on the network which
provides for real-time, two-way communications with another
H.323 terminal, Gateway, or Multipoint Control Unit. This
communication consists of control, indications, audio, moving
color video pictures, and/or data between the two terminals. A
terminal may provide speech only, speech and data, speech and
video, or speech, data and video.