NORTEL, NORTEL NETWORKS, NORTEL NETWORKS LOGO, the GLOBEMARK, BAYSTACK, CALLPILOT, CONTIVITY,
DMS, MERIDIAN, MERIDIAN 1, NORSTAR, OPTERA, OPTIVITY, PASSPORT, SUCCESSION and SYMPOSIUM are trademarks
of Nortel Networks.
ALTEON is a trademark of Alteon WebSystems, Inc.
ARIN is a trademark of American Registry of Internet Numbers, Ltd.
APACHE is a trademark of Apache Micro Peripherals, Inc.
APPLE, APPLETALK, MAC OS and QUICKTIME are trademarks of Apple Computer Inc.
CAPEX is a trademark of Solyman Ashrafi
CABLELABS, DOCSIS and PACKETCABLE are trademarks of Cable Television Laboratories, Inc.
C7 and CALIX are trademarks of Calix Networks Inc.
CANAL + is a trademark of Canal + Corporation
KEYMILE is a trademark of Datentechnik Aktiengesellschaft
MPEGABLE is a trademark of Dicas Digital Image Coding GmbH
CINEPAK is a trademark of Digital Origin, Inc.
ECI is a trademark of ECI Telecom Limited
DIGICIPHER and GENERAL INSTRUMENT are trademarks of General Instrument Corporation
OPEX is a trademark of Gensym Corporation
INFOTECH is a trademark of Infotech, Inc.
ESCON and LOTUS NOTES are trademarks of International Business Machines Corporation (dba IBM Corporation).
IANA and ICANN are trademarks of Internet Corporation for Assigned Names and Numbers.
NAGRA and NAGRAVISION are trademarks of Kudelski A.B.
INDEO is a trademark of Ligos Corporation
ENHYDRA is a trademark of Lutris Technologies, Inc.
SIP is a trademark of Merrimac Industries, Inc.
FORE SYSTEMS is a trademark of Marconi Communications, Inc.
ACTIVEX, NETMEETING, MICROSOFT WINDOWS, OUTLOOK, WINDOWS, and WINDOWS MEDIA are trademarks of Microsoft
Corporation
NETIQ is a trademark of NetIQ Corporation
TIMBUKTU is a trademark of Netopia, Inc.
OPNET is a trademark of OPNET Technologies, Inc.
ECAD is a trademark of Pentek, Inc.
REALAUDIO, REALNETWORKS, REALPLAYER, REALPROXY, and REALVIDEO are trademarks of RealNetworks, Inc.
PESQ is a trademark of Psytechnics Limited
POWERTV and SCIENTIFIC ATLANTA are trademarks of Scientific-Atlanta, Inc.
SILKROAD is a trademark of SilkRoad Technology, Inc.
SPRINT is a trademark of Sprint Communications Company L.P.
CDMA2000 is a trademark of Telecommunications Industry Association
NETBSD is a trademark of The NetBSD Foundation
THE YANKEE GROUP is a trademark of The Yankee Group
VERIZON is a trademark of Verizon Trademark Services LLC
BSD is a trademark of Wind River Systems, Inc.
ZENITH is a trademark of Zenith Electronics Corporation
Trademarks are acknowledged with an asterisk (*) at their first appearance in the document.
Contents
Author Biographies .................................................................................. ix
Acknowledgments ................................................................................................ xiv
Video .....................................................................................................................67
Video Impairments ................................................................................................68
Digital video impairments ......................................................................................68
Causes of video signal impairments .....................................................................69
Digital video ..........................................................................................................70
Sequences of frames ............................................................................................74
What you should have learned ..............................................................................79
Glossary .................................................................................................723
Author Biographies
Dave Anderson is a Senior Manager of Nortel Networks Wireless
Engineering and is responsible for the engineering aspects of Nortel’s
responses to global Wireless proposals and network designs. Dave has
been with Nortel since earning his B.S.E.E. in 1986 and has held a number
of Engineering positions within the company, including customer support
engineering roles for DMS switching and, more recently,
Wireless Network Engineering including aspects of Radio as well as core
network. He is familiar with the evolving global standards for wireless
systems including CDMA, EV-DO, GSM, UMTS and Wireless LAN.
Elwyn Davies is currently leading the CTO Office team setting the
strategy for the introduction of IPv6 into Nortel products. His background
includes an M.A. in Mathematics and research into aspects of Noise
Reduction in electronic and other systems. He is a regular attendee at and
contributor to the IETF in a number of areas, including network layer
signaling, and to the IRTF in routing research.
Stéphane Duval is a Product Line Manager for OME Data. He has twelve
years of experience with data infrastructure solutions design for private and
public sector organizations and extensive customer interaction, which
helped him develop his skills to deliver reliable and secure data
infrastructures.
Acknowledgments
The writing, editing, and assembly of a large textbook is a formidable task.
We are fortunate that the corporate culture at Nortel encourages
collaboration and teamwork. Many people contributed to the successful
completion of this project, whether by direct contributions to the text, by
supporting the steering committee or individual authors, by removing
obstacles, or by championing this project to the senior executives. Thank
you to everyone who helped us move forward.
A big thank you to our many reviewers. Whether they read one chapter or
many, whether they focused on technical accuracy or clarity and
readability, their feedback and suggestions have greatly improved the
quality of the published version.
Shardul Joshi, Leigh Thorpe, Steve Dudley, and Tim Mendonca, Editors
Steering Committee:
Lorelea Moore – Certification
Leigh Thorpe – Editor, CTO's Office
Shardul Joshi – Editor, Wireline Engineering
Stephen Dudley – Editor, Wireline Engineering
Tim Mendonca – Editor, Enterprise Engineering
Carelyn Monroe – Wireline Engineering
Joe King – Wireline Engineering
Michelle Bigham – Marketing communications
Ann-Marie Bishop – Project Manager
Contributors:
The following made significant contributions to the contents of this book:
Mark Armstrong
Benedict Bauer
Roger Britt
Peter Bui
James Chanco
Paul Coverdale
Steve Elliott
Matt Michels
Mustapha Moussa
Tom Taylor
Andrew Timms
Chapter 1
Introduction
Joseph King
You have an emergency. You know you can dial three digits from any
phone, anywhere, at any time, and within milliseconds you have help. Most
people don't understand how it happens, and frankly they don't care. They
just count on it to work. The network design engineer both knows and cares
how it works. Would you have the same confidence dialing that number if it
was being routed across an IP core today? Not if that IP network is the
Internet or one of about 85% of the IP networks out there today.
Consumers and engineers alike have heard the technology hype:
convergence, VoIP, triple play, interactive applications. Voice, data, and
video networks are finally becoming one. Why? Because consumers
demand it, want it and need it. It is becoming a way of life. Technology is a
driving force for the way people communicate. The paradigm has shifted.
Consumers are driving this change in technology to support the way they
want to work and live.
Convergence is occurring between real-time and non–real-time data
networks. Voice over IP is being deployed on networks that were originally
designed with a router architecture and best-effort delivery philosophy.
Many of these networks are not capable of meeting the performance quality
requirements of real-time services such as voice. Voice services are critical.
As convergence proceeds and networks begin to carry voice and other real-
time services, these networks must adapt to the mission critical nature of
those services. The people who design and operate these networks must
meet a new set of constraints. Best-effort cannot guarantee the performance
of mission critical real-time applications. Throwing more bandwidth at the
problem is not sufficient. What is needed is proper network planning and
design, which in turn requires a thorough understanding of the operation
and constraints of real-time networking and how that interacts with the
operation and constraints of IP networking.
Because convergence of real-time applications with data networking is new
ground for so many people, a group of Nortel subject matter experts has
created a real-time networking manual to serve as a shared foundation for
engineers and other professionals from various areas of the industry. As part
of this effort, Nortel has also developed a certification that is focused
purely on real-time networking: Nortel Certified Technology Specialist
(NCTS)—Real-Time Networks. It is a baseline certification in real-time
networking, intended to be as applicable to the managers of engineers as it
is to the engineers themselves.
The first step will be to introduce the concepts of convergence and real
time. Network, service, and application convergence are discussed.
Examples of real-time services are presented, and the constraints around
operating real-time services in a packet environment are discussed. The
concept of Quality of Experience is introduced as the fundamental
performance requirement for all services and applications.
To design a real-time network and assure your customers excellent Quality
of Experience, you need to understand the concepts of real-time
applications. As discussed earlier, real-time challenges network
performance. What are the performance requirements for the major real-
time applications and services, and what are the protocols and mechanisms
we use to control network behavior?
Most will agree that if the video freezes for a few seconds while you are
watching the news but the audio continues, the interruption is a mere annoyance.
However, if the reverse occurs, and the audio is lost for seconds while the
video continues, your comprehension of the news will be severely
degraded. For other content, such as sporting events, loss of audio may be
tolerable, while video loss is not. That said, interactive voice is one of the
most demanding communications services. Consequently, a significant
portion of Section I is focused on the quality of voice services and voice
codecs.
For conversational voice services on a converged network, there are many
contributing factors to the final quality. You, as a convergence engineer,
need to understand the contributions of various parameters such as delay,
packet loss, and echo. Network planning for voice is essential, and tools
like the E-Model and its associated quality metric R are invaluable in
designing and provisioning a network. Other metrics such as MOS are also
used to quantify voice performance.
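Planning tools based on the E-Model can be sketched in a few lines. The following simplified calculation uses the delay and loss impairment approximations published with ITU-T G.107 and G.113 (basic rating 93.2, G.711 loss robustness Bpl = 4.3) and the standard R-to-MOS conversion; it is an illustration of how the pieces combine, not a planning-grade implementation:

```python
def delay_impairment(d_ms: float) -> float:
    """Simplified Id term: delay begins to hurt noticeably past ~177 ms."""
    idd = 0.024 * d_ms
    if d_ms > 177.3:
        idd += 0.11 * (d_ms - 177.3)
    return idd

def loss_impairment(ppl: float, ie: float = 0.0, bpl: float = 4.3) -> float:
    """Effective equipment impairment Ie-eff for random packet loss ppl (percent)."""
    return ie + (95.0 - ie) * ppl / (ppl + bpl)

def e_model_r(d_ms: float, ppl: float, ie: float = 0.0, bpl: float = 4.3) -> float:
    """Transmission rating R, starting from the default basic rating of 93.2."""
    return 93.2 - delay_impairment(d_ms) - loss_impairment(ppl, ie, bpl)

def r_to_mos(r: float) -> float:
    """Standard mapping from the rating R to an estimated Mean Opinion Score."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1 + 0.035 * r + r * (r - 60) * (100 - r) * 7e-6

# Example: G.711 (Ie = 0) with 150 ms one-way delay and 1% packet loss
r = e_model_r(d_ms=150, ppl=1.0)
print(round(r, 1), round(r_to_mos(r), 2))
```

At 150 ms of one-way delay and 1% random loss, G.711 comes out near R = 72 and a MOS of about 3.7, showing how quickly modest network impairments consume the quality budget.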
As with voice, there are aspects of video signals that need to be understood.
Understanding the concepts of video is critical to a convergence engineer.
Impairments such as noise (luminance and chrominance), loss of
synchronization signals, co-channel interference, and RF interference are
all critical factors for video.
Real-time applications are often concerned with the transmission of signals
originating in analog mode. Sound signals because of their wave nature
necessarily begin as analog signals. An NTSC (ordinary TV) video signal
captures the information needed to reconstruct the visual display as an
analog stream. To be transported across a digital network, analog signals
must be converted to digital information by means of a codec. We discuss
the basic characteristics of codecs used for telephony (speech), those for
general audio signals, and codecs used for video signals. Also covered are
the parameters that underlie the performance of the codec from both a
human and technical perspective, common coding standards defined for
each of these areas, and the boundary conditions for the effective use of
compression codecs in real-time communications systems.
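As a concrete illustration of encoding for telephony, the sketch below implements μ-law companding, the logarithmic curve behind North American G.711. Real G.711 uses a segmented 8-bit approximation of this curve, so treat this as the principle rather than a bit-exact codec:

```python
import math

MU = 255  # companding parameter used by North American G.711

def mulaw_compress(x: float) -> float:
    """Compress a normalized sample in [-1, 1] onto a logarithmic scale."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y: float) -> float:
    """Invert the companding curve."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

def quantize_8bit(y: float) -> float:
    """Uniformly quantize the companded value to 8 bits (256 levels)."""
    return round(y * 127) / 127

# Quiet and loud samples survive 8-bit quantization with similar
# relative error, thanks to the logarithmic compression step:
for x in (0.01, 0.5):
    xq = mulaw_expand(quantize_8bit(mulaw_compress(x)))
    print(x, round(abs(x - xq) / x, 4))
```

The logarithmic curve spends more of the 256 quantizer levels on quiet samples, which is why the relative error stays roughly constant across signal levels instead of swamping soft speech.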
From the other side of the street, Service Providers need to understand
what their Enterprise customers are working with if they are going to serve
them well. Frame relay continues to be used extensively in Enterprise
networks. Packets crossing the Enterprise boundary encounter NAT. How
are these things going to affect your SLA and the final user quality? Both
Service Provider and Enterprise networks today are quite complex.
Designing a real-time network to work across the combined domain is
doubly complex.
IP was designed to be a simple protocol. A few entries in a routing table,
connect your cables, and you're up and running. It's a great concept, but the
complexities of convergence will not allow us to maintain that simplicity.
In Section IV, access, WAN, and core technologies are discussed. What are
the drivers to move to an MPLS network? What QoS mechanisms are
available in ATM and how are they invoked? What are the important things
to know about ATM, frame relay, MPLS, SONET, and Optical Ethernet
with respect to real-time networking? A convergence engineer needs to
understand these network technologies, to be able to comprehend the
concepts, and to understand their influence on real-time operation.
Convergence is happening in many places. It's already happened at Layer 1.
We are now seeing convergence at Layers 2 and 3, and VoIP is just one of
the driving factors. The characteristics of the LAN, the WAN, and of
course, the access network will come into play in the determination of the
final network performance.
Survivability can be designed in at the hardware level, but how can you
carry this through to the logical layer? You need to know, or suffer the
consequences.
Chapters 17 and 18 cover survivability at the network level. Together, let's
explore the concepts of Network Reconvergence and MPLS Recovery. In
other words, how can you build in survivability at the logical layer? This is
a key factor in successful network convergence.
The previous sections have discussed the relationship of applications,
protocols, and technologies to real-time networking. Once you get to this
point, you will have been introduced to some basic real-time services, how
packet transport affects them, and some techniques for controlling and
enhancing the performance of the packet network to meet the demands of
these real-time services. You will understand the concepts of core network
technologies and how they need to work on your existing LAN.
In Chapters 19 and 20, the concepts are all brought together. Now is the
time for the Converged Network Engineer to shine. Managing the
complexity of the converged network takes planning. This section of the
book helps consolidate the concepts you've learned, and shows how
network planning can polish real time over IP to a brilliant shine. These
chapters consider network planning for real-time voice and data, and how
to translate Quality of Service settings from one network technology to
another. In these chapters, potential issues related to real-time networking
will be described along with mitigations and best-practice engineering
guidelines.
The Nortel solution can help both Enterprise and Carrier customers move to the
converged network of tomorrow.
Many readers will come to this book with strong expertise in one or more
of the areas covered, but may have little or no familiarity with other areas.
While we assume that readers will have basic knowledge of data
networking and TCP/IP, we cover a range of topics associated with
convergence and real time. The reader can pick and choose sections and
chapters, and does not necessarily need to read the various parts in order.
If you're not planning to take the certification, it is our hope that you can
use this as a guide as you embark on the journey to convergence. We hope
that everyone will be able to get something out of this book, regardless of
their background or the environment they work in today.
The arrows used on the diagram are not meant to indicate that a process is
strictly one-way, but to illustrate the perspective of application-level data or
decision-making looking into the network. In general, the arrows point
down to indicate that the real-time application looks towards the Wide Area
Network through these layers. In many cases, there are different ways that
a real-time application could reach the Wide Area Network. Some
technologies, such as Cable and xDSL, have special Layer 2 relationships
between the Local Area Network and the Wide Area Network. Not shown
on the diagram are the encapsulation mechanisms used by these types of
applications to bridge Local Area Network level traffic through Cable or
xDSL transport and back into the core IP/ATM networks.
The diagram highlights some of the different aspects of real-time issues
that are addressed in the book, including transport protocols, session
control protocols, Quality of Service (QoS) protocols, and reliability-
related protocols. The latter have been included because real-time
applications can significantly increase the requirements for network
reliability. The braces on the right side illustrate where in the transport path
the QoS and reliability features are applied.
Conclusion
It is no longer enough to be solely a data or voice engineer. The networks
of today carry essential applications. These applications are not just data
anymore. It is about a real-time interactive world. To support convergence,
the underlying network must support real-time applications and services
with delays of less than 250 ms. Convergence demands “One Network” that
brings together all the threads into one fabric. It is no longer about pieces of
knowledge; convergence is all about how to weave all the pieces together. It
is all about building your engineering toolkit. The NCTS – Real-Time
Networks certification is part of that kit.
The Nortel Certified Technology Specialist (NCTS) – Real-Time Networks
is just the first step toward becoming a Convergence Engineer. This
certification was created not only to assist you, the IP-certified engineer,
but also to help us at Nortel. Convergence is part of our culture and our
everyday life. Nortel has built real-time capability into our converged
networks. There are advantages to the converged and real-time world, and
you, as a certified engineer, will be ready to embrace them.
I know you will find value in this study guide as you get ready for your
certification. The subject matter experts who created this book hope you
enjoy reading this guide as much as we enjoyed creating it.
Thank you and best of luck in building your tool kit.
Section I:
Real-Time Applications and Services
Let's begin by looking at the applications that run on real-time networks.
Section I examines the characteristics of real-time applications and the
issues that arise when we run them over IP. This section looks at the
applications as the user sees them, and the implications of IP transport
performance on the quality the user experiences. These chapters will give
you a detailed view of how network design and implementation decisions
can affect the user's experience.
Chapter 2, The Real-Time Paradigm Shift, defines what we mean by
Real-Time networking. Real time is defined, along with convergence in
general and some specific types of convergence. Familiar applications are
sorted into real-time and non–real-time categories, and some potentially
distinguishing features of real-time packet traffic are described. Quality of
Experience (QoE) is introduced, and its relationship to network and
application performance is discussed in detail. The chapter concludes with
a discussion of the difference between QoE and QoS (Quality of Service).
Chapters 3 and 4 take a look at issues around quality in two popular
applications, voice and video. Voice Telephony continues to be the “Killer
Application” of telecommunications. Examining voice impairments
provides a good reference point for us to understand both the implications
of IP network behavior on real-time applications, and how to interface IP
networks with the existing TDM network. Chapter 4 looks at video and the
impairments to the image that can result from IP transport.
Chapter 5, Codecs for Voice and Other Applications, introduces
digitization and encoding, which make it possible to put analog signals
over digital networks. The discussion here provides some background on
(1) voice and video analog signals and how characteristics of those signals
translate into digital mode, (2) codecs that are used to remove redundancy
to reduce the amount of data needed to carry the signal information, and (3)
how various errors and disturbances of the compressed digital signal affect
the reconstituted analog output. Common coding standards for telephony,
audio streaming, and video streaming and conferencing are summarized,
and guidelines for selecting a codec for VoIP are provided.
Chapter 2
The Real-Time Paradigm Shift
Concepts Covered
Telecommunications convergence
Types of convergence: network convergence, service convergence,
and application convergence.
Convergence removes constraints for users but adds constraints
for network operators
Real-time telecommunications
How to separate real-time and non–real-time applications and
services
Service quality and performance
Quality of Experience (QoE)
Measuring QoE
Quality of Service (QoS)
Introduction
For more than a decade, telecom scholars have been publicizing the
advantages of convergence to multiservice networks; anticipated benefits
range from cost savings from operating a single network infrastructure to
productivity gains and/or new revenue from advanced services. Depending
on who you talk to, next generation networks are expected to reduce capital
expenditures, reduce operating expenses, increase revenues, decrease user
cost of telecom services, improve quality, reduce quality, increase the
reliability and survivability of the network, increase competition among
carriers, and reduce churn in the customer base. Only time will tell which
of these predictions are correct, but one thing is certain: meeting user
requirements and expectations on converged packet networks is an
enormous challenge. The crucial component of this challenge is making
real-time services operate over a packet infrastructure.
This chapter introduces the concepts of convergence, real-time operation,
and Quality of Experience (QoE). No matter what network they run on,
real-time services like voice telephony1 and video conferencing require
careful engineering to deliver acceptable performance. Convergence of
applications and services means that real-time and non–real-time functions
must share a common network environment and/or run side-by-side within
the same application. Mixed traffic types from services and applications
with differing requirements can end up bumping heads as the traffic moves
across the network.
Successful deployment of converged networks depends on careful planning
and engineering. The building blocks of this success include an
appreciation of the characteristics and constraints of the services and
applications the network will support, as well as a solid understanding of
the network environment. A solid understanding includes knowledge of the
transport technologies, the protocols, the available choices for
implementation, and issues around interconnection with other networks.
The details of network implementation will affect network performance
and resiliency. The balance of this book reviews these building blocks, the
choices and options available around network architecture, deployment of
services on IP infrastructure, and design guidelines for achieving the
performance and reliability users and network operators need from real-
time services.
The criteria for successful performance are based on user QoE. It doesn’t
matter how smoothly packets move through your network if the users find
that services and applications don’t meet expectations. Planning must
address the factors that underlie QoE for each service that runs on the
network, as well as any interactions or inconsistencies between them.
This book introduces real-time networking and many of the real-time
concepts.
What is convergence?
Narrowly defined, the term convergence refers to the merging of traffic
from two or more separate networks onto a single network. At present, we
are witnessing the convergence of traditional voice traffic (consisting
mostly of standard voice telephony) and LAN-based data traffic (consisting
mostly of computer communications such as e-mail and file transfer) onto a
common packet-based infrastructure. More broadly, convergence is used to
describe the fusion of function across all aspects of communications. Three
main kinds of convergence have been defined:
Network convergence–Combining network traffic from different
services (for example, voice, video, data) on one infrastructure
Service convergence–Combining previously distinct services (for
example, wireline and wireless voice; wireless voice and short
message service) into a single service
Application convergence–Merging of previously distinct
applications into a coordinated suite (for example, multimedia,
collaboration, and the integrated desktop)
Although convergence has recently gained prominence and notoriety, it is
not a recent phenomenon. The telecommunications industry recognized
over thirty years ago that the hierarchical network architecture of the voice
network could not continue to grow indefinitely. The introduction of digital
networking in the 1970s made it possible for the network to carry non-
voice data along with digitized voice. This trend continued in the 1980s
with frame relay and ATM. On the data side, private and then public
multiprotocol best-effort networks followed, and forerunners to the IP
protocols emerged. On the voice side, FAX and voice-band data services
ran on a common infrastructure with traditional voice. More recently,
convergence continues with the introduction of Storage Area Networking
(SAN) and IP telephony; these bring with them the more stringent
requirements of real-time operation and business-critical reliability.
Network convergence
Network convergence brings all types of traffic (voice, audio, video, and
data; bearer and signaling) onto a single network infrastructure.
Such convergence may occur at the level of the transfer protocol (for
example, IP), the data link protocol (for example, Ethernet), and/or the
physical medium (for example, optical fiber).
The overwhelming presence of the Internet has led to the choice of IP as
the converged transmission environment for both Service Providers and
Enterprises. The advantages of this choice include the economics of a
single network platform as well as seamless connectivity with existing
Internet infrastructure. The IP protocol suite now includes higher level
protocols for all forms of data applications, for audio and video streaming,
and for real-time applications such as telephony and conferencing.
Service convergence
Convergence enriches telecommunications by bringing together familiar
features and services that traditionally operate on different systems. For
example, a user’s wired phone and cell phone can share a single number.
Paging, voice messaging, instant messaging, e-mail, and FAX can be
managed by a single agent device. User mobility can be enhanced by
wireless LAN and follow-me services; various media (voice, pager, PDA,
e-mail) converge onto a single user device. At the same time, the total cost
of ownership of the supporting network may be reduced.
Service convergence brings with it new client devices, communications
servers, and media gateways. It may be realized in a fully distributed
system, running on top of an IP network (service provider, large
Enterprise), or through an integrated office-in-a-box (small Enterprise). It
may be realized as an evolution from an existing installed base, as a stand-
alone system, or as a managed or hosted solution. Users want service
convergence without compromising their familiar telephone operation.
They want the same features/functionality, voice quality, security, and
reliability they are getting from their current individual services. Services
converged on a carefully engineered packet infrastructure can deliver this,
and more.
Service convergence enables a highly mobile and distributed work force.
You can use any IP desktop phone: register, and your desktop is where you
are. You can work at home or in a wireless LAN hot spot while you run a
Session Initiation Protocol (SIP) client on your laptop, have your phone
number and telephony features with you, and make secure calls over the
Internet. Or you can have system-wide roaming for your IP wireless
telephone or telephony-enabled PDA. Service convergence will ultimately
allow voice and data roaming across the WAN, bringing down the
boundaries between Enterprise wireless LAN systems and public wireless
services.
Application convergence
The full potential of IP multimedia networking will bring significant
changes in how people communicate and collaborate. Application
convergence can do for person-to-person communications what browsers,
HTML, and Domain Name System (DNS) have done for information
access and transaction services. It will put the end-user back in control of
the communications space, enhance how users collaborate with colleagues,
and enrich how Enterprises communicate with their customers.
Application convergence is realized through the development of
anticipatory, media-adaptive, and time-sensitive applications. Employee-
facing converged applications allow Enterprises to create distributed teams
to address business opportunities and challenges more effectively and
dynamically. Customer-facing converged applications serve to strengthen
Enterprise/customer relationships and leverage investments in contact
centers and self-service applications with integrated databases and back-
office systems. Converged applications will form one facet of the new
revenue-generating services that Service Providers are anticipating.
Real-time processes
You may be familiar with the term real time with respect to computing
operations. There, real time is used to describe processes that take more or
less continuous input and run fast enough to keep up with the rate at which
new input arrives. If the process does not run fast enough, the input backs
up. The execution time of the process is not critical, but the rate at which
new input arrives determines the minimum throughput rate. Take the
digitization and compression of an analog video signal as an example: the
codec must be able to accept and process video frames at the given rate of
the analog input. Variability in input rate can be buffered out, but the
process needs to keep abreast of the average rate to perform as a real-time
computing process.
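The distinction between average throughput and per-frame timing can be seen in a toy simulation. In the sketch below, the 30 frame-per-second input rate and the per-frame processing times are invented for illustration; the simulation tracks how much work is still queued as frames arrive at a fixed rate:

```python
def backlog_after(arrival_interval_ms, proc_times_ms):
    """Work (in ms) still queued when the final frame arrives, given one
    new frame every arrival_interval_ms and the listed per-frame times."""
    finish = 0.0  # time at which the codec next becomes free
    for i, proc in enumerate(proc_times_ms):
        arrive = i * arrival_interval_ms
        finish = max(finish, arrive) + proc
    last_arrival = (len(proc_times_ms) - 1) * arrival_interval_ms
    return max(0.0, finish - last_arrival)

interval = 1000 / 30  # ~33.3 ms between frames at 30 frames per second

# Mean processing time (30.4 ms) below the frame interval: the buffer
# absorbs the per-frame jitter and the backlog stays bounded.
keeping_up = [20, 45, 25, 40, 22] * 100

# Mean processing time (41.2 ms) above the frame interval: input backs
# up without limit, so this is not a real-time process.
falling_behind = [40, 45, 38, 42, 41] * 100

print(backlog_after(interval, keeping_up))
print(backlog_after(interval, falling_behind))
```

Individual frames may take longer than one frame interval without harm; what breaks real-time operation is an average processing time that exceeds the input rate.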
Real-time networking
As a real-time process, real-time networking requires a minimum
throughput rate defined by the operation of the application. In networking,
we refer to the throughput capacity of a transport path as the bandwidth2. In
contrast to real-time computing processes, however, the execution time is
also a key factor. The “execution time” of a real-time networking service or
application is the end-to-end (one-way) delay. This delay is made up of
both processing (for example, time needed to parse input, execution time
for computations, and other operations), and transport delays (mostly
queuing, buffering, and propagation time).
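One practical way to work with that end-to-end delay is as an explicit budget over its components. The component values below are illustrative assumptions for a single VoIP call over a long continental path; the 150 ms comparison point is the one-way delay that ITU-T G.114 recommends for highly interactive voice:

```python
# Hypothetical one-way delay budget for a VoIP call (all values in ms)
budget = {
    "encoding (20 ms frame + lookahead)": 25.0,
    "packetization and serialization":     2.0,
    "access and core queuing":            15.0,
    "propagation (~5000 km of fiber)":    25.0,
    "jitter buffer at the receiver":      40.0,
    "decoding and playout":                5.0,
}

total = sum(budget.values())
for name, ms in budget.items():
    print(f"{name:38s} {ms:6.1f} ms")
print(f"{'total one-way delay':38s} {total:6.1f} ms (target: 150 ms)")
```

With these assumptions the budget closes at 112 ms, leaving some headroom, but notice how much of it is consumed before the packet ever reaches the wide area network.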
Among other functions, networking processes mediate information transfer
between the endpoints. As well as bandwidth, sufficient to keep up with the
2. The term bandwidth is derived from the relationship between the frequency bandwidth of an analog
carrier and the maximum rate at which the carrier can be modulated to signal one bit of information.
The broader the bandwidth, the faster the maximum modulation rate, and so the more bits can be sent
per unit time.
3. As usual, in reality, things are more complex. Increasing bandwidth in the network does have some effect
on the total delay. First, there may be alternatives available at higher bandwidth (such as higher-rate
codecs that can reduce processing time). Second, where congestion is occurring, increasing the total
bandwidth available can reduce queuing in the network, which in turn decreases the delay.
[Figure 2-1 appears here: a chart mapping applications onto axes of one-way
delay (100 ms to 100 sec) and packet loss (0% to 10%), grouped into four
delay classes: Interactive (conversational voice and video; command/control
such as Telnet and interactive games), Responsive (voice and video
messaging; transactions such as e-commerce, web browsing, e-mail access),
Timely (streaming audio/video; fax), and Non-critical (background traffic
such as Usenet, e-mail delivery, paging, downloads).]
Figure 2-1: Sensitivity of applications to delay and loss of data (from Rec.
G.1010, End User Multimedia QoS; figure reproduced with the kind
permission of ITU)
Aside from delay and bandwidth requirements, there are differences
between real-time and non–real-time flows. Real-time applications often
use small, more regularly generated packets, and for many, the flows may
last minutes or even hours. Simultaneous two-way traffic is common. Voice
traffic, for example, consists of small packets carrying the speech signal
that are generated on a regular schedule. Voice packet generation is
predictable either deterministically (where both speech and silence are
sent), or statistically, according to the distribution of conversational
utterances (where silence suppression is used). Video conferencing
services will have larger packets, but these are still generated regularly,
compared to the short, bursty traffic associated with data communications
such as file transfer or e-mail delivery. The characteristics of interactive
command-and-control games show bursty traffic of small packets containing
the joystick movements or mouse clicks associated with rapid-fire play.
The packet generation statistics depend on the characteristics of the
particular game.
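The regular packet schedule for voice can be quantified directly: a constant-bit-rate codec with a fixed packetization interval produces a fixed payload size at a fixed packet rate. For example, G.711 at 64 kb/s with 20-ms packetization yields 160-byte payloads at 50 packets per second:

```python
def voice_packet_stats(codec_rate_bps: int, packet_interval_ms: int):
    """Payload bytes per packet and packets per second for a
    constant-bit-rate codec with a fixed packetization interval."""
    payload_bytes = codec_rate_bps * packet_interval_ms // (1000 * 8)
    packets_per_sec = 1000 // packet_interval_ms
    return payload_bytes, packets_per_sec

print(voice_packet_stats(64_000, 20))  # G.711 @ 20 ms -> (160, 50)
print(voice_packet_stats(8_000, 20))   # G.729 @ 20 ms -> (20, 50)
```

Note that these are payload figures only; RTP, UDP, IP, and link-layer headers add a fixed overhead per packet.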
In contrast, most non–real-time computing data traffic is highly bursty and
consists of large packets. Flows are generally shorter-lived and have a
back-and-forth nature (one direction, then the other) and the amount of data
transferred may be highly asymmetric, where the return traffic consists
mostly of TCP acknowledgements, user commands, and so on. Streaming
flows consist of regularly generated packets, more like telephony flows,
than like typical data flows. However, streaming flows are unidirectional
and use larger packets than are usually found in interactive applications.
Network signaling traffic is different again. Signaling traffic is usually
time-sensitive, and is often associated with session setup or another system
function. Similar to other real-time traffic, signaling is comprised of small
packets. However, flows are short, and the pattern is back-and-forth, rather
than simultaneous traffic on the two paths.
Table 2-1 provides a point-by-point comparison of the characteristics of
real-time and non–real-time traffic through a packet network.
Real-time services
We can use the preceding definitions to categorize common services and
applications. Figure 2-2 classifies many applications as real-time (right,
shaded background) versus non–real-time (left, white background). In
addition, the diagram also differentiates applications by whether a human
or an automated application is at each end of the exchange.
[Figure 2-2 appears here: a classification of applications such as password
delivery, screening, process monitoring, file transfer, network computing,
security monitoring, remote command and control, move/response games,
eCommerce, SMS, and video, arranged by real-time versus non-real-time
behavior and by human versus application endpoints.]
What is QoE?
Quality of Experience (QoE) is the user’s perception of the performance of
a device, a service, or an application. User perception of quality is a
fundamental determinant of acceptability and performance for any service
platform. The design and engineering of telecommunications networks
must address the perceptual, physical, and cognitive abilities of the humans
that will use them; otherwise, the performance of any service or application
that runs on the network is likely to be unacceptable4. Successful design
4. Without proper understanding of user requirements, there is a risk of both under-engineering, where the
network fails to meet the needs of the users, and over-engineering, where the specifications go
beyond the user’s needs, needlessly driving up the cost to provide the device or service.
[Figure 2-3 appears here: QoE at the center of a ring of contributing
factors, including ease of use, reliability, usability, availability,
progress indicators, dexterity, responsiveness, security, clear sound and
picture, and efficiency of use.]
Figure 2-3: Some of the factors influencing the QoE of a service, application,
or device
Efforts are more successful where QoE is an integral part of the design
process. Retrofitting to improve low QoE is likely to be difficult,
expensive, or inadequate. For example, external echo cancellers are more
expensive than integrated echo control. Tweaking the network to reduce
delay may achieve some minor improvement, but many sources of delay
will be hard-coded and therefore inaccessible to tuning. What does this
mean for buyers of real-time converged networks? Vendors whose
performance targets are derived from a comprehensive set of QoE
parameters, and whose design intent begins with these targets are likely to
achieve better overall QoE. Vendor selection criteria should include the
vendor’s attention to QoE, as well as system reliability and cost.
Measuring QoE
Aside from the obvious grossly malfunctioning cases and user complaints,
how can we determine the level of QoE our network or service provides?
Quality of Experience is a subjective quantity and can be measured directly
using behavioral science techniques. QoE can be measured in a laboratory
[A figure appears here: a rating scale running from Unacceptable through
Acceptable to Excellent.]
When user needs are identified and the device is properly engineered to
meet them, the resulting device will have high QoE.
What is QoS?
Quality of Service (QoS) refers to a set of technologies (protocols and
other mechanisms) that enable the network administrator to manage
network behavior by avoiding or relieving congestion, expediting time
sensitive packets, limiting access to congested links, and so on.
The aim of QoS mechanisms is to ensure efficient use of network
resources. The alternative is overprovisioning capacity, which may not
solve the problem of contention for specific resources, and, as we have
discussed earlier, may not improve the performance of real-time
applications. Quality of Service mechanisms do not create bandwidth but
instead manage available bandwidth more efficiently, especially during
peak congestion periods. Congestion occurs when a node or a link reaches
its maximum capacity, that is, when the sum of ingress traffic at a given
node exceeds the egress port capacity. QoS mechanisms may not be
sufficient or effective in a network that is continually congested; to address
this, redimensioning may be necessary.
An important aspect of QoS is assigning packet priorities corresponding to
specific service classes (for example, with specific payload types) or within
specific flows (for example, User X vs. User Y). QoS mechanisms can raise the priority
of a given class of packets or a given flow or limit the priority of competing
flows. Providing differentiated services requires first determining the
desired services and user performance requirements, and second defining
and evaluating the appropriate QoS mechanisms required to balance the
resulting traffic. QoS mechanisms allow us to manage network
performance (for example, bandwidth, delay, jitter, loss rate, and response
time) to maintain stable, predictable network behavior.
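The priority treatment described above can be sketched with a strict-priority scheduler: the highest-priority queued packet is always transmitted first. The class numbering and traffic names below are illustrative assumptions, not taken from any QoS standard:

```python
import heapq

class PriorityScheduler:
    """Strict-priority scheduler sketch: lower class number means
    higher priority (here, 0 = voice, 2 = best-effort data)."""
    def __init__(self):
        self._q = []
        self._seq = 0  # FIFO tie-break within a class

    def enqueue(self, service_class: int, packet: str) -> None:
        heapq.heappush(self._q, (service_class, self._seq, packet))
        self._seq += 1

    def dequeue(self) -> str:
        return heapq.heappop(self._q)[2]

s = PriorityScheduler()
s.enqueue(2, "email-1")
s.enqueue(0, "voice-1")
s.enqueue(2, "email-2")
s.enqueue(0, "voice-2")
order = [s.dequeue() for _ in range(4)]
print(order)  # time-sensitive voice packets are expedited ahead of data
```

Real routers typically combine such expedited queues with weighted fair queuing so that low-priority classes are not starved.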
Conclusion
The main challenge of converged networks is to create a network
environment that allows all the services and applications it carries to
perform well, regardless of whether they are real-time or non–real-time.
The combination of real-time applications with traditional non–real-time
computing data applications on a single network can result in a widely
variable packet traffic characteristic. The network must be able to
comfortably carry many different types of traffic without degrading any of
the applications riding on them. Combined with the delay requirements for
individual real-time applications, the design challenge is formidable
indeed. Table 2-2 summarizes the demands of converging real-time and
non–real-time services and applications onto a common packet
infrastructure.
References
ITU-T Recommendation G.1010, End-user Multimedia QoS Categories,
Geneva: International Telecommunication Union Telecommunication
Standardization Sector (ITU-T), 2001.
ITU-T Recommendation G.114, One-way transmission time, Geneva: ITU-
T, 2003.
ITU-T Recommendation G.107, The E-Model, a computational model for
use in transmission planning, Geneva: ITU-T, 1998.
ITU-T Recommendation P.800, Methods for subjective determination of
transmission quality, Geneva: ITU-T, 1996.
Chapter 3
Voice Quality
Leigh Thorpe
[Chapter-opening diagram: the protocol stack viewed from a real-time
application perspective, spanning session and gateway control (SIP, H.323,
RTSP, H.248/MGCP/NCS), media transport (voice, audio, and video over codec,
RTP, and RTCP), network QoS and resiliency (MPLS), and underlying
transports (AAL1/2, AAL5, ATM, frame relay, Ethernet, DOCSIS cable,
SONET/TDM).]
Concepts covered
Characteristics of conversational voice services (“voice”)
Steps involved in transporting voice over an IP network
The main factors affecting VoIP conversation quality
Effects of delay and jitter
Effects of packet loss, and its mitigation using packet loss
concealment
Effects of echo, and its mitigation through control of signal level and
echo cancellation
Quality metrics including MOS, PESQ, and E-Model R
Introduction
Everyone uses the telephone every day and has well established
expectations about how it should work. What users may not realize is that
the standard voice call consists of two narrow-band (300-3400 Hz) sound
channels, one in each direction, and that these operate independently. This
means that even if the two users talk simultaneously, each will be heard at
the other end. This operating mode is called full-duplex. The situation is
more complex for advanced features like handsfree (speakerphone) or
conferencing, but for now we will limit our discussion to the simple desk-
to-desk handset call.
Traditional voice networks have provided very high quality voice, setting a
high benchmark for IP voice services. In this chapter, we will discuss the
main factors that contribute to voice quality on converged networks. Voice
is a demanding real-time application. How will typical IP network behavior
affect Quality of Experience (QoE) for conversational voice?1 What are
the challenges we face in making voice meet user expectations, and what
can be done to ensure we meet them?
1. The subjective quality of a voice call is based on many parameters, some of which involve how the
output sounds (for example, level, distortion) and some of which involve the conversation dynamics
(for example, echo, round-trip delay). For the purposes of the present discussion, these are combined
under the designation voice quality.
2. An echo canceller is used in this example, but it is not the only option. Other methods of echo control
may be used to control local echo in the phone. This is discussed in “Echo control for VoIP”.
network routers. The signal may pass through several routing nodes before
reaching the packet receive side.
[Figure 3-2 appears here: a block diagram of the voice path with reference
points A through G, spanning the packet side (terminal, packet transport,
core network) and the synchronous, non-packet side (media gateway, TDM
network).]
Figure 3-2: Block diagram of the processes making up the VoIP voice path
In this example, the packet receive side is a media gateway. There, the
packets are delivered and stored in a buffer called the jitter buffer. The jitter
buffer applies a short delay to the data to ensure that a steady stream of data
can be sent to the decoder. Packets are unbundled (D), and the data
reassembled into a synchronous stream. If compression was used, the
signal is decoded. The output of this process is a G.711 bit stream (E),
which is handed over to the TDM network. The echo canceller shown is a
network canceller, and it is essential at the interface between an IP network
and a TDM network where analog access lines may be in use. A loss pad
(F) at the output may be needed to match the loss plan of the packet
network to that of the TDM network. When the G.711 stream reaches the
end of the TDM network (detail not shown), it is converted back to analog,
and the analog signal travels over the local line to the telephone at the far
end (G).
Certain special features are included in the diagram. Speech Activity
Detection (SAD) indicates a silence suppression function, which may be
used to determine whether the data contains speech, which is sent across
the packet network, or only silence, which is not sent. PLC refers to Packet
Loss Concealment. PLC is a process by which the output G.711 bit stream
is repaired to smooth over the gaps left by any missing data.
Note that all Digital Signal Processing (DSP) components (codecs, echo
cancellers, silence suppression, and packet loss concealment) are situated
in the synchronous (non-packet) portion of the path. The speech data are
not read or modified in the packet portion of the network.
Although Figure 3-2 shows a particular connection from an IP phone to a
conventional phone, other connections (for example, IP-to-IP, wireless-to-
wireless over IP) are similar. Only minor changes to the diagram would be
needed to describe most alternate VoIP scenarios.
[A figure appears here: an end-to-end VoIP connection from an IP phone
through the enterprise access network, L2 switch, edge router, packet core,
and media gateway (TDM handoff) to the PSTN. Non-controllable parameters:
transmission delay, network jitter. Controllable parameters: source jitter,
queue size, voice/data loading.]
Speech codec
The speech codec chosen will have a strong influence on the final obtained
quality, both because of the baseline quality of the codec (that is, the
quality of the codec without other impairments) as well as the response of
the codec to other factors, such as presence of background noise, packet
loss, and transcoding with itself or another codec. The choice of codec is an
important determinant of the overall performance of VoIP. See Chapter 5
for additional discussion of the contribution of various telephony speech
codecs to VoIP service quality.
End-to-end delay
The end-to-end delay of a voice signal is the time taken for the sound to
enter the transmitter at one end of the call, be encoded into a digital signal,
travel through the network, and be regenerated by the receiver at the other
end. Delay is sometimes called latency. When delay is too long, it may
cause disruptions in conversation dynamics. As well, increasing delay
makes echo more noticeable.
Variation in delay, caused by differences in the time taken for packets to
cross the network, is called jitter. Jitter is a concern because the decoding
of the digital signal is a synchronous process and must proceed at the same
constant pace that was used during encoding. The data must be fed to the
decoder at a constant rate. Variation in packet arrival times is smoothed out
by the jitter buffer, which adds to the end-to-end (mouth-to-ear) delay.
Jitter is not considered a separate impairment because the effects of jitter in
the packet network are realized in the output either as delay or as distortion
from packet loss.
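The trade-off above can be made concrete with a minimal fixed-delay jitter-buffer sketch: packet i is played at the first packet's arrival time plus the buffer delay plus i times the packetization interval, and any packet arriving after its playout instant counts as lost. The timings below are illustrative assumptions:

```python
def playout(arrivals_ms, interval_ms=20, buffer_ms=40):
    """Fixed jitter-buffer sketch: packet i plays at
    first_arrival + buffer_ms + i * interval_ms; packets arriving
    after their playout instant are discarded (late loss)."""
    base = arrivals_ms[0] + buffer_ms
    played, lost = [], []
    for i, t in enumerate(arrivals_ms):
        deadline = base + i * interval_ms
        (played if t <= deadline else lost).append(i)
    return played, lost

# packet 2 is delayed well past its 80-ms playout slot and is lost;
# a larger buffer_ms would save it, at the cost of added delay
arrivals = [0, 21, 130, 61, 80]
print(playout(arrivals))  # -> ([0, 1, 3, 4], [2])
```

Raising `buffer_ms` converts late loss into added mouth-to-ear delay, which is exactly the trade-off an adaptive jitter buffer tries to optimize.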
Packet loss
In VoIP, packets sometimes get lost. Packets may be dropped during their
journey across the network, or more commonly, they are late in arriving at
the destination and miss their turn to be played out. The missing
information degrades the voice quality, and a Packet Loss Concealment
(PLC) algorithm may be needed to smooth over the gaps in the signal.
Echo control
Because of the longer delay introduced by VoIP, echo control is a major
concern. A given level of echo sounds much worse when the delay is
longer. Echo control at the appropriate places in the connection will protect
the users at both ends. Echo control relies on the correct signal levels (see
Signal Level, below) as well as on echo cancellers and other techniques
that prevent or remove echo from the connection.
Signal level
The level or amplitude of the transmitted speech signal is determined by
amplitude gains and loss across the network. There are a number of
contributors to the final signal level, and most are defined in the loss plan
(sometimes called the loss/level plan) of the network. The loss plan for
TDM ensures that the output speech is heard at the proper level and
contributes to the control of echo. The loss plan for VoIP is reasonably
simple; the sensitivities of the sending device (say, an IP phone) and the
receiving device (say, a media gateway) are defined by standards, and there
is no gain or loss in the packet portion of the network.
[A figure appears here: R plotted against one-way (mouth-to-ear) delay from
0 to 500 ms, with regions marking the progression from no degradation from
delay to significant impairment from delay.]
3. The jitter buffer for VoIP was once described with a single value (delay in ms), and the recommended
size was twice the packet size. This formulation does not adequately specify either the buffer capacity
(which may need to be higher than two packets to prevent packet loss through buffer overflow
following a congestion event) or the wait time (which should be much lower where jitter behavior
allows).
Sources of delay
The end-to-end delay is the total of all delays incurred in the voice path.
The principal sources of delay are summarized in Table 3-1. The four main
categories of delay are shown in the left-hand column.
Processing delay is an inevitable part of VoIP. Voice packet payloads
contain speech associated with a chunk of time, and the system must wait
for that speech to accumulate before it can be put in a packet. The packet
can not be loaded and sent until all the speech for that chunk is collected.
Where speech compression is used, the time needed for coding is added as
well. The speed of any processors (DSP, CPU) involved also contributes to
the final delay.
Serialization delay (the time needed to push a packet onto the wire) is a
small but predictable contribution. It is determined by the channel speed
(bits/sec) and the number of bits in the packet. On high speed links (> T1)
serialization delay becomes negligible compared to other sources of delay.
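Serialization delay is a simple calculation: the number of bits in the packet divided by the channel speed. A sketch (the 200-byte packet size is an illustrative assumption for payload plus headers):

```python
def serialization_delay_ms(packet_bytes: int, link_bps: int) -> float:
    """Time to clock one packet onto the wire, in milliseconds."""
    return packet_bytes * 8 * 1000 / link_bps

# 200-byte voice packet (payload plus headers, illustrative size)
print(round(serialization_delay_ms(200, 1_544_000), 3))    # T1: ~1.036 ms
print(round(serialization_delay_ms(200, 100_000_000), 4))  # 100 Mb/s: 0.016 ms
```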
Queuing delay accumulates at network nodes (routers and switches)
across the network. Congestion can increase packet waiting times in
buffers. Variation in queuing and buffering delays in the network account
for most of the variation in packet transport times (that is, jitter). The jitter
buffer wait time is another instance of queuing delay.
Propagation delay is the time taken for the signal to travel through a cable
or fiber. In the conventional public network, propagation delay is the
largest contributor to end-to-end delay. For international calls, propagation
delay through terrestrial circuits can exceed 100 ms, so it remains an
important contributor to VoIP delay.
Propagation delay across a fixed distance is not a controllable parameter,
since it is determined by the speed of the signal through the transmission
medium (usually light through a fiber). However, it is possible to ensure
that packets take the most direct route through the network to minimize
queuing and propagation delay. Note that where the shortest route is
congested, queuing delays on that route may exceed the additional time
needed to take an alternate route.
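Since end-to-end delay is the sum of these components, a simple budget can be sketched. Propagation in fiber runs at roughly two-thirds the speed of light, about 5 microseconds per kilometer; the other component values below are illustrative assumptions, not measurements:

```python
def one_way_delay_ms(packetization_ms, coding_ms, serialization_ms,
                     queuing_ms, jitter_buffer_ms, distance_km):
    """End-to-end delay budget sketch: total is the sum of the
    component delays; propagation is ~5 us/km in fiber (~2/3 c)."""
    propagation_ms = distance_km * 5.0 / 1000.0
    return (packetization_ms + coding_ms + serialization_ms +
            queuing_ms + jitter_buffer_ms + propagation_ms)

# Illustrative national call: 20-ms packets, 10 ms coding, 1 ms
# serialization, 5 ms queuing, 40 ms jitter buffer, 4000 km of fiber
print(one_way_delay_ms(20, 10, 1, 5, 40, 4000))  # -> 96.0 ms
```

A budget like this makes it easy to see which components are tunable (jitter buffer, queuing, packetization) and which are not (propagation over a fixed route).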
Distortion
The remaining VoIP impairments to the conversation quality are different
types of distortion. These are summarized in Table 3-2. Codecs are
included in the table, but the details are discussed in
Chapter 5. Signal level is included with echo, since signal level through the
network plays an important part in the control of audible echo.
Packet loss
Packet loss can be a significant source of distortion to VoIP. Lost packets
create gaps in the voice data, which can result in clicks, muting, and
artifacts associated with attempts at smoothing and repair. Non–real-time
data transmission is robust to packet loss because packets can be resent.
Delay-sensitive applications such as interactive voice can not wait for the
time it takes to resend.
Generally, there are two ways that packets can be “lost.” The first way is
that some packets never make it to the destination. They may be lost at
network nodes either through a buffer overflow at a congested network
node (insufficient memory to store packets waiting for forwarding), or
because a congested router deliberately discarded them to reduce packet
load. These packets are truly gone, and will never arrive at the destination.
Disabled devices or fiber cuts can also result in lost packets, until the
network responds by establishing an alternate route. Packets lost in these
ways will be spread across all the flows being handled at the time, so losses
on individual channels are likely to be small.
The second type of loss is that packets arrive too late. Queuing and other
network delays can cause variability in packet arrival time at the receiving
end. The jitter buffer smooths out the variability by holding packets for a
fixed wait time relative to the expected arrival time before they are sent to
the decoder. The jitter buffer waiting time determines the longest time that
a packet can take to arrive. Packets delayed longer than this lose their turn
in line, and are as good as lost since the voice playout can not wait for the
late data to show up. The total packet loss will be a combination of losses
from these two sources.
When significant congestion occurs at a transmission node, packets may be
held up long enough that the decoder uses up all the data waiting in the
jitter buffer, resulting in underflow. When the congestion clears, the several
packets that have been backed up are forwarded quickly, one after another,
with the possibility that there may be more packets arriving at the jitter
buffer than the memory can hold. When this happens, packets are lost
because there is no room to store them (jitter buffer overflow).
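Total packet loss is therefore the sum of the two mechanisms: packets dropped inside the network plus packets that arrive too late to be played out. A sketch (the counts and transit times below are illustrative assumptions):

```python
def total_loss_pct(sent, dropped_in_network, transit_times_ms, max_transit_ms):
    """Total loss sketch: network drops plus late arrivals, as a
    percentage of packets sent. A packet whose transit time exceeds
    max_transit_ms misses its playout turn and counts as lost."""
    late = sum(1 for t in transit_times_ms if t > max_transit_ms)
    return 100.0 * (dropped_in_network + late) / sent

# 100 packets sent, 2 dropped en route; the jitter buffer tolerates
# up to 60 ms of transit time, so 3 of the 98 arrivals are too late
transits = [30] * 95 + [70, 75, 80]
print(total_loss_pct(100, 2, transits, 60))  # -> 5.0 (%)
```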
FAX is generally affected during the FAX handshake, where a lost packet
may result in a failed call attempt. Modem calls can have similar setup
difficulties, and may be subject to data rate downspeeding or call drop if
packet loss is encountered during data transfer.
Controlling echo
Talker Listener
A B
Talker Echo A's voice
(Delay > 5 ms)
A's voice, delayed
Talker Listener
A B
Listener Echo A's voice
Talker Echo
A's voice, delayed
Figure 3-5: The type of echo is named after who hears the echo.
Because echo cannot be generated in the digital portion of the path, the
only sources of echo in all-digital networks such as ISDN, cellular, and
packet networks (for example, IP, ATM, frame relay) are audio and
acoustic coupling in the end device. These echoes are best controlled in the
end device itself, and TIA-810-A gives requirements for the maximum
allowable coupling in an end device (TCLw), which applies to any
handsets, headsets, and speakerphones used on wireline digital networks,
including IP.
The degree to which echo impairs a conversation depends on two main
factors: the level (loudness) of the echo and the time it takes the echo to
come back (delay). Other sound (such as the talker's own voice, the far
user's voice, room noise, circuit noise) may mask the echo and change the
threshold of audibility. Figure 3-6 shows the quantitative relationship
between the level of the echo (measured in dB TELR) and delay (expressed
as the mean one-way delay of the echo path), for a talker in a quiet location.
The echo delay refers to the delay between the talker and the reflection
point. Since the reflection point is usually a hybrid in the access circuit at
the other end of the call, the echo path delay is typically the same as the
end-to-end delay.
Echo level is expressed using the Talker Echo Loudness Rating (TELR),
which is a measure of how much attenuation is applied to an echo along the
echo path, weighted for the perceptibility of the frequency components of
the echo. TELR accounts for
all gains and losses in the echo path (including those supplied by an echo
canceller or echo suppressor). The computation of TELR also takes into
account the sensitivity of human hearing to the sound frequencies making
up the echo. TELR measures the loss (attenuation) of an echo rather than
its absolute level, and is thus independent of the level of the talker's voice.
This means that a single TELR requirement applies to all talker levels.
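Because TELR accumulates the gains and losses around the echo path, it can be sketched as a simple sum of the attenuations encountered. The component breakdown below is an illustrative assumption, not the formal ITU definition:

```python
def telr_db(echo_path_losses_db) -> float:
    """TELR sketch: total attenuation of the echo accumulated around
    the echo path. The components passed in are illustrative (for
    example, send loudness rating, hybrid loss, echo-canceller
    attenuation, receive loudness rating)."""
    return sum(echo_path_losses_db)

# hypothetical path: 8 + 14 + 30 + 3 = 55 dB TELR; because it is a
# loss, the figure is independent of how loudly the talker speaks
print(telr_db([8, 14, 30, 3]))  # -> 55
```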
The color code in Figure 3-6 reflects the audibility and annoyance of echo.
There is no audible echo for TELR/delay combinations falling in the green
region. The contour between green and yellow shows the average threshold
of echo audibility, where echo is “just noticeable.” A given level of echo is
more easily detected at longer delays. Therefore, the “just noticeable” echo
is progressively quieter with increasing delay, that is, the TELR gets higher.
The yellow region represents combinations of TELR/delay for which the
echo is noticeable but is not loud enough to be annoying. TELR/delay
combinations where echo is loud enough to be irritating or annoying fall in
the red region. Limits on TELR are defined in terms of subjective
acceptability. Maintaining adequate TELR for the maximum expected
delay will ensure acceptable echo performance.
Figure 3-6: This graph shows the limit of audibility (green/yellow contour)
and the limit of acceptability (yellow/red contour) of echo as a function of
delay. The x-axis gives the one-way delay on the echo path, while the y-axis
gives the level of echo measured in dB TELR. (Higher TELR denotes quieter
echo.) Also shown are the positions of common types of telephone
connections, with an indication of the improvement associated with adding
echo cancellation to the call. These contours are taken from ITU Rec. G.131,
and are based on subjective ratings of echo in telephone conversations.
[A figure appears here: sources of echo in the access circuit, showing the
hybrid at the four-wire analog to four-wire digital conversion and
inductive coupling in the handset cord.]
Types of MOS
Mean Opinion Score began life as a subjective measure. Currently, it is
more often used to refer to one or another objective approximation of
subjective MOS. Although all “MOS” metrics are intended to quantify
QoE performance and they all look very similar (values between one and
five with one or two decimal places), the various metrics are not directly
comparable to one another. This can result in a fair amount of confusion,
since the particular metric used is almost never reported when “MOS”
values are cited. Appendix C provides more details on the distinction
between different types of MOS, and how to distinguish them. There are
fundamental differences between individual metrics, and numerical values
are not necessarily directly comparable just because they are both called
MOS.
Subjective MOS
Subjective MOS is a direct measure of user perception of voice quality (or
some other quality of interest), and is thus a direct measure of QoE.
Subjective MOS is the mean (average) of ratings assigned by subjects to a
specific test case using methods described in ITU-T P.800 and P.830.
Subjective MOS can be obtained from listening tests (where people rate the
quality of recorded samples) or conversation tests (where people rate the
quality of live conversations held over the system under test).
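Computationally, subjective MOS is just the arithmetic mean of the panel's ratings; reporting it with a confidence interval makes the reliability explicit. A sketch using an illustrative panel of ten P.800-style ACR ratings (1 to 5), rounded to one decimal place as is conventional for small panels:

```python
from math import sqrt
from statistics import mean, stdev

def subjective_mos(ratings):
    """Mean opinion score with an approximate 95% confidence
    half-width (normal approximation); MOS rounded to one decimal
    place, appropriate for a small number of independent ratings."""
    m = mean(ratings)
    ci = 1.96 * stdev(ratings) / sqrt(len(ratings))
    return round(m, 1), round(ci, 2)

ratings = [4, 4, 3, 5, 4, 3, 4, 5, 4, 4]  # illustrative panel data
print(subjective_mos(ratings))  # -> (4.0, 0.41)
```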
PESQ (P.862)
Subjective studies take significant time and effort to carry out. MOS
estimators such as PESQ6 (Perceptual Evaluation of Speech Quality) can
provide a quick, repeatable estimate of distortion in the signal. However,
the score does not reflect the conversational voice quality, since listening
level, delay, and echo are excluded from the computation. Separate
measures of these characteristics must be considered along with a PESQ
score to appreciate the overall performance of a channel.
PESQ is an intrusive test, which means that the tester must commandeer a
channel and put a test signal through it. To perform a test, one or more
speech samples are put through a device or channel, and the output (test
signal) is compared to the input (reference signal). The more similar the
two waveforms, the less distortion there is, and the better the assigned
score. The algorithm does some preprocessing to equalize the levels, time
4. MOS are sometimes quoted to many decimal places. The appropriate number of decimal places
depends on the reliability, which in turn is determined by the number of independent ratings that
contribute to the mean. Usually, one decimal place is appropriate. Two places may be justified if a
large number of ratings is averaged (more than about fifty).
5. “Context” refers to things like the order in which the test cases are presented in the experiment, the
range of quality between the worst and best test cases used in the experiment, and whether the
subjects are asked to do a task before making a rating. If an experiment is repeated exactly (with
different subjects), similar scores will be obtained within a known margin of error. This is not the
case from one experiment to another. Consistency from test to test is found in the pattern of scores,
not in the absolute value of the scores. For example, the MOS-LQS for G.711 may be 4.1 in one
study, 3.9 in another, and 4.3 in a third, but whatever the value obtained, we expect to obtain a
higher score for G.711 than G.729, and approximately equal scores for G.729 and G.726 (32 kb/s).
6. Many objective quality algorithms have been defined. Aside from PESQ, the best-known are PSQM
(Perceptual Speech Quality Measure), standardized as P.861, and PAMS (Perceptual Analysis
Measurement System), a proprietary method developed by BT. As the current standard, PESQ is
preferred to the older measures.
align the signals, and remove any time slips (where some time has been
inserted or deleted). PESQ then applies perceptual and cognitive models
that represent an average listener's auditory and judgment processes. A
diagram of the process is shown in Figure 3-7.
The raw PESQ score is usually converted to a MOS estimate using one of
several available conversion rules, for example, PESQ-LQ7.
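The ITU-T conversion (P.862.1) has a logistic form. The sketch below uses the coefficients as commonly published for that mapping; verify against the Recommendation before relying on exact values:

```python
from math import exp

def pesq_to_mos_lqo(raw_pesq: float) -> float:
    """Raw PESQ (P.862) score mapped to MOS-LQO via the logistic
    function of P.862.1 (coefficients as commonly published)."""
    return 0.999 + 4.0 / (1.0 + exp(-1.4945 * raw_pesq + 4.6607))

for raw in (1.0, 2.5, 4.5):
    print(raw, round(pesq_to_mos_lqo(raw), 2))
```

The mapping is monotonic and compresses the raw score into the familiar 1-to-5 MOS range, saturating at the extremes.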
[Figure 3-7 appears here: block diagram of the PESQ algorithm. The
original (reference) signal and the output of the system under test are
each processed by a perceptual model; their difference is then evaluated
by a cognitive model to produce the score.]
7. A conversion defined by Psytechnics, a company holding intellectual property rights for PESQ.
There is now an ITU-T standard conversion defined in P.862.1.
8. The E-Model can also be used to compute a “MOS” estimate. Note that MOS computed with the E-
Model is not comparable to MOS computed with PESQ.
[Figure 3-8 appears here: R plotted against one-way delay (0-500 ms) for a
14,057-km call. Marked points: PSTN reference, R = 78.7 at 125 ms delay;
G.711 > DCME > G.711 with 20-ms packets, R = 75.8 at 191 ms; G.711 > DCME >
G.711 with 40-ms packets, R = 71.0 at 231 ms.]
Figure 3-8: R vs. delay for a particular class of terrestrial international calls.
G.711 is used in the national links with a Digital Circuit Multiplexing
Equipment (DCME), which generally uses G.726 speech coding at 32 kb/s in
the undersea cable. Specific points on the curve show R for the benchmark
(PSTN reference, TDM end-to-end), and for each of two calls using IP in the
national portions of the call (20-ms and 40-ms packets, respectively). Bars
under the curve indicate the sources for the cumulative delay associated
with each call. Since only one coding scenario is considered (G.711> G.726
> G.711), the model generates only one contour. The model assumes best
practices for any factors not specified.
[Figure 3-9 appears here: R plotted against one-way delay (0-500 ms) for
G.711, G.726 (Ie = 7), and G.729 (Ie = 10). Points mark listening quality
at 0 ms and the minimum delay for a 20-ms payload.]
Figure 3-9: R vs. delay for G.711, G.726 (32 kb/s), and G.729 (8 kb/s). The
model assumes best practices for any factors not specified. Note that
although R is plotted for all delays, there will be a non–zero minimum delay
(yellow points) for interactive calls. For these points, propagation delay is
zero. This is the lowest delay for the modeled call scenario (the minimum
delay will depend on the codec as well as the packetization selected). In this
chart, we have assumed similar equipment delays beyond those associated
with the codec; however, in actual network situations, these can change as
well. The blue points represent the quality differences heard when listening
to recorded speech samples.
[Chart: R (50-100) vs. one-way delay (0-500 ms) for contours ranging from no audible echo down through TELR = 65, 60, 55, 50, and 45 dB.]
One-Way Delay (ms)
Figure 3-10: R vs. delay for various levels of echo. Note how R drops off
more quickly with smaller values of TELR. The increasing rate of
degradation for louder echo reflects the interaction of delay and echo
discussed above. The model assumes best practices for any factors not
specified.
Combining Factors
Loss Plan, Speech Compression, & Packet Loss
[Chart: R vs. one-way delay (0-500 ms) for successive contours: ISDN (all-digital) with digital loss plan; POTS > POTS with analog loss plan; POTS > G.726 > POTS (analog loss plan + waveform compression); POTS > G.729 > POTS (analog loss plan + speech compression); and POTS > G.729 > POTS with 3% packet loss.]
One-Way Delay (ms)
Figure 3-11: R vs. delay for multiple distortion factors, showing the effect of
successive addition of non–ideal factors: loss plan, compression coding,
and packet loss. Since delay does not exacerbate any of these factors, each
contour has the same relative shape as the one above. The model assumes
best practices for any factors not specified.
Summary
Some commonly used quality metrics were discussed: MOS, PESQ, and the
E-Model R. Subjective MOS is a direct measure of user perception of
quality, and may be gathered in lab or field studies. PESQ is a method of
estimating subjective MOS with an objective algorithm, and has been
standardized as P.862. The E-Model is a standard network planning tool
(ITU-T G.107) that generates another overall quality metric, R. R combines
fifteen objective measures, including listening level, delay, and encoding
distortion. While both PESQ and R can be translated to a MOS value, such
MOS should be considered indicative only; MOS values derived from these
different sources should not be compared with each other.
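As a concrete sketch of how R is combined and translated, the E-Model sums its impairment factors (R = Ro - Is - Id - Ie + A), and G.107 defines a standard mapping from R to an estimated MOS. The code below is a simplification: the real G.107 computation derives Ro, Is, and Id from many underlying parameters, and the Ie values (such as 7 for G.726 and 10 for G.729, per Figure 3-9) come from subjective testing.

```python
def e_model_r(ro=93.2, is_=0.0, id_=0.0, ie=0.0, a=0.0):
    """Combine E-Model factors: R = Ro - Is - Id - Ie + A.
    Defaults assume best practice (Ro ~ 93.2, no other impairments);
    the full G.107 model derives Ro, Is, and Id from many inputs."""
    return ro - is_ - id_ - ie + a

def r_to_mos(r: float) -> float:
    """G.107 mapping from R to an estimated MOS. As noted in the
    text, this MOS is not comparable to a PESQ-derived MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6
```

For example, adding only the G.729 equipment impairment gives R = 93.2 - 10 = 83.2, an estimated MOS of about 4.1.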
References
ITU-T Recommendation G.131, Talker Echo and its control, Geneva:
International Telecommunication Union Telecommunication
Standardization Sector (ITU-T), 1996.
ITU-T Recommendation G.711, Pulse code modulation (PCM) of voice
frequencies, Geneva: ITU-T, 1988.
ITU-T Recommendation G.726, 40, 32, 24, 16 kbit/s Adaptive Differential
Pulse Code Modulation (ADPCM), (includes Annex A: Extensions of
Recommendation G.726 for Use with Uniform-Quantized Input and
Output-General Aspects of Digital Transmission Systems), Geneva: ITU-T,
1990.
ITU-T Recommendation G.729, Coding of Speech at 8 kbit/s Using
Conjugate-Structure Algebraic-Code-Excited Linear-Prediction (CS-
ACELP), Rec. G.729, (includes Annex A, Reduced Complexity 8 kbit/s
CS-ACELP Speech Codec, and Annex B, Silence Compression Scheme for
G.729 Optimized for Terminals Conforming to Recommendation V.70),
Geneva: ITU-T, 1996.
ITU-T Recommendation P.800, Methods for subjective determination of
transmission quality, Geneva: ITU-T, 1996.
ITU-T Recommendation P.800.1, Mean Opinion Score (MOS) terminology,
Geneva: ITU-T, 2003.
ITU-T Recommendation P.830, Subjective performance assessment of
telephone-band and wideband digital codecs, Geneva: ITU-T, 1996.
ITU-T Recommendation P.861, Objective quality measurement of
telephone-band (300-3400 Hz) speech codecs, Geneva: ITU-T, 1998
(withdrawn).
ITU-T Recommendation P.862, Perceptual evaluation of speech quality
(PESQ): An objective method for end-to-end speech quality assessment of
narrowband telephone networks and speech codecs, Geneva: ITU-T, 2001.
9. TIA-810-A, TSB-116, and other VoIP standards are available free from the TIA at the following site:
http://www.tiaonline.org/standards/sfg/committee.cfm?comm=tr%2D41&name=User%20Premises
%20Telecommunications%20Requirements. Click the first link (TR-41 VoIP Standards) and answer
the questions to reach the page where these standards can be downloaded.
Chapter 4
Video Quality
Peter Chapman
[Transport path diagram: the codec viewed from a real-time application perspective, with audio, video, and voice carried over RTP (plus RTCP and RTSP), session and gateway control via SIP, H.323, H.248/MGCP, and NCS, and transport over QoS, MPLS, packet, cable (DOCSIS), ATM, FR, Ethernet, and SONET/TDM layers.]
Concepts covered
An overview of analog video systems
Impairments to analog systems
MPEG coding principles
A brief description of each of the common effects of coding
principles
Video
Video transmits images by sending the instantaneous brightness, color
hue, and color intensity of each picture element.
Video Impairments
Video impairments come from many sources, such as capturing, digitizing,
compressing, and distributing videos. Nortel is concerned primarily with
problems resulting from distribution or transmission. The analog standards
are designed to ensure that a picture is created even when there is
corruption of a broadcast signal. Because the eye and brain correlate
information from line to line (spatial redundancy), a picture can be
perceived even under extreme signal degradation, provided the structure of
the picture is maintained.
Impairments introduced by transmission include:
delay in transmission
artifacts introduced into the image
loss of some information
Luminance noise
Random noise manifests itself as random speckles on the picture. To
minimize the effect of noise, some versions of broadcast TV use negative
picture modulation, which means that the peak output of the transmitted RF
signal corresponds to a black signal and minimum amplitude corresponds
to a white signal. Negative picture modulation is effective against
impulsive interference, such as that generated by the motor vehicle
ignition systems it was designed to counter: such interference produces
random black spots in the picture, which are far less intrusive than white
spots. (Modern vehicle electrical systems have largely eliminated this
source of noise.)
Chrominance noise
The effect of noise on the chrominance channel is to change the hue of the
chrominance signal with no effect on the luminance. Due to the
uncorrelated nature of noise, this effect reduces the saturation and intensity
of the reproduced color, producing a washed-out effect. High-frequency
chrominance noise manifests itself as dots or specks of varying color, but
these are not well defined, due to the lower chrominance bandwidth. They
do not have well defined edges because there is no change in luminance at
the edges.
Loss of synchronization
Television receivers adjust the rate at which they create lines to be
exactly the same as the received line rate. They base this line rate on a
composite rate from a number of received
lines so that a single corrupted or missing synchronization pulse does not
cause loss of synchronization. Sustained loss of synchronization pulses
causes the receiver to revert to its default line generation rate, which is the
nominal rate specified in the standards.
Co-channel interference
Co-channel interference is possibly the most annoying of all the broadcast
video impairments. It manifests itself in one of two ways. If the
interfering signal comes from the same source as the primary signal but
travels a different propagation path, it arrives slightly delayed and
appears on the screen as a second image displaced horizontally from the
first. The delayed field/frame synchronization pulse appears as a darker
column on the left side of the screen, where the blacker-than-black
synchronization pulse is added to the primary signal. If the source of the
reflected signal is moving, as when the reflection comes from an aircraft,
the second image changes position on the screen as the path length varies.
If the co-channel interference is from a separate transmitter, it appears
as a second, different image on the screen. Because the two sources are
not perfectly synchronized, the two images move spatially in relation to
each other.
Electrical RF interference
RF interference that emanates from a source other than a video signal
manifests itself as diagonal lines or patterns superimposed on the picture. This type of
interference can be very annoying.
Digital video
Compression
Compression invariably uses one of the MPEG standards. MPEG does not
define the compression codec; instead, it defines the format of the
compressed information and suggests how it is reconverted to video. This
approach allows compression tools and techniques to be developed from
experience, and there is significant ongoing development in this area.
Encoding issues
MPEG provides a bit stream definition. To create this bit stream, encoders
use a toolbox of different techniques, which are proprietary and dependent
on the encoder used. Details of these techniques are beyond the scope of
this document. However, the video quality is dependent on the encoding
scheme and the tools used. There are wide variations in the perceived
quality of encoded video.
Decoding issues
B and P frames
Rather than send complete information for every frame, the MPEG standard
provides for two other frame types that carry only difference information
relative to the I frames. These frames, known as B and P frames, carry
considerably less information than I frames do. P frames are "predictive"
and carry only the difference between the previous frame and the current
frame. B frames are "bidirectional" and carry difference information based
on the immediate past and future frames. The MPEG decoder combines this
difference information with the most recent reference (an I frame, or a
frame reconstituted from an I frame and previous P or B frames) to predict
the next frame.
Packet loss in an I frame produces an error that is likely to propagate
until the next I frame. Typically, the error persists for ten or twelve
frames, though this is entirely at the discretion of the coding system.
Ten or twelve frames last about one third of a second, which is
noticeable.
Missing packets in received video streams manifest themselves as
displaced static blocks. A block of 8 x 8 pixels appears at the wrong
location in the picture frame and is static. This distortion is corrected when
the next complete sequence is received, normally an I frame.
These P and B frames use motion compensation to further reduce the
information being sent.
Motion detection works on a macro block, a 2 x 2 arrangement of four
blocks (16 x 16 pixels). The encoder determines the motion by
looking for a match within adjacent blocks on the subsequent frame. It then
sends a vector indicating direction and position of the matching macro
block. The first vector (left upper-most element) is sent as a complete
vector and this vector is followed by subsequent horizontal elements being
sent as differences from this first vector. This vector and group of
differences is known as a slice. The range over which the match is
attempted is determined by the encoder. Searching a wide area causes
significant encoding delay, so there is inevitably a trade-off between
encoding delay and compression efficiency. For non–real-time coding, for
example, when preparing a movie for DVD, the delay is not a problem, but
for live events it is.
Because of the resolution of the vectors and possible nonlinear motion of
the object, the resultant image is not necessarily accurate. Therefore, the
encoder calculates the new image from the calculated vector, and it then
compares this calculated image with the actual image. The encoder sends
the difference information, together with the vector, as part of the P
frame. Sending an approximate vector with the difference information is
more efficient than sending an accurate vector.
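The slice scheme just described (one complete vector, then differences from that first vector) can be sketched with the pair of helpers below; the function names are illustrative and not from any MPEG reference implementation.

```python
def encode_slice(vectors):
    """Encode a slice of motion vectors: the first vector is sent
    complete, the rest as differences from that first vector."""
    (x0, y0) = vectors[0]
    return [(x0, y0)] + [(x - x0, y - y0) for (x, y) in vectors[1:]]

def decode_slice(encoded):
    """Recover the original motion vectors from an encoded slice."""
    (x0, y0) = encoded[0]
    return [(x0, y0)] + [(dx + x0, dy + y0) for (dx, dy) in encoded[1:]]
```

Because neighboring macro blocks tend to move together, the differences are usually small and cheap to code.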
I frames are coded independently of each other, so they do not propagate
errors. Because B frames, when coded from future frames, need those frames
before the B frame itself can be decoded, the sequence of frames, usually
referred to as a Group of Pictures (GOP), is sent in other than purely
chronological order so that both the before and after images are available
to the decoder before the B frame is calculated. This requires a buffer
several frames long in the decoder. B frames can be calculated from the
previous frame, from the next frame, or by interpolation from both. When
both frames are used, MPEG supports simple linear interpolation but does
not support weighted interpolation to handle multiple B frames
interpolating between P and I frames.
Sequences of frames
I frames are much more critical than P frames or B frames; therefore,
loss of an I frame is much more serious than loss of a P or B frame.
A typical MPEG sequence is as follows:
IBBPBBPBBPBBI
The corresponding transmission (reordered) sequence is as follows:
IPBBPBBPBBIBB
Due to different error performance among DVD, broadcast, Digital Video
Broadcast (DVB), and Internet streaming, different sequences are suitable
for different applications. Material needs to be encoded for the medium
for which it is intended. This is a new area, and finding the best method
requires experience.
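The reordering can be sketched as a single pass over the display-order GOP: each B frame is held back until the I or P reference that follows it has been sent. This toy function assumes a closed GOP and ignores real MPEG stream syntax.

```python
def transmission_order(display_order: str) -> str:
    """Reorder a display-order GOP string (e.g. 'IBBP...') so every
    B frame is sent after the reference frame it depends on."""
    out, pending_b = [], []
    for frame in display_order:
        if frame in "IP":          # reference frame: send it first,
            out.append(frame)      # then the B frames that needed it
            out.extend(pending_b)
            pending_b = []
        else:                      # B frame: hold until next reference
            pending_b.append(frame)
    out.extend(pending_b)
    return "".join(out)
```

For the 13-frame display sequence IBBPBBPBBPBBI, this yields the transmission order IPBBPBBPBBIBB.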
Compression issues
The distortions resulting from compression are as follows:
blocking
blurring
shimmer
smearing
edge distortion
jerkiness
luminance noise
chrominance noise
contouring
Blocking
Blocking (Figure 4-2) is the effect of blocks of pixels appearing in the
wrong location on the image. Sometimes, blocks appear in the correct
position but in the wrong color because the color information has been
corrupted.
Smearing
Smearing typically occurs if the luminance does not significantly change at
an edge, but the color does. Because the chrominance resolution is less than
the luminance resolution, edges tend to blur. This effect is often seen on
colored titles superimposed on a colored background.
Edge distortion
Edge distortion (Figure 4-5) has a number of causes. A common cause is
the application of compression to an interlaced image: edge distortion, or
a comb effect, results when the two fields, each sampled at a different
time, are presented as a single frame. Horizontally moving objects appear
in different horizontal positions in the two fields. Another cause of edge
distortion is problems with the MPEG motion vectors.
Jerkiness
Jerkiness is non-smooth motion, and it has a number of causes. A change
of frame rate from 24 or 25 frames per second to 30, or vice versa, causes
non-smooth motion that is noticeable on the background during horizontal
panning. Other causes are problems with the motion-estimation vectors,
particularly with accelerating objects.
Luminance noise
Luminance noise appears as specks on the screen. It is caused by a noisy
input source. If possible, remove noise using analog techniques prior to
digitizing.
Chrominance noise
Chrominance noise appears as color spots with no obvious brightness
change. If possible, remove the source of the noise.
Contouring
Contouring (Figure 4-6) is an effect appearing as lines indicating
quantization level changes. It is seen on areas where there is a smooth
intensity gradient. It is a known problem in digitizing video and is normally
dealt with by adding a random signal equal to one half of a quantization
level, known as dither. Dither moves the quantization level slightly,
randomizing the level changes so that they do not appear as a line.
Audio/video synchronization
Audio/video synchronization needs to be ±50 ms or better. Much compressed
broadcast material does not achieve this synchronization, and the lack of
lip synchronization is noticeable. The most common cause is poor attention
to the delay introduced in the compression process for the audio and video
channels.
Chapter 5
Codecs for Voice and Other Real-Time
Applications
Leigh Thorpe
Peter Chapman
[Figure 5-1 shows the codec viewed from a real-time application perspective: audio, video, and voice carried over RTP (plus RTCP and RTSP), session and gateway control via SIP, H.323, H.248/MGCP, and NCS, and transport over QoS, MPLS, packet, cable (DOCSIS), ATM, FR, Ethernet, and SONET/TDM layers.]
Figure 5-1: Transport path diagram
Concepts Covered
Digitization of analog signals
General characteristics of codecs such as sampling rate, bit rate,
and compression ratio
Coding impairments such as baseline quality, encoding delay,
performance with missing packets, and transcoding
Speech codecs for telephony including G.711, G.726, G.729/
G.729A, G.723.1, GSM-EFR, and GSM-AMR
Introduction
Much of the content of human communications consists of analog signals.
To ride across digital networks, analog signals must be converted into
digital form. Codecs play a key role in formatting, compressing, and
translating digital data from analog signals. The characteristics of the codec
contribute significantly to the efficiency and the final quality of the
transmitted signal. In this chapter, we discuss codecs used in real-time
telecommunications applications and services, primarily speech codecs,
and more briefly, audio and video codecs. The discussion will introduce the
parameters that underlie the performance of a codec and will explore
effective use of compression codecs in real-time communications systems.
The field of digitization, encoding, and compression is a broad one, and
this chapter addresses only a very small portion of that field. We will not
cover the mechanics of compression or encoding for the purposes of
encryption. Readers unfamiliar with the digitization process should begin
with the following Sidebar, which describes basic analog-to-digital
conversion.
[Figure 5-2: An original analog signal and its digitized version, showing the quantization steps. Where the sampling rate is too low, the digitized signal cannot respond to high-frequency content.]
There are eleven quantization steps in Figure 5-2, and the step spacing is
linear, meaning that the amplitude represented by Step Five is five times
the amplitude represented by Step One. The dynamic range of sound
amplitude is very broad, and a large number of linear steps are needed
for high quality reproduction. Sixteen-bit linear PCM (used on CD and
other high quality audio) has 2^16 (65,536) quantization steps covering 96
dB of dynamic range. The dynamic range of a coding system refers to
the difference between highest (loudest) and lowest (quietest) signals
that can be represented.
If the appropriate filtering is used in the digitization and reconstruction
processes, the deviation between the actual amplitude and the integer
value assigned shows up as noise in the signal, called quantization noise
or sometimes quantization distortion. The number of quantization steps
and their spacing determine how much quantization noise is added to the
output. The more steps, the closer the quantized value will be to the
actual amplitude, and the lower the quantization noise. The more
amplitude steps used, the more bits needed, and the higher the bit-rate
needed to represent the signal in digital form.
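The relationship between step count, bit depth, and quantization noise can be illustrated with a toy uniform quantizer. This is a sketch of the arithmetic only, not a codec implementation; the 6 dB-per-bit figure is the standard rule of thumb for linear PCM.

```python
import math

def dynamic_range_db(bits: int) -> float:
    """Dynamic range of linear PCM with 2**bits quantization steps:
    20*log10(2**bits), or about 6.02 dB per bit."""
    return 20.0 * math.log10(2 ** bits)

def quantize(sample: float, bits: int) -> float:
    """Round a sample in [-1, 1) to the nearest of 2**bits uniform steps."""
    half_steps = 2 ** (bits - 1)
    return round(sample * half_steps) / half_steps
```

dynamic_range_db(16) is about 96.3 dB, matching the 96 dB quoted above for 16-bit linear PCM, and the quantization error at 16 bits is far smaller than at 4 bits.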
Types of Codecs
Codecs are designed to address particular signal domains (for example,
audio and video signals). Within each domain, codecs can be designed to
handle a broad range of signal types, or they can be tailored to optimize
performance on a particular type of signal. In the audio domain, there are
general audio codecs, others specifically for speech signals, and others
specifically for music. Codecs can also be classified by the format of the
output data. Basic audio codecs that directly represent the amplitude of the
analog signal (such as PCM codecs) are called waveform codecs. G.711
and G.726, two telephony codecs, are both waveform codecs. The bit
stream (the digital representation) of a waveform codec contains explicit
information about the amplitude and frequency of the sound signal. These
are sometimes called sample-based codecs. Other codecs work on a
principle that groups information together in bundles called frames. Frame
size for speech and audio codecs is usually given in terms of the duration of
signal contained in the frame, say, 10 or 20 ms. Before the codec can
encode the frame, it has to wait for the entire frame of signal to collect in a
buffer.
Video codec frames are based on the natural succession of images making
up the original signal, each frame corresponding to one image. A video
frame thus refers to either the data corresponding to the frame, or the image
itself. The frame rate used by a video codec is a key determinant of the
smoothness of motion depicted in the playback.
Some frame-based codecs also look at the signal following the current
frame, which means they wait a little more before they begin processing the
frame. This little extra bit of signal is called (not surprisingly) the look-
ahead. The look-ahead for speech codecs is around 5–10 ms. Look-ahead
in video codecs is usually the entire following frame; this will be 40 ms for
full-motion frame rate.
Codecs can also be classified by the type of algorithm they use. Common
types in this classification include PCM, ADPCM, (see Sidebar for a
discussion of both of these approaches), sub-band coding, in which the
signal is separated into multiple narrow frequency bands, each of which is
encoded separately, and Code Excited Linear Prediction (CELP). CELP
codecs are very common in voice telephony, and are described in greater
detail in the following section.
The sampling rate describes how often a digital value is defined from the
instantaneous value of the analog signal. The data rate or bit rate of a
codec refers to the number of bits per second that are needed to transfer the
signal from the encoder to the decoder. These terms are described in more
detail in the Sidebar above. Codecs used to transcode the signal from one
digital format to another generally inherit the sampling rate of the
original analog-to-digital (A/D) conversion, although downsampling
(changing to a lower sampling rate) is possible.
Some codecs operate at one data rate; others have different rates available.
Constant bit-rate (CBR) codecs carry on at the same rate no matter what
the signal (or even when there is no signal). Variable bit-rate (VBR) codecs
can adjust their rate as coding proceeds. A VBR codec can select a bit-rate
based on the content of the signal being encoded (voiced vs. unvoiced
speech, speech vs. silence, full video scene change vs. a static image), or
based on other factors such as the channel quality or capacity.
All codecs take a finite time to encode and decode a signal. This time is the
encoding delay. Waveform codecs are very fast (microseconds), while
frame-based codecs have a significant built-in delay that cannot be
reduced. This minimum delay is the algorithmic delay, which consists of
the frame length plus the look-ahead, if the codec uses one. The encoding
delay consists of the algorithmic delay plus additional time needed by a
finite speed processor to complete the processing. For practical reasons,
this delay is estimated as twice the frame size plus the look-ahead.
Compression
Linear encoding generates a lot of data, and transferring or storing it
uses a lot of capacity. To reduce the volume of data, and hence the capacity
needed to handle it, compression techniques are employed. Compression
can be lossless or lossy; lossless compression is called for where it is
necessary to restore the exact signal content. For voice and video
telecommunications, lossless compression does not sufficiently reduce the
data rate, so lossy methods are used. Some simple compression techniques,
such as logarithmic coding, restriction of dynamic range, and differential
coding, are described in the sidebar.
More complex techniques can compress the signal more efficiently. Many
coding algorithms include a modeling process that makes some
information or assumptions about the characteristics of the signal, the
channel, or human perception. Speech codecs model the acoustics of the
human speech production apparatus. Assuming a signal is speech limits the
number of different kinds of sounds that the codec needs to reproduce.
Speech codecs are designed to process speech signals, and they do this
well; on the other hand, they usually perform poorly with nonspeech
signals such as noise and music. Audio codecs are aimed at a broader range
of signals, and are less likely to put different types of sounds at a
disadvantage.
One of the most common compression methods is Code Excited Linear
Prediction (CELP). CELP codecs are frame-based. A CELP coding
algorithm deconstructs the signal into two parts: a spectral model
component and a residual component. The spectral model component is a
digital filter. The residual is the leftover part of the signal not accounted for
by the filter model. When the residual component of the signal is passed
through the filter, the segment of speech contained in the frame is
reproduced. The filter parameters are quantized, and the quantization
indices (step numbers) for each parameter are obtained. A table called the
codebook contains numbered entries corresponding to choices for the
residual. The encoder determines the codebook entry that best matches the
residual. These quantization and codebook indices make up the encoded
data that are sent to the decoder. Since the encoder and decoder use the
same codebook, the decoder can easily look up the codebook entry
corresponding to the residual. The quantization indices are used to
reconstruct the spectral model filter. The decoder then filters the residual
codebook entry to reproduce the signal segment for that frame.
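The codebook search at the heart of this scheme can be sketched as a nearest-match lookup. This toy version omits the spectral filter, gain terms, and perceptual weighting that a real CELP encoder applies.

```python
def best_codebook_index(residual, codebook):
    """Return the index of the codebook entry with the smallest
    squared error against the residual segment."""
    def sq_err(entry):
        return sum((r - e) ** 2 for r, e in zip(residual, entry))
    return min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))
```

Only the winning index is transmitted: the decoder holds the same codebook and simply looks the entry up.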
Coding distortion occurs because the codebook entry is not an exact match
for the actual residual. The subjective quality of a CELP codec depends on
both the size of the codebook, which determines the number of signal
segments that could be used to represent the residual, and how well the
coding distortion exploits the "blind spots" in human perception of
distortion (some types of distortion are less apparent or less irritating). The
size of the codebook affects the bit rate needed to operate the codec.
The efficiency of compression is quantified in the compression ratio, which
is the ratio of the data rate of the compression codec compared to either the
uncompressed signal or some standard digital process. The standard digital
process for telephony voice is G.711 at 64 kb/s, which is the rate used for
an individual channel (DS0) in a TDM network. Audio and video
compression is usually compared to the rate of the linearly encoded signal.
For audio, this is 16-bit linear PCM (used in CDs). For video there is no de
facto standard; the data rate of a linear signal will depend on the frame rate,
display size, and other characteristics of the original analog signal.
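Computed as described, with G.711's 64 kb/s as the telephony reference:

```python
def compression_ratio(codec_rate_kbps: float, reference_kbps: float = 64.0) -> float:
    """Compression ratio relative to a reference rate; for telephony
    voice the reference is G.711 at 64 kb/s (one DS0)."""
    return reference_kbps / codec_rate_kbps
```

G.726 at 32 kb/s gives a 2:1 ratio, and G.729 at 8 kb/s gives 8:1.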
Coding impairments
With the exception of delay, impairments associated with the digitization
and compression of audio and video signals are similar for real-time and
non–real-time operation. The performance of individual codecs is
determined by the amount of distortion they add to the target signal, as well
as how they behave with any unwanted signal components (such as noise),
how much the signal degrades when it is passed through the codec multiple
times, and how disruptive data loss is to the output signal.
Encoding distortion
As discussed above, digitization and compression always reduce the
information content of the signal. This can manifest itself as distortion
(change in the shape of the waveform), as increase in the noise floor
("hiss"), or as the addition of so-called coding artifacts. For PCM
codecs, the degradation depends on the sampling rate, quantization step
size, and whether any
compression techniques (such as logarithmic companding or differential
encoding) are applied. Frame-based codecs that aim for higher
compression will add more distortion. It is often assumed that the lower a
codec's bit rate, the more encoding distortion it will add. While bit rate has
some direct relationship with distortion (see description of the CELP
coding in the previous section), advances in coding technology have made
successive generations of low bit-rate codecs much better than previous
generations. Comparisons across different technologies (differential PCM
vs. CELP, say) do not follow such a simple relationship, either. (The reader
can inspect data rate vs. coding impairment for various codecs in
Table 5-2).
Coding distortion in waveform codecs does not depend on signal type.
CELP codecs, however, because they are generally tuned to a particular
type of input signal, distort different signals in different ways. CELP-based
speech codecs perform poorly with nonspeech signals, including
background noise, DTMF1 tones, and music. This means that CELP codecs
often require a DTMF work-around to detect, transfer, and reconstruct the
tone without putting the signal through the CELP encoder. Music on hold
will be significantly degraded by CELP compression. CELP codecs also
vary in their performance with different voices. Because of the way they
work, CELP codecs work best with lower-frequency voices, so they
generally reproduce men’s voices better than women’s and children’s
voices.
Given this dependence on input signal, evaluating encoding distortion of
CELP and other compression codecs can be a complex process. The most
reliable method is formal subjective testing. This is usually done with
listening tests, so that the codec’s performance on many different input
signals can be examined. Listeners in these tests rate the quality on a scale
of one (bad) to five (excellent), and the resulting average is known as Mean
1. Dual-tone multifrequency (DTMF) tones were invented to pass some signaling such as number dialled
over analog equipment. They are also used by network and proprietary features such as access to
voice mail, credit card number entry, and so on. Network transparency to DTMF is required for these
features to work.
Opinion Score, or MOS.
At least three different approaches to estimating encoding distortion are
used: (1) using a MOS from one test case (or a weighted average from
several) from formal subjective evaluation,2 (2) estimating MOS using an
objective quality estimation technique such as P.862 (PESQ*) (currently
available for speech codecs only), or (3) assigning a value indicating the
general extent of impairment, derived from a prescribed subjective test.
The first is an older strategy and works well if the codecs of interest were
directly compared in the same subjective study. The main limitation is that
no bench testing can be done using this method. The second approach can
make measurements on arbitrary systems or components. However, these
methods depend on complex algorithms that are used to obtain the
distortion estimates, and they are not guaranteed to map onto the values
that would have been obtained in a subjective test, since the models are
incomplete. The final approach is used in the ITU E-Model3, where
subjective test results are used to generate an Equipment Impairment (Ie)
value for each codec. The Ie value is then used in the modeling. Ie is cited
in Table 5-2 below as an indicator of the baseline coding quality of each
codec described. All these methods are discussed in more detail in
Chapter 3.
Encoding delay
While encoding by a waveform codec is virtually instantaneous, encoding
by a frame-based codec may introduce a significant delay. Before a frame
can be processed, a frame's worth of speech must collect in the buffer.
Where a look-ahead is used, the encoding window stays a fixed time ahead
of the current frame, and this time is added to the delay.
The encoding delay can be calculated as (frame size + look-ahead +
queuing delay + processing delay). The queuing delay is the time between
a complete frame of speech becoming available and when that frame is
submitted to the CPU for processing. Processing delay is the time taken for the
processor to execute the algorithm for that frame. Queuing and processing
delay depend on a particular implementation and processor. A conventional
formula for estimating encoding delay in the absence of a specific
implementation is (2 x frame size) + (look-ahead). This formula assumes a
2. The tendency of authors to cite “the” MOS associated with a particular codec is misguided. No specific
MOS can be assigned to a codec. Different input signals will return different scores, and a change to
the test cases or reference cases used can shift the scores. It is the pattern of results for input signal
types with one codec and for different codecs that is key to understanding a codec’s performance.
This is discussed further in “Chapter 3 Voice Quality”.
3. Additional details on the definition and operation of the ITU E-Model are covered in
“Chapter 3 Voice Quality”.
worst-case scenario that the sum of queuing delay and processing delay is
equal to the framesize, which is the maximum tolerable delay for real-time
operation. It is based on considerations of efficiency: a powerful processor
could encode each frame very quickly, resulting in a delay only a few
microseconds longer than the framing delay. However, the processor would
then sit idle until the next frame is ready. This would require a relatively
expensive DSP. Instead, systems are often designed around a processor that
is powerful enough to finish the processing of one frame just as the next
one is ready. Using optimal scheduling and some other techniques, system
delay can be reduced to (frame size + look-ahead + processing delay).
Decoding is typically a small fraction of encoding time for CELP codecs.
Packet Loss Concealment (PLC) may add a few milliseconds.
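The worst-case estimate above is simple arithmetic and can be sketched in a few lines. The figures in the comments are the standard frame sizes and look-aheads for G.729 and G.723.1; everything else is just the formula from the text.

```python
def encoding_delay_ms(frame_ms, lookahead_ms):
    """Conventional worst-case estimate: queuing plus processing
    delay is assumed to consume one full frame time, giving
    (2 x frame size) + look-ahead."""
    return 2 * frame_ms + lookahead_ms

# G.729: 10 ms frame, 5 ms look-ahead -> 25 ms
print(encoding_delay_ms(10, 5))
# G.723.1: 30 ms frame, 7.5 ms look-ahead -> 67.5 ms
print(encoding_delay_ms(30, 7.5))
```

With optimal scheduling, as noted above, the first term drops to a single frame time plus the actual processing delay.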
Because of the integration of functions in the DSP chip, it is not always
possible to assess the delay associated with individual steps in the
processing. Integration allows more parallel processing and thus provides
the opportunity to reduce the end-to-end delay; on the other hand, it makes
it more difficult to partial out the contributions of the different functions to
the end-to-end delay.
Silence Suppression
Silence suppression is a technique that conserves network capacity by
identifying the portions of a signal that contain active speech and
sending only those portions, discarding or suppressing the portions that
do not. If you're familiar with silence suppression, you've probably been
thinking of it as a VoIP feature, so you may be surprised to find it lumped
in with impairments.
Silence Suppression capitalizes on conversational turn-taking: partners
alternately talk, then listen. On average, each voice path in a two-party call
is active 40-60% of the time. This means that about half the time, a channel
carries no speech signal. To reduce the amount of data to be sent across the
network, only data that encodes actual speech is sent. Data associated with
silent intervals between utterances is discarded.
Silence Suppression employs a Voice Activity Detector (VAD; also called
Speech Activity Detector or SAD) to determine whether there is speech on
the channel, and a noise estimator that samples the background noise
and sends coefficients describing the noise to the decoder end of the call.
The detector is actually configured to detect the absence of speech rather
than its presence. Inverting the logic in this way provides a
kind of fail-safe: if the signal is ambiguous in some way, or the detector
fails, the decision outcome will be that speech is present. Therefore, speech
content will not be suppressed accidentally. Two VAD parameters are
important to Silence Suppression performance: (1) the detection threshold
and (2) the hang time (which determines the minimum time that data will
be sent once the algorithm has determined that a signal is present). In
addition, the network operator must decide the peak capacity of each link,
which will determine the probability that the channel becomes overfilled by
active speech data.
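The interaction of the detection threshold and the hang time can be sketched as a per-frame transmit decision. This is an illustrative toy, assuming a simple per-frame energy measure and expressing both parameters in frames; production VADs use much richer features.

```python
def suppress(frame_energies, threshold, hang_frames):
    """Per-frame transmit decisions: send a frame when its energy
    exceeds the detection threshold, and keep sending for
    `hang_frames` frames after the last detection (the hang time)."""
    decisions, hang = [], 0
    for energy in frame_energies:
        if energy >= threshold:
            hang = hang_frames        # speech detected: restart hang timer
        decisions.append(hang > 0)    # transmit while the timer runs
        hang = max(0, hang - 1)
    return decisions

# One loud frame followed by silence: the hang time keeps the
# channel open briefly so word endings are not clipped.
print(suppress([9, 1, 1, 1, 1], threshold=5, hang_frames=3))
```

Raising the threshold or shortening the hang time detects more silence but risks the clipping impairments described below.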
At the decoder end, the speech bursts are played as sent, but the silences
between them are filled with a synthesized background noise called
comfort noise. The level and spectrum of the comfort noise are determined
from the coefficients received from the encoder end. This works best for
stationary and quasi-stationary noise, like car interior noise or crowd
babble. Dynamic noise, such as street noise, is more difficult to match.
Differences in level between the actual noise that arrives mixed with the
speech and the comfort noise generated at the decoder create an audible
contrast. This makes it obvious that something is interfering with the signal
or is being turned on and off.
4. Some vendors of enhanced methods claim to be able to repair speech with up to 30% packet loss.
These techniques invariably compare a PLC algorithm combined with an adaptive jitter buffer to
the performance of a system using an ordinary (or no) PLC and a fixed, moderately sized jitter
buffer. This algorithm does not repair 30% packet loss. Instead, it prevents loss associated with
late packets arriving at the jitter buffer, only to be discarded because they are too late to play out.
While this greatly improves the sound reproduction, it does so by adding delay, at least temporarily.
Should the network suffer high packet loss from drops or discards in the core, repair by these
algorithms will not be much better than standard PLCs.
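The comfort-noise fill described above can be sketched as white noise scaled to the level reported by the encoder's noise estimator. This is deliberately minimal: a real decoder also shapes the spectrum from the received coefficients, which this stand-in omits.

```python
import random

def comfort_noise(n_samples, rms_level):
    """Synthesize fill noise at the level reported by the encoder's
    noise estimator. White Gaussian noise is a stand-in here; real
    systems also match the spectrum, not just the level."""
    return [random.gauss(0.0, rms_level) for _ in range(n_samples)]

fill = comfort_noise(160, 0.01)   # one 20 ms frame at 8 kHz sampling
```

A level mismatch between `rms_level` and the true background produces exactly the audible contrast the text describes.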
The use of silence suppression can cause several impairments:
• front-end clipping, where the beginnings of utterances are removed
• background noise contrast, where noise used to fill the silent periods is noticeably different from the background noise audible during speech
• noise pumping, where peaks in the background noise trip the detector and background noise is transmitted momentarily (for the duration of the algorithm's hang time)
• data loss, caused when the total volume of active speech exceeds the capacity of the link it is carried over and some data must be discarded
The first three impairments result from detection errors, while the last
results from a provisioning trade-off between statistical fluctuation in
speech activity and the number of channels carried over the link. Design
parameters (threshold and time constants) determine the amount of silence
detected as well as the accuracy of the decisions. In general, aggressive
settings detect more silence, catching both long and short silent intervals,
even brief pauses within one talker's speech. Conservative settings, on the
other hand, remove only long periods of silence. At the same time, the
aggressive settings create more opportunity for errors. In quiet, speech is
easily differentiated from non-speech, so the parameter settings are not
critical for this case. Elevated background noise, however, can exceed the
threshold, preventing the detection of silence. Tuning the threshold and time
constants can prevent such errors, but this can lead to the inverse, where
lower-level speech is mistaken for silence. These errors can cause audible
artifacts in the output: the front ends of words may be clipped off or quieter
sections chopped out.
The total speech load to be carried across the network will depend on the
number of channels with active speech at any one time. Within aggregated
flows, silence suppression reduces the peak bandwidth needed to carry
voice traffic. Where the number of talkers is high (as in the network core),
the distribution of peak talker data rate will be based on the statistics of
large numbers, and the actual peak data rate will rarely exceed the capacity
of the link.
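The statistics-of-large-numbers argument can be made concrete with a binomial model: each of N one-way channels is independently active with some probability (about 0.4 per direction, per the activity figures above), and overload occurs when more channels are active than the link is provisioned to carry. This is an illustrative sketch, not an engineering rule.

```python
from math import comb

def p_overload(n_channels, capacity, activity=0.4):
    """Probability that more than `capacity` of `n_channels`
    independent one-way voice paths are active simultaneously
    (binomial model with the given per-channel activity)."""
    return sum(comb(n_channels, k) * activity**k * (1 - activity)**(n_channels - k)
               for k in range(capacity + 1, n_channels + 1))

# Provisioning 60% of full capacity is far riskier for a few
# channels than for a large aggregate:
print(p_overload(10, 6))    # small group
print(p_overload(100, 60))  # core-like aggregation
```

The overload probability falls sharply as the aggregate grows, which is why silence suppression pays off most in the network core.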
Transcoding
Transcoding refers to the successive encoding of a digital signal by
different codecs. Transcoding can be problematic for voice quality because
of cumulative degradation to the final output speech. Transcoding may
increase both signal distortion and delay. It is important to understand
when transcoding adds impairment and where it does not.
Generally, transcoding occurs because some networks use compression
coding to save bandwidth, and the compression codecs they choose are
different. For example, many VoIP networks use G.729, an 8 kb/s
telephony standard codec, to conserve bandwidth; digital wireless networks
use low bit-rate codecs over their radio channels. In addition, some network
features such as conferencing and voice mail may also add transcoding.
A special case of transcoding, called tandeming, occurs when a signal is
encoded, decoded, and reencoded by the same codec. The impairment is
similar to that for transcoding. TIA TSB-116 offers a good discussion of
the effects of transcoding.
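In E-Model terms (see Chapter 3), the distortion added by successive encodings is treated as approximately additive: each codec in the chain contributes its Ie value, and the sum is subtracted from the base rating. A rough sketch follows; the Ie figures in the comments are illustrative, not normative, and per footnote 5 the additive estimate ignores ordering effects.

```python
def r_after_transcoding(ie_values, r_base=93.2):
    """Approximate E-Model rating after a chain of encodings:
    equipment impairment (Ie) values simply add, and the sum is
    subtracted from the base rating (a rough, additive estimate)."""
    return r_base - sum(ie_values)

# A single low bit-rate encoding vs. the same codec tandemed
# through, e.g., a conference bridge (Ie = 10 is illustrative):
print(r_after_transcoding([10]))      # 83.2
print(r_after_transcoding([10, 10]))  # 73.2
```

The estimate becomes less accurate as more transcodings are combined, but it captures why each additional low bit-rate encoding is costly.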
Transcoder-free operation6
To maintain voice quality performance, we want to limit the number of
encodings by CELP and other low-bit-rate codecs to one. Packet networks
offer a unique opportunity to do this; speech that is already compressed can
be transported over IP without transcoding to an intermediate form.
Transcoder-free operation is a feature that ensures that speech signals can
be carried through the network without encountering more than one low-
bit-rate encoding.
5. Since the E-Model is an additive model, this estimate is approximate. It does not account for differences
in order of transcoding and becomes less accurate as more transcodings are combined.
6. Here, the term transcoder-free operation (TrFO) is used generically. In wireless technology standards,
TrFO refers to a specific signaling system to set up a clear TDM channel between the endpoints that
allows the encoded speech to be sent as data.
7. Where the bridge sits in a legacy network, and only some lines are calling from a network using packet
technology and/or speech compression, the situation is more complex. If there is only one caller from
such a network, there will be no additional impairments over the TDM, since that signal must be
unpacketized/uncompressed at the network interface in any case. Where there is more than one line
with packet or compressed speech, the analysis applies to signals they hear from the other talkers on
such networks, regardless of whether the conversions take place at the bridge or at an intermediate
interface.
designers are introducing hybrid algorithms that mix at some times but not
at others, depending on whether there is significant activity on more than
one line. The success of these new strategies is not yet determined, but it is
clear that the traditional conference bridge will not deliver the quality users
expect when combined with VoIP technology.
Impairment to conference calls over VoIP can be minimized by using
G.711 or G.726-32 as the VoIP codec. Delay will increase, but the increase
will be less than for compression codecs, and there will be no transcoding
impairment.
G.711
• The workhorse of the PSTN for digital trunking and switching
• The best quality conventional-band codec
• Handles nonspeech as well as speech
• Two coding laws are defined for G.711: A-law and µ-law. The voice quality produced by these two coding laws is very similar; transcoding from one to the other adds distortion equivalent to somewhat less than one unit of R or Ie.
• The two coding laws do not interwork (a signal encoded by one cannot be decoded by the other), but the data can be translated using a simple lookup table.
• A-law coding is used in most of the world and on international connections; µ-law is used in North America.
• Requires external packet loss concealment and silence suppression, which are easily added.
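As an aside on the companding itself, µ-law encoding of a 16-bit linear sample can be sketched with the familiar segment-and-mantissa construction. This is a software sketch in the style of the classic reference code, not the normative G.711 tables.

```python
BIAS, CLIP = 0x84, 32635

def mulaw_encode(sample):
    """Encode one 16-bit linear PCM sample as an 8-bit mu-law byte:
    clip, add bias, find the segment (exponent), extract a 4-bit
    mantissa, then invert all bits as mu-law requires."""
    sign = 0x80 if sample < 0 else 0x00
    sample = min(abs(sample), CLIP) + BIAS
    exponent, mask = 7, 0x4000
    while exponent > 0 and not (sample & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (sample >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF

print(hex(mulaw_encode(0)))       # 0xff (silence)
print(hex(mulaw_encode(32767)))   # 0x80 (positive full scale)
```

A-law uses a slightly different segment layout and bit inversion, which is why the two laws do not interwork and must be translated by table.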
G.726
• The 32 kb/s rate is commonly used for compression in TDM networks, private networks, undersea cables, and satellite links.
• Specified for low-power (in-building) digital wireless systems, such as CT2 and DECT
• Common in ATM and FR environments, but not found in many VoIP systems yet
• Sounds slightly raspier on active speech than G.711. The noise floor is slightly higher, but is acceptable for both public and private networks.
• Quality of the lower rates (24 and 16 kb/s) is not generally acceptable for commercial telecommunications, although these rates are sometimes used in private networks.
• The Synchronous Coding Adjustment (SCA) is used to avoid cumulative quantization distortion from multiple conversions between G.711 and G.726.
• More sensitive to data loss than G.711, because the decoder can lose its adaptive reference and takes a finite time to reconverge.
• Requires external packet loss concealment and silence suppression, which are easily added.
G.729, G.729A8:
• G.729A (that is, Annex A) is a reduced-complexity version of G.729
8. When reference is made to G.729, it is almost always the 8 kb/s rate that is intended. However, other
rates are defined in G.729 Annexes. G.729A is a reduced complexity version of the 8 kb/s codec
defined in the main body of the standard. In this book, references to G.729 without qualification may
be taken to mean the 8 kb/s algorithms, either G.729, G.729A, or both.
G.723.1:
• Developed for use in video teleconferencing
• Early de facto standard for “shrink-wrap” VoIP applications
• Slightly poorer baseline quality than G.729, plus relatively long delay
• Built-in packet loss concealment
• Runs at two rates, 6.3 kb/s and 5.3 kb/s
GSM-EFR:
• The GSM EFR (Enhanced Full-Rate) wireless speech coding standard
• Baseline quality with speech signals is essentially equivalent to G.711
• Tandeming degradation is less than for the earlier compression codecs
• Built-in silence suppression feature, called DTX (discontinuous transmission)
AMR:
• Adaptive Multi-Rate (AMR) codec for GSM wireless
• Developed to optimize quality over wireless channels: the coding rate adapts to current channel conditions. The bit rate of the speech codec is reduced in the face of data loss, and the bits freed up are transferred to the error protection function.
• Has eight rate modes available (half-rate operation uses the lower four modes).
• The top rate mode is identical to the GSM-EFR codec; the half-rate mode is equivalent to IS-641.
iLBC, BV16:
• Selected as low bit-rate codecs for the CableLabs PacketCable standard.
Voice performance
For maximizing voice performance with a codec offering low distortion
and low delay, G.711 is the natural choice. G.711 will provide the best
intranetwork voice quality, and will optimize interworking with other
networks. Conferencing and voice mail performance will be similar to that
with TDM. Running G.711 with 10 ms packets will offer the best end-to-
end delay, but where capacity considerations prevent that, G.711 with
20-ms packets is a good alternative. Using G.711 with careful network
provisioning, it is possible to shift from TDM to VoIP without users being
aware of any change in the infrastructure.
Note that voice mail can suffer more from coding distortion than live
conversation, because the listener does not get a chance to ask for repetition
of any unintelligible parts. This can be a significant problem if the voice
mail system has its own compression codec, in which case the signal may
be transcoded during storage and again during playback.
Sometimes G.711 cannot be used, for instance, with low-speed links from
remote sites, when teleworkers or road warriors (salesmen or other users
who are often out of the office) dial in, or where the LAN has insufficient
margin for G.711 operation. G.726-32 is the next best codec, but many
VoIP gateways have not yet implemented G.726.
Capacity
Where bandwidth is the top priority, there is a strong push to adopt the
lowest bit-rate codec. Where a codec is chosen based on bandwidth, make
sure that the quality is acceptable to your users on all their common calling
scenarios. Often when justifying a codec choice, the codec quality
considerations are limited to two-party calls over the immediate network.
Encoding delay, the kinds and quality impact of any transcoding, and
the quality of long-distance calls or calls to other networks, such as
wireless, are rarely considered. Only after the network is up and running do user
complaints focus attention on performance shortcomings.
When selecting a low bit-rate codec, remember that the bandwidth
efficiency obtained will be less than the compression ratio as determined by
the bit rate alone. VoIP packets are small, meaning that the header accounts
for a significant portion of the bits. For example, G.729 has a compression
ratio of 8:1 (compared to G.711), but the bandwidth efficiency of G.729
packets (with one frame per packet) is approximately 4:1. For networks
running with low proportions of voice traffic compared to data traffic, the
savings in terms of percentage of overall capacity may not be worth the
cost in terms of voice performance. This trade-off must be examined
independently for each network.
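The header effect can be checked with back-of-envelope arithmetic. The sketch below assumes uncompressed 40-byte IP/UDP/RTP headers and ignores layer-2 framing; both assumptions matter, since header compression and link overhead shift the ratios.

```python
HEADER_BYTES = 40  # IP + UDP + RTP, uncompressed (assumption)

def voip_stream_kbps(codec_kbps, frame_ms, frames_per_packet=1):
    """Total one-way IP bandwidth for a voice stream, counting the
    packet header on every packet."""
    payload_bytes = codec_kbps * frame_ms * frames_per_packet / 8
    packets_per_sec = 1000 / (frame_ms * frames_per_packet)
    return (payload_bytes + HEADER_BYTES) * 8 * packets_per_sec / 1000

print(voip_stream_kbps(8, 10))    # G.729, one 10 ms frame per packet
print(voip_stream_kbps(64, 20))   # G.711 with 20 ms packets
```

Under these assumptions the G.729 stream costs 40 kb/s against 80 kb/s for G.711 with 20-ms packets, so the 8:1 codec ratio shrinks to roughly 2:1 at the IP layer; packing more frames per packet recovers efficiency at the cost of delay.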
Where a low bit-rate codec is used, it may be advisable to specify a higher
rate for certain call scenarios that are particularly vulnerable to transcoding
degradation. Equipment features and the network architecture will
determine whether it is possible to implement contingent selection of
codec. Three-way and n-way conferencing and calls to or from cellular/
wireless networks are two situations that show unavoidable degradation
with additional low bit-rate coding.
Bandwidth calculators are useful in understanding the capacity
implications of various network provisioning choices.
Delay
Networks that will carry calls from cellular/wireless access, international
calls, or private networks with global reach must pay close attention to
delay. Your choice of codec and packetization can increase delay across the
network. Increasing the speech payload of the packets will improve
any clue that something is wrong. It may be possible to repair data losses
too well!
Restrictions on transcoding
For equivalent-to-TDM wireline access, there can be no transcoding to
frame-based compression codecs. For equivalent-to-2G wireless mobile-to-
land (2G being current digital cellular operating with TDM backhaul), only
one frame-based encoding can be tolerated.
Audio Codecs
The term audio codec can refer to all codecs intended to digitize sound, but
it often refers specifically to codecs intended to handle all signal types, or
especially nonspeech signals such as music. In general, the operation of an
audio codec is similar to that of a speech codec. Audio codecs are likely to
model the human auditory system, whereas speech codecs often model the
human vocal tract.
Audio codecs are frequently used in IP applications and computer-based
audio applications. Audio streaming and exchange of compressed music
files are two common ones. Such applications are less likely than speech to
be real time, although some components of multimedia applications such
as games may include audio signals. Commonly used audio codecs are
summarized below.
Some general purpose audio codecs combine two different algorithms, one
for speech and one for nonspeech. This arrangement is essentially two
codecs, each operating on the portions of the signal for which it is best
equipped. A detection scheme is used to determine which algorithm is
appropriate for any particular signal. This technique allows the codec to
obtain better quality for a given compression (or higher compression for a
given quality) than might be achieved with a single coding algorithm, since
the use of specialized algorithms allows the codec to make simplifying
assumptions about the characteristics of the content.
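The two-algorithm arrangement can be sketched as a simple dispatcher. The classifier and per-type encoders here are hypothetical placeholders standing in for the detection scheme and the specialized coding algorithms.

```python
def make_hybrid_encoder(classify, encoders):
    """Build an encoder that routes each signal segment to the
    specialized algorithm chosen by the detection scheme."""
    def encode(segment):
        kind = classify(segment)          # e.g. "speech" or "music"
        return encoders[kind](segment)
    return encode

# Toy stand-ins: the segment carries its own class, and each
# "encoder" just tags the output with the algorithm used.
encode = make_hybrid_encoder(
    classify=lambda seg: seg["kind"],
    encoders={"speech": lambda s: ("celp", s["data"]),
              "music":  lambda s: ("transform", s["data"])},
)
print(encode({"kind": "music", "data": [0.1, 0.2]}))
```

The real engineering difficulty lies in the classifier: a misclassified segment is compressed with the wrong set of simplifying assumptions.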
Audio codecs offer multiple data rates. For some codecs, the higher rates
offer lossless compression. Recall that lossy compression is generally used
for real-time applications such as telephony. Lossless compression allows
complete recovery of the original; thus, there is no coding distortion.
However, there are limits to the amount of compression that can be
achieved with lossless techniques, and further compression must be lossy.
In the realm of streaming (and stored) audio, it is important to differentiate
between codecs, file formats, and players. Codecs convert the audio signal
from one form to another. The file format is a defined structure that
provides information needed by a player to parse the incoming data. (An
example of a format is .wav.) The format includes such information as the
codec and data rate used for compression. A player is a software device
used to play back the audio signal, and contains one or more decoders
associated with different codecs. The player is equipped to read the format
information and select the right decoder and any settings to restore the
analog output.
Some commonly used audio codecs are royalty-free standards. Others offer
a licence-free decoder, but licence the encoder to content providers. Some
audio codecs commonly used for streaming and music file exchange are
described below.
MP3
The MP3 codec is the de facto standard for music files stored and played in
the computer environment. It is closely associated with the Internet because
of music file sharing and the associated copyright disputes. It was
developed as an audio codec for digital video, which is hidden in its name:
MP3 stands for Motion Picture Experts Group (MPEG) 1, Layer 3.
The MP3 codec is used to compress audio files to reduce the space needed
to store them. MP3 compresses to several different data rates, which trade
off fidelity for file size. Because MP3 is used for listening only, encoding
delay is not an important factor.
MPEG-4 AAC
This new audio codec is part of the latest MPEG video coding standard. It
has been predicted that MPEG-4 AAC will displace MP3 as the de facto
music file standard because of its improved quality and features.
Ogg Vorbis
Ogg Vorbis is a creation of the Xiph.Org Foundation, a nonprofit developer
of tools for the Internet. It consists of two separate tools: the Vorbis codec,
which is a free-form variable bit-rate codec, and the Ogg transport
mechanism, which supplies free-form framing, sync, positioning and error
correction. Both the Vorbis codec and the Ogg transport mechanism are
available royalty-free. The Vorbis codec can be used with RTP, rather than
Ogg, as the transport protocol.
RealAudio
RealAudio* is a proprietary coding system that offers free use of its
decoder, in the form of RealPlayer*. RealPlayer is used extensively for
audio streaming and audio clips offered over the Internet. RealAudio (and
the associated RealPlayer) includes a number of decoders ranging from
very low to very high bit rates. RealAudio 10, the release current at this
writing, uses data rates ranging from 12 to 800 kb/s (recall that monaural
linear PCM requires 705 kb/s). The codecs sport various rates, frequency
bands, and options that optimize for specific audio content and application
(for example, speech/music, mono/stereo, Dolby*). RealAudio 10 includes
the MPEG-4 AAC codec as one of its operating modes.
RealAudio includes a packet loss concealment feature. In addition, the
Surestream feature can dynamically adapt to changes in the available bit
rate to maintain a continuous playout even where the access channel may
be intermittently shared.
Wave
Wave is an audio data file format, not a codec. The Wave format is a
version of an Interchange File Format (IFF, a standard established for all
kinds of data, from sound files to pictures to musical scores). Wave files are
designated as .wav. They include information about the type of data in the
file, how the data is encoded, the length of the file, and so on. The format
also specifies how the data is structured (chunked) inside the file, so that
players that read the file know what setup to use and how to parse the data.
Video Codecs
Analogous to speech and audio codecs, video codecs convert standard
analog video signals into digital Pulse Code Modulated (PCM) signals or
compress them. Because of the much greater information content of a
video signal, compression is even more important for video than for audio.
The nature of the analog video signal, the way video information is used
and transmitted, and the characteristics of human visual perception all
place special requirements on video codecs. Some background on the
analog video signal will assist in understanding the general digitization
process. The sidebar below describes how analog video signals are
generated and the formats used to transfer the signals to local and remote
receivers.
is lossy, but with careful design of the compression technique, these losses
will not be detrimental to the perceived image.
Processing is done to identify redundancies between adjacent frames. For
example, a static background need not be updated until there is a change.
A moving object either remains approximately static on the screen while
being tracked by the camera, in which case the background will move, or
the object moves across the display and the background remains static.
Compression algorithms also look for a number of other common types of
movement to facilitate compression. Such analyses identify redundant
information that can be removed, and allow a high proportion of the
remaining information to be coded in differential terms.
Video codecs use either eight or ten-bit encoding. Composite video is
normally eight-bit encoded. Higher definition signals are sometimes ten-bit
encoded, in which case the most significant eight bits are regarded as the
integer part and the least significant two bits are regarded as fractional.
This allows decoding equipment designed to handle eight-bit streams to
handle the ten-bit words by simply truncating them to eight bits.
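The truncation rule amounts to a two-bit right shift, keeping the integer part and discarding the fraction:

```python
def ten_bit_to_eight_bit(word10):
    """Drop the two fractional (least significant) bits of a
    ten-bit video sample, keeping the eight-bit integer part."""
    return word10 >> 2

print(ten_bit_to_eight_bit(0b1011001110))  # -> 0b10110011 (179)
```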
Synchronization of sound and video applies to both analog and digital
signals. However, it is perhaps more of a problem in digitized video
because of differences in encoding delay for speech/audio codecs and
video codecs. Audio and video signals may be separated, which is
beneficial where the identity of the far end receiver is not known, because it
may be possible to capture and play the audio component even where there
is no video receiver and display. However, separate streams for audio and
video must be synchronized for playback within certain tolerances;
otherwise, the quality of the playback is reduced. For full motion video, the
audio signal should not lead the video image by more than 20 ms, nor trail
it by more than 40 ms.
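Expressed as a check on audio-video skew (negative values meaning audio leads, per the tolerances just quoted):

```python
def lip_sync_ok(audio_minus_video_ms):
    """True if the skew is within tolerance for full-motion video:
    audio may lead by at most 20 ms or trail by at most 40 ms."""
    return -20 <= audio_minus_video_ms <= 40

print(lip_sync_ok(-15), lip_sync_ok(30), lip_sync_ok(-25))
```

Note the asymmetry: viewers tolerate audio that arrives late (as it does naturally over distance) better than audio that arrives early.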
file which decoder to use. There are three players in common use today:
Apple’s* QuickTime*, Windows Media* Player, and RealPlayer*. These
are described below.
Players take streamed information (information being presented in real
time) or information in a recorded file and convert it back to a form suitable
for presentation to a driver and hence to a display device. These players can
accept information in many common file and stream formats. File and
stream formats contain not only the information to be presented, but also
information telling the player how to decode the information in the stream.
Among other things, the information indicates which decoder and data rate
to use. Typically, files consist of frames (file frames, not to be confused
with video frames) with a file header containing the control information
followed by the video data. Most players can begin decoding at the start of
any frame and can ignore any partial frame data in the bit stream prior to
the next start-of-frame.
Even when they only provide the decoding function, video decoders are
usually referred to as codecs. The need for a decoder assumes the prior use
of an encoder. Decoders are sometimes referred to by the coding standards
to which they refer. For example, H.261 and H.263 are video codecs, as is
MPEG. Strictly, MPEG is both a file format and a codec. The MPEG
standard includes enough detail to define a file or stream that an MPEG
decoder can play. H.261 has been largely superseded by H.263. MPEG is
included here in the codec section, but is more strictly a standard defining
the data structure that a codec will act upon.
Video Codecs
Sorenson. Sorenson is a proprietary codec from Sorenson Media*. Apple
uses it in their QuickTime player. It has a number of encoding features,
including bidirectional (B-frame) encoding. In addition, it can drop
B frames when short-term restrictions on data rate require it, which
makes it very tolerant of variations in data
rate. It also automatically determines video frames where new scenes
begin, based on how much video changes between adjacent frames, and
flags these as key frames, which are then used to begin a sequence of
decoding.
Cinepak*. Cinepak was an early codec that was rapidly established as a
standard. Cinepak is an asymmetric codec: compression is slow and cannot
accept video input in real time, but decoding runs in real time. This
ensures a smooth playout of streamed material.
Cinepak is included with QuickTime. Like Sorenson, it identifies key
frames based on the difference between adjacent video frames, and flags
these to indicate the beginning of a decoding sequence.
H.263. H.263 was designed for and is primarily used for video
conferencing and is a symmetrical real-time codec. It is limited to displays
Players
QuickTime. QuickTime is a comprehensive system from Apple and is
among the most popular coding and decoding systems available today. It is
a comprehensive system that handles video, still images, music, and speech
References
General References:
ITU-T Recommendation P.862, Perceptual evaluation of speech quality
(PESQ): An objective method for end-to-end speech quality assessment of
narrow-band telephone networks and speech codecs, Geneva: International
Telecommunication Union Telecommunication Standardization Sector
(ITU-T), 2001.
ITU-T Recommendation G.107, The E-Model, a computational model for
use in transmission planning, Geneva: ITU-T, 1998.
Codec Standards:
733-A, ANSI/TIA-733-A-2004, High Rate Speech Service Option 17 for
Wideband Spread Spectrum Communications Systems,
Telecommunications Industry Association, 2004.
BV16, J. Chen et al., BroadVoice™16 Speech Codec Specification, Version
1.2. October, 2003. (For further information, contact PacketCable, Cable
Television Laboratories, Inc.)
EVRC, ANSI/TIA-127-A-2004, Enhanced Variable Rate Codec Speech
Option 3 for Wideband Spread Spectrum Digital Systems,
Telecommunications Industry Association, 2004.
ITU-T Recommendation G.711, Pulse code modulation (PCM) of voice
frequencies, Geneva: ITU-T, 1988.
ITU-T Recommendation G.722, 7 kHz Audio-Coding within 64 kbit/s,
Geneva: ITU-T, 1989.
ITU-T Recommendation G.722.1, 7kHz Audio - Coding at 24 and 32 kb/s
for hands-free operation in systems with low frame loss, Geneva: ITU-T,
1999.
ITU-T Recommendation G.723.1, Dual-rate speech coder for multimedia
communications, (includes Annex A: Silence Suppression, and Annex C:
Channel Coding Scheme for use in wireless applications.) Geneva: ITU-T,
1996.
ITU-T Recommendation G.726, 40, 32, 24, 16 kbit/s Adaptive Differential
Pulse Code Modulation (ADPCM), (includes Annex A: Extensions of
Recommendation G.726 for Use with Uniform-Quantized Input and
Output-General Aspects of Digital Transmission Systems.), Geneva: ITU-
T, 1990.
ITU-T Recommendation G.728, Coding of Speech at 16 kbit/s Using Low-
Delay Code Excited Linear Prediction, Geneva: ITU-T, 1992.
Section II:
Legacy Networks
Legacy networks rely largely on Time Division Multiplexing (TDM) and
Synchronous Optical NETwork (SONET) technologies. Time Division
Multiplexing networks were designed and built specifically for one real-
time application, namely Voice. TDM networks now carry data as well as
voice, but make no distinction between real time and non–real time
applications because all data are treated as real-time. TDM is exceptionally
good at delivering real-time service, and Chapter 6 reviews how it achieves
such high performance.
TDM uses time slots to combine individual calls together on faster links in
the network core. Optical networking provides very fast links, and the
SONET standard was developed to facilitate interworking between optical
networks. Synchronous Digital Hierarchy (SDH) is the international
equivalent of SONET.
SONET is not just for Telcos—virtually all optical Layer 1 uses SONET.
Large Enterprises and even small Enterprises use SONET for long (> 15
km) distances. SONET is agnostic about the form of the data riding on it,
and continues to be used as Layer 1 for IP links.
Chapter 6
TDM Circuit-Switched Networking
Stephen Dudley
Figure 6-1: Transport path diagram (signaling: SS7, T1, ISDN; voice over
SONET/TDM)
For the TDM network, our transport path diagram has components that
don't exist in the packet-switching network, as well as components that we
would recognize in the other diagrams in the book. Shown in the diagram
are the following components:
The voice trunk signaling mechanisms most commonly used in the
network (MF, ISDN, SS7).
The voice path, which always uses a G.711 codec when the Public
Switched Telephone Network (PSTN) is employed.
Other speech codecs besides G.711 can be used, but they have to be set up
on a dedicated path without PSTN switching; one example, using an ATM
transport, is shown in Figure 6-1. When this kind of dedicated
connection is used, a wide variety of codecs is available.
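Since the PSTN voice path always uses G.711, it is worth seeing how little computation the codec needs. The sketch below (not from this book; the constants follow the common mu-law companding implementation) maps 16-bit linear PCM samples to the 8-bit G.711 mu-law codewords that fill a 64 kb/s channel (8000 samples/s x 8 bits):

```python
# Sketch of G.711 mu-law companding (the North American PSTN variant).
BIAS = 0x84   # 132, added so the segment search always finds a set bit
CLIP = 32635  # clamp so adding the bias cannot overflow 16 bits

def linear_to_ulaw(sample: int) -> int:
    sign = 0x80 if sample < 0 else 0x00
    magnitude = min(abs(sample), CLIP) + BIAS
    # Find the segment (exponent): position of the highest set bit, 7..14
    exponent = 7
    mask = 0x4000
    while exponent > 0 and not (magnitude & mask):
        exponent -= 1
        mask >>= 1
    mantissa = (magnitude >> (exponent + 3)) & 0x0F
    return ~(sign | (exponent << 4) | mantissa) & 0xFF  # G.711 inverts the bits

print(hex(linear_to_ulaw(0)))   # 0xff: silence encodes to all-ones
```

Eight such lookups per millisecond per direction is trivial work, which is one reason G.711 remains the universal PSTN codec.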
Concepts covered
Why the TDM Network is built around a 64 kb/s channel.
How a telephone call proceeds through the network.
How the digital switch uses a dedicated connection to end users
(line) and a nondedicated connection between switches (trunk).
How the digital switching network uses a switch hierarchy with a
trunking overlay of high usage to minimize the information
maintained in call routing tables.
TDM principles
The public telephone network was being converted to digital in the 1970s.
Even at that time, data rates for digital signals were fast enough that a
single transmission facility could send far more data than was needed to
support a single voice conversation. The technique of multiplexing multiple
digital streams onto a single facility by assigning each stream to a
particular block of time in round robin fashion, called Time Division
Multiplexing (TDM), became the basis for the Public Switched Telephone
Network (PSTN).
Figure 6-2: Time Division Multiplexing (a framing sequence followed by
one time slot per DS0 in each frame of the data stream)
The diagram illustrates how multiple signal streams (called DS0 here) can
be put together with a Framing sequence to create a complete bit stream.
Each data stream is assigned a time slot to transmit. The framing sequence
contains a recognizable pattern that can be easily detected in the data
stream and that has a definite start and endpoint. By knowing the starting
point of the framing sequence, it is possible to know which bits belong to
which data streams.
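The framing-plus-time-slot idea can be sketched in a few lines. The illustration below is a simplification (real T1 framing uses a single framing bit per frame, not a framing octet, and the pattern here is invented): one octet per DS0 is interleaved into each frame, and the receiver recovers each stream from its slot position relative to the framing pattern.

```python
# Simplified byte-interleaved TDM: each DS0 contributes one octet per frame.
FRAMING = b"\x9c"  # illustrative framing pattern, not a real T1 sequence

def tdm_mux(streams, frames):
    out = bytearray()
    for i in range(frames):
        out += FRAMING
        for s in streams:               # fixed time-slot order
            out.append(s[i])
    return bytes(out)

def tdm_demux(bitstream, n_streams):
    frame_len = 1 + n_streams
    streams = [bytearray() for _ in range(n_streams)]
    for off in range(0, len(bitstream), frame_len):
        assert bitstream[off:off + 1] == FRAMING   # locate the frame start
        for k in range(n_streams):
            streams[k].append(bitstream[off + 1 + k])
    return [bytes(s) for s in streams]
```

Because slot positions are fixed, no per-packet addressing is needed: knowing where the frame starts is enough to know which bits belong to which call.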
Multiplexing
A channel capable of carrying one call is called a DS0. The faster the
transmission rate, the larger the number of calls that can be combined onto
it. Combining channels together is called multiplexing, and the output of one
multiplexing stage can itself be combined with others at the next stage of
the hierarchy.
Figure: A digital switch, with dedicated lines to telephones on the line side
and trunks between switches.
Sidebar: Erlangs
The number of trunks needed to support a given calling volume was
first studied by the Danish mathematician A. K. Erlang. Because of the
statistical nature of call arrivals, it is not possible to add up the total
number of minutes or seconds that people want to talk on the phone and,
with a little arithmetic, calculate the number of trunks needed. However,
since it is based on statistics, the relationship between the number of
trunks needed to support a given calling volume 99% of the time, or
99.9% of the time, is always the same; so, traditionally, these values have
been captured in tables. As you might expect, they are often called
Erlang tables or Poisson tables because of the basic distributions
involved.
The one piece of direct arithmetic that does go into an Erlang table is the
calling volume. It is measured either in units of call seconds or in units
of (surprise!) Erlangs. One Erlang is one trunk continuously used for one
hour, which is directly equivalent to 3600 call seconds (60
seconds x 60 minutes = 3600 call seconds).
Traffic fluctuates over the course of the day, and the hour with the most
traffic is called the busy hour. Traffic also fluctuates from day to day in both
a statistical and a nonstatistical way. In North America, the busiest hour of
the whole year is likely to be sometime on Mother's Day. Almost all
telephone networks are engineered to deliver 99% or more of all traffic in
that busiest hour of the year.
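The arithmetic behind an Erlang table can be sketched directly. The book only describes the tables; the sketch below uses the standard numerically stable recurrence for the Erlang B blocking probability, so the choice of formula is ours:

```python
# Erlang B blocking probability via the recurrence
# B(0) = 1; B(n) = A*B(n-1) / (n + A*B(n-1)), with A = offered Erlangs.
def erlang_b(traffic_erlangs, trunks):
    b = 1.0
    for n in range(1, trunks + 1):
        b = traffic_erlangs * b / (n + traffic_erlangs * b)
    return b

def trunks_needed(traffic_erlangs, grade_of_service=0.01):
    # Smallest trunk group whose blocking is at or below the target
    n = 1
    while erlang_b(traffic_erlangs, n) > grade_of_service:
        n += 1
    return n

busy_hour_call_seconds = 36000           # measured calling volume
traffic = busy_hour_call_seconds / 3600  # 3600 call seconds = 1 Erlang
print(trunks_needed(traffic))            # 18 trunks carry 10 Erlangs at 1%
```

Note the statistical penalty the sidebar describes: 10 Erlangs of offered load needs 18 trunks, not 10, to keep blocking at 1%.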
Signaling
Two types of signaling systems must exist within the switch, one for lines,
and one for trunks. Line side signaling communicates with telephone sets.
There are dozens of signaling protocols that might be used, each specific to
a particular country, type of telephone switch, or telephone set. Trunk side
signaling, in contrast, operates between switches.
ISDN signaling
Robbing bits from every channel to convey signaling information had
limitations for data communications and setting up an end-to-end
connection was slow. Another mechanism to convey signaling information
is to rob a complete channel from a DS1 or E1 for signaling purposes to
allow the other channels to be delivered at full rate (that is, 64 kb/s). This
scheme is implemented in the Primary Rate Interface1 (PRI) of ISDN
(Integrated Services Digital Network). A PRI is essentially one DS1 or one
E1. For a DS1, channel 23 (the last one in the 0-23 sequence) is used for
signaling. For an E1, timeslot 16 is used (timeslot 0 carries framing). These
channels are called D (data) channels and the other 23 (or 30) are called B
(bearer) channels.
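The channel layout just described can be written down directly. The sketch below encodes the standard assignments (DS1: 23B+D with the D channel in slot 23 of 0-23; E1: 30B+D with framing in timeslot 0 and the D channel in timeslot 16); the function name is ours:

```python
# Map a PRI facility type to its bearer (B) and signaling (D) timeslots.
def pri_channels(facility: str):
    if facility == "DS1":
        d = {23}                                          # 23B+D
        bearer = [ts for ts in range(24) if ts not in d]
    elif facility == "E1":
        d = {16}                                          # 30B+D
        bearer = [ts for ts in range(1, 32) if ts not in d]  # slot 0 = framing
    else:
        raise ValueError(facility)
    return bearer, sorted(d)

b, d = pri_channels("E1")
print(len(b), d)   # 30 B channels, D channel in timeslot 16
```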
Q.931
The signaling protocol used by ISDN facilities is Q.931. The Q.931
protocol defines the messages sent over the D-Channel, both in terms of the
message format and the message sequencing. Figure 6-9 illustrates a
typical call connection and tear-down sequence between two ISDN phones
connected to a Private Automatic Branch Exchange (PABX).
1. Note that if someone has an ISDN phone, it would not have PRI signaling. Instead, it would have BRI
signaling. A Basic Rate Interface (BRI) is two 64 kbps bearer channels and one 16 kbps D channel.
There are D Channel handlers for line side peripherals as well that permit signaling information to be
relayed to the call control functions of the switch.
Figure 6-9: Q.931 call setup and tear-down through originating, tandem,
and terminating switches: Setup, Call Proceeding, Alerting, Connect, and
Connect Acknowledge on setup; Disconnect, Release, and Release
Complete on tear-down.
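The message order in Figure 6-9 can be captured as a small table of allowed transitions. The validator below is purely illustrative (Q.931 defines many more messages and states than this), but it encodes the setup and tear-down order just described:

```python
# Allowed next messages for a simplified Q.931 call, keyed by the last
# message seen. Call Proceeding is optional, per the sequence in Figure 6-9.
NEXT = {
    None: {"SETUP"},
    "SETUP": {"CALL PROCEEDING", "ALERTING"},
    "CALL PROCEEDING": {"ALERTING"},
    "ALERTING": {"CONNECT"},
    "CONNECT": {"CONNECT ACKNOWLEDGE"},
    "CONNECT ACKNOWLEDGE": {"DISCONNECT"},
    "DISCONNECT": {"RELEASE"},
    "RELEASE": {"RELEASE COMPLETE"},
}

def valid_sequence(msgs):
    state = None
    for m in msgs:
        if m not in NEXT.get(state, set()):
            return False
        state = m
    return state == "RELEASE COMPLETE"   # call must be fully torn down
```

A complete call runs the full ordered list and ends with Release Complete; anything out of order, or a call abandoned mid-sequence, is rejected.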
SS7 signaling
The last way that signaling information can be provided is through an
entirely outboard communication system. This method is the way that
CCS7 (Common Channel Signaling System number 7) is implemented.
Figure 6-10 shows the interconnection of switches by way of the CCS7
Network.
Figure 6-10: Interconnection of switches through the CCS7 network.
Signaling links connect each switch's call control to Signal Transfer Points
(STPs) in the SS7 network, while trunks carry the voice path between
switches. (Trk = trunk interface; STP = Signal Transfer Point; ISUP =
ISDN User Part.)
Signaling links
SS7 messages are exchanged between network elements over 56 or 64 kb/s
bidirectional channels called signaling links. From the perspective of
setting up a connection to a VoIP gateway, two or more of these links will
need to be set up. The company that owns the SS7 network will require that
the equipment being connected goes through a rigorous certification
process. The company depends heavily on its network and needs to be
sure that any attached equipment, and the messages it generates, behave
in an expected manner, and that unwanted messages do not flood the
network.
Figure: ISUP call flow between originating and terminating switches: a
Continuity Message, an Address Complete Message (ACM) that causes
ringback to be heard, an Answer Message (ANM) when the called party
goes off hook, and a Release Complete at tear-down.
Chapter 7
SONET/SDH
Anthony Lugo
Figure: Transport path diagram, with the SONET/TDM layer highlighted
beneath the packet transport (ATM, FR, Ethernet, DOCSIS) and real-time
application layers.
Concepts covered
The evolution of SONET, which is driven by the next generation
applications and services of the telecommunications market.
The SONET protocol functionality within the network layers.
SONET Network element types and attributes.
A variety of flexible solutions that ensure network survivability,
redundancy and exceptional reliability.
The significant role synchronization plays within the SONET world.
Introduction
Today we deliver information at the speed of light, and the fiber optic
medium has proven to be a viable solution as a delivery agent.
Synchronous Optical Network (SONET) is an optical standard that permits
multiple vendors to deliver data, voice and multimedia applications to
residential and business customers. The purpose of this chapter is to define
and illustrate the concept of Synchronous Optical Network (SONET),
allowing the reader to understand the associated SONET structure and how
it can deliver value added applications to the telecommunications industry,
benefiting the consumer with new services.
Figure 7-2 illustrates the progressive evolution of SONET, beginning with
the foundation of traditional private line services growing into the optical
private line service and entering the new market demand for the SONET
next generation data applications and services.
Figure 7-2: The evolution of SONET service flexibility and development:
traditional private line service (DS0, DS1, DS3); optical private line
service (OC-3c/STM-1, OC-12c/STM-4, OC-48c/STM-16, OC-192c/
STM-64); and SONET next-generation data services (Fast Ethernet, ATM,
ESCON, RPR, SAN, FDDI, Fibre Channel, HDTV video, GbE LAN/
WAN, 10 Gigabit Ethernet).
Overview
The SONET network has evolved over the years due to the demands of
new applications and services. The pressure on the telecommunications
industry to meet the demands of the consumer market has migrated the
once-traditional SONET legacy time division multiplexing networks into
the next generation of SONET. SONET enters a new stage of growth with
the telecommunications market striving for reduced capital and operating
expenditures. The growing demand for high-speed internet access, network
security, and new application delivery, together with the full drive of a
competitive market, demands utilization of the SONET network
infrastructure, which has been critical for successful business operation.
What once was LAN, MAN or WAN has been blended into the SONET
fiber cloud, creating flexible, secure applications, such as ATM, Storage
Area Networks (SAN) and high bandwidth applications.
As applications and services have evolved, so has the optical network
backbone infrastructure. In the early development of the optical legacy
network, the majority of configurations were simple ring or linear
applications.
Today's optical network involves more complexity, as the network element
has evolved into a hub or meshed network, creating more flexibility and
density per Network Element (NE). A single NE that used to add and drop
traffic for a single BLSR ring can now terminate multiple different types
of network configurations, such as BLSR, UPSR, linear 1+1, and 0:1
unprotected.
This advancement in the Optical Networks arena allows for the
interconnecting or meshing of networks, to a scale no one could have
foreseen within a multivendor environment. What used to be a separation
of networks in terms of LAN, MAN, and WAN, is now seamless, due to the
SONET optical network backbone.
GbE application
Ethernet service at Layer 2, using a SONET/SDH circuit
connecting multiple sites, can offer point-to-point Ethernet connectivity for
data centers, remote data backup sites, and servers without the addition of
dedicated data equipment. For the carrier, it broadens the service offering
and increases revenue potential. GbE service also benefits from the Layer 1
SONET/SDH protection schemes.
The addition of GbE support provides for more efficient bandwidth usage
of the 10 G signal, by being able to have different size payloads to carry the
GbE traffic and by being able to mix different types of services within the
same optical link.
RPR application
Resilient Packet Ring is a distributed switch (IEEE 802.1D bridge
functionality) application that is connectionless and packet-based and
allows a shared-bandwidth networking solution for Ethernet traffic over a
SONET/SDH backbone. RPR offers 10/100/1000Base-X interfaces,
providing an efficient carrier-grade method of connecting to routers and
LANs using a Layer 2 add/drop/pass-through technology.
SONET terminology
This section describes SONET terminology including SONET level rates
and SONET layers and architecture.
An STS-N has exactly N times the rate of 51.84 Mbit/s (for
example, an STS-12 is exactly 12 x 51.84 = 622.080 Mbit/s).
SONET            SDH       Rate
OC-1/STS-1       STM-0     51.84 Mb/s
OC-3/STS-3       STM-1     155.52 Mb/s
OC-12/STS-12     STM-4     622.08 Mb/s
OC-48/STS-48     STM-16    2488.32 Mb/s
OC-192/STS-192   STM-64    9953.28 Mb/s
OC-768/STS-768   STM-256   39813.12 Mb/s
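The rate column follows directly from the N x 51.84 Mbit/s rule quoted above:

```python
# Every SONET rate is an exact multiple of the STS-1 rate.
STS1_RATE_MBPS = 51.84

def sts_rate(n: int) -> float:
    return round(n * STS1_RATE_MBPS, 2)

for n in (1, 3, 12, 48, 192, 768):
    print(f"STS-{n}: {sts_rate(n)} Mb/s")
# STS-12 -> 622.08 Mb/s, STS-192 -> 9953.28 Mb/s, and so on
```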
The frame formats are illustrated in Figure 7-4, which depicts the SONET
frame format with STS payloads, and Figure 7-5, which depicts the
SONET frame format with VT payloads.
Photonic layer
The optically transmitted SONET signal is referred to as an OC-N. This
layer is primarily responsible for the electrical-to-optical conversion. The
OC-N is essentially the optical equivalent of the STS-N; however, the
STS-N terminology is used when referring to the SONET format.
Section layer
The section layer transports the STS-N frames and the section overhead
across the photonic layer. This layer performs functions such as
performance monitoring, local orderwire, and Section Data
Communication Channels (SDCC). The SDCC provides a
communications path between a centralized Operations System (OS) and
the various network elements.
Line layer
The line layer is responsible for the transportation of the SPE (customer
Payload) and the line overhead. Some attributes of this layer consist of Line
Data Communication Channels (LDCC), express orderwire, performance
monitoring, protection switching signaling and line alarms.
Path layer
The path layer transports the customer traffic at the DS-1, DS-3, DS-1VT,
DS-3VT, and Video level for the Path Terminating equipment. The Path
layer transports the customer payload and the path overhead to the
terminating SONET/SDH equipment.
Customer payload can be mapped into the path layer as an STS-level
payload or an STS/VT-level payload, as illustrated in Figure 7-4 and
Figure 7-5.
In addition to the STS-1 Path level base format, SONET also defines
synchronous formats at sub-STS-1 levels that are defined as Virtual
tributaries. VT's are synchronous signals used to transport low-speed
signals.
Virtual Tributaries (VTs) are designed to carry (asynchronous)
payloads, such as the DS1, that require considerably less than 50 Mb/s of
bandwidth. The DS1 is such an important payload that the entire SONET
format can be traced back to the need for DS1 transport. There are seven
VT groups within an STS-1 SPE, and different groups may contain
different VT sizes within the same STS-1 SPE. The structured STS-1
signal has VT payloads and VT path overhead that together constitute the
VT SPE, similar to the STS SPE.
Summary
All SONET network elements have section and photonic layer
functionality. However, not all have the higher layers. A network element is
classified by the highest layer supported on the interface. Thus, a network
element with path layer functionality is referred to as Path Terminating
Equipment, either VT PTE or STS PTE.
Terminal
A SONET Line Terminating Equipment takes in a number of electrical
signals (tributaries) and transmits a single electrical or optical signal.
Network configurations
A network of SONET network elements can be arranged in several
configurations. A customer's need for particular types of payloads may
dictate the choice of network configuration.
The SONET NE can be part of the three traditional configurations, linear,
Bidirectional Line Switched Ring (BLSR), and/or Unidirectional Path
Switched Ring (UPSR) configurations.
As technology evolves, so does the network topology. The SONET next-
generation NEs can utilize not only the three types of traditional
topologies, but all topologies in a single box, allowing the capability of the
Optical Hub and Meshed configurations.
Linear configuration
A linear configuration can comprise either a 1+1 type or a 1:N type.
Protection switching for a single failure at the STS/OC-N level completes
within 50 ms.
1+1 Protected
The most basic protection system is a linear 1+1 system (see Figure 7-7).
The term linear differentiates it from ring systems, and the 1+1 indicates
that there is one working fiber and one standby fiber; the traffic in both
directions is permanently bridged onto both the working and the standby
fiber.
Figure 7-7: Linear 1+1 protection, with one working fiber and one
protection fiber between two terminals.
1: N Protected
In a 1:N configuration, there is one protection facility for several working
facilities (from one to fourteen). Figure 7-8 illustrates the 1:N protection
architecture. If one of the working lines detects a signal failure or line
degradation, the working traffic is switched to the protection line.
Figure 7-8: 1:N protection architecture, with working facilities 1 through
N (N <= 14) sharing a single protection facility between two terminals.
Ring configurations
The telecommunications market demands the utmost quality from a
SONET infrastructure, and network performance and network survivability
are important. Survivable rings and route diversity are two characteristics
that have almost become necessities for a SONET infrastructure, and
Unidirectional Path Switched Rings (UPSR) and Bidirectional Line
Switched Rings (BLSR) are solutions that meet these demands.
The two primary types of ring configurations are as follows:
UPSR
Dedicated protection bandwidth
Bellcore GR-1400-CORE
BLSR
Shared protection bandwidth
Bellcore GR-1230-CORE
A comparison of UPSR and BLSR is shown in Figure 7-9.
Figure 7-9: UPSR and BLSR comparison across four network elements
(NE 1 through NE 4).
BLSR:
Bi-directional flow enables timeslots to be reused around the ring.
Total BLSR network capacity is always greater than or equal to the
capacity of a UPSR.
Total 2F-BLSR network capacity = (OC-N rate / 2) x number of nodes;
4F-BLSR capacity = (OC-N rate) x number of nodes.
UPSR:
Uni-directional flow requires dedicated timeslots around the entire ring.
Total UPSR network capacity cannot exceed the optical line rate, so total
UPSR bandwidth = the OC-N rate of the main optical line.
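The capacity rules in the comparison above reduce to a few lines of arithmetic (the function names are ours):

```python
# Ring capacity from the line rate. UPSR dedicates timeslots around the
# whole ring, so its capacity equals the line rate; BLSR reuses timeslots
# span by span, so capacity scales with the node count.
def upsr_capacity(oc_n_rate, n_nodes):
    return oc_n_rate                      # cannot exceed the line rate

def blsr_capacity(oc_n_rate, n_nodes, four_fiber=False):
    per_node = oc_n_rate if four_fiber else oc_n_rate / 2
    return per_node * n_nodes

oc48 = 2488.32  # Mb/s
print(upsr_capacity(oc48, 4))            # 2488.32
print(blsr_capacity(oc48, 4))            # 4976.64  (2F-BLSR)
print(blsr_capacity(oc48, 4, True))      # 9953.28  (4F-BLSR)
```

Even on a small four-node OC-48 ring, a 2F-BLSR carries twice the total traffic of a UPSR, which is why BLSR is favored where traffic is distributed among the nodes rather than homed to one hub.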
Figure: Hub pattern and uniform mesh pattern.
A meshed pattern does not merely connect one network element to the
next; it offers an interconnection to each and every node within the meshed
topology, providing network survivability, redundancy, and exceptional
reliability.
The gain in flexibility of today's optical network infrastructures provides
tremendous opportunities in terms of services and applications. Consider
the network shown in Figure 7-11, which can provide essentially every
type of SONET topology within a customer's network infrastructure, thus
enabling an endless array of applications and services to meet every
customer's needs.
Figure 7-11: A single network providing multiple SONET topologies
(linear 1+1 and UPSR segments).
Synchronization
Understanding synchronization
SONET-based equipment derives many of its basic attributes from
synchronous operation. Synchronization is required in networks that
contain:
Add/Drop Multiplexers
Terminals
Synchronous tributaries
These configurations require synchronization among the network elements,
to avoid the effects of the SONET synchronous transport signal pointer
repositioning within the frame. When a network element is synchronized,
all synchronous tributaries and high-speed signals generated by that
network element are synchronized to its timing source. Normally, one
network element in a UPSR is externally timed. To protect the network
timing against complete nodal failure, two network elements in a UPSR
can be externally timed.
Internal timing
A SONET-compliant free-running clock produced within the network
element provides internal timing. Network elements with certain circuit
packs can provide timing signals of Stratum 3 (ST3) quality.
External timing
An external timing signal is obtained from a building-integrated timing
supply (BITS) clock of ST3 or better. ST1 reference quality would be the
preferred level of timing for SONET network elements.
Line timing
Line timing is derived from an incoming SONET frame (OC-3, OC-12,
OC-48, and OC-192), DS1 facility or EC-1 facility.
Stratum clocks
Stratum clocks are stable timing reference signals that are graded
according to their accuracy. American National Standards Institute (ANSI)
standards have been developed to define four levels of stratum clocks.
The accuracy requirements of these stratum levels are shown in
Figure 7-13.
Timing loops
A timing loop is created when a clock synchronizes to itself, either
directly or through intermediate equipment. A timing loop causes excessive
jitter and can result in traffic loss. Timing loops can be caused by a
hierarchy violation, or by having clocks of the same stratum level
synchronize each other. In a digital network, timing loops can be caused
during the failure of a primary reference source, if the secondary reference
source is configured to receive timing from a derived transport signal
within the network.
A timing loop can also be caused by incorrectly provisioned
Synchronization Status Messages (SSMs) for some of the facilities in a linear
or ring system. Under normal conditions, if there is a problem in the system
(for example, pulled fiber), the SSM functionality will heal the timing in
the system. However, if the SSM is incorrectly provisioned, the system
might not be able to heal itself and might segment part of itself in a timing
loop.
Synchronization-status messages
Synchronization-Status Messages (SSM) indicate the quality of the timing
signals currently available to a network element. The timing sources that
can be provisioned in a network element include external timing from a
BITS clock, timing derived from SONET interfaces, and the internal clock
of the network element. A network element can select the better of the two
timing signals provided by the primary and secondary timing sources
provisioned by the user. The selection is based on the quality values carried
in the SSMs.
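The better-of-two selection can be sketched as follows. The quality ranking here is an illustrative ordering of common ANSI stratum/SSM labels, not the actual S1-byte code points, and the fallback behavior is an assumption:

```python
# Sketch of SSM-based reference selection (lower rank = better quality).
QUALITY_RANK = {"PRS": 1, "ST2": 2, "TNC": 3, "ST3E": 4, "ST3": 5,
                "SMC": 6, "ST4": 7, "DUS": 99}  # DUS = don't use for sync

def select_reference(primary_ssm, secondary_ssm):
    candidates = [("primary", primary_ssm), ("secondary", secondary_ssm)]
    usable = [(name, ssm) for name, ssm in candidates
              if QUALITY_RANK.get(ssm, 99) < QUALITY_RANK["DUS"]]
    if not usable:
        return "internal"        # fall back to the free-running internal clock
    # Prefer the better quality; ties go to the primary (listed first)
    return min(usable, key=lambda c: QUALITY_RANK[c[1]])[0]

print(select_reference("ST3", "PRS"))    # secondary
print(select_reference("DUS", "DUS"))    # internal
```

The DUS message is what prevents the timing loops described earlier: a node tells its upstream neighbor "don't use this signal for synchronization" so the neighbor never times itself from a clock derived from its own output.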
Figure 7-14 provides an example of a network showing the synchronization
flow, head-end network element, synchronization boundary, and
synchronization status messaging.
Section III:
Protocols for Real-Time Applications
Protocols are basic building blocks of packet networks. Section 3
introduces protocols essential to real-time operation: protocols for media
transport and transport control, protocols for call and session setup, and
protocols and mechanisms that help real-time and non–real-time services
successfully share network resources. The section begins with a discussion
of the Real-Time Transport Protocol (RTP) and its associated protocols,
the Real-Time Control Protocol (RTCP) and the Real-Time Streaming
Protocol (RTSP). The characteristics and use of these protocols are
described, and similarities and differences with TCP and HTTP are
highlighted. Chapter 9 talks primarily about SIP and H.323, the signaling
protocols that establish sessions, or call setup messaging, required in
real-time communications. The different setup protocols provide the same
function but accomplish it in different ways. SIP works between end-points,
while H.323 places the call setup intelligence in a centrally located device.
The final chapter in this section covers the strategies and mechanisms
available in IP networks to provide Quality of Service (QoS). The chapter
explains what the different components, such as shapers and schedulers,
contribute to help differentiate flows and prioritize forwarding. The various
components are intended to ensure that real-time traffic is transported
across the network within performance limits, thereby maintaining the
expected QoE (Quality of Experience). Later sections will address the
practical implementation of these techniques.
Chapter 8
Real-Time Protocols: RTP, RTCP,
RTSP
Hung-Ming (Fred) Chen
Figure 8-1: Transport path diagram
Concepts covered
The purpose and operation of RTP & RTCP
RTP relays and how they operate: RTP mixer, RTP translator
Comparison of RTP and TCP
The purpose and operation of RTSP
Real-Time aspects of streaming applications
How RTP/RTCP and RTSP combine to form a complete package for
management of media streaming
Comparison of RTSP and HTTP
Introduction
Today, real-time and near–real-time applications are becoming very
common on IP networks and are being used for many different purposes
spanning both conversational (at least two party) personal communication
and streaming (typically one-way) applications. VoIP and IP Telephony are
becoming popular. Radio stations and TV channels now offer streaming of
live and archived programming over the Internet. Corporations use
streamed messages to promote new products and to provide education and
product documentation to customers. There is great promise for new
multimedia real-time services from converged networks. A set of protocols
has been developed to address the requirements of transporting the content
of these real-time/near-real-time services and to offer basic control for
streaming services. These are the Real-Time Transport Protocol (RTP), the
Real-Time Control Protocol (RTCP), and the Real-Time Streaming
Protocol (RTSP).
Successful Real-Time streaming applications must coordinate several
protocols: HTTP, RTP, RTCP, and RTSP. HTTP is used to retrieve the
presentation description. RTSP uses the description to set up and tear down
the sessions. RTP transports the contents to the end device, and RTCP is
used to report transmission statistics back to the server so that RTP can
adapt to network conditions.
In this chapter, we discuss features and operations of these protocols, and
the relationships between them. In addition, we compare RTP with TCP,
and RTSP with Hypertext Transfer Protocol (HTTP) to identify operational
similarities and differences.
Other applications can use RTP as well, for such functions as storage of
continuous data, interactive distributed simulation, active badge
tracking systems, and control and measurement.
RTP was defined by the IETF in RFC 1889 and revised in RFC 3550. The
International Telecommunication Union (ITU) has adopted RTP as one of
its standards for multimedia data streaming (H.225.0). Many standard
protocols use RTP to transport real-time content, including H.323 and SIP
for IP telephony applications and point-to-point and video conferencing,
RTSP for streaming applications, and Standard Announcement Protocol/
Session Description protocol (SAP/SDP) for pure multicast applications.
Its data format provides important information for operation and control of
real-time applications.
Features
RTP was designed with several goals in mind. First, it is intended to be a
lightweight and efficient protocol to define Application Level Framing
(ALF) and integrated layer processing. Second, one flexible mechanism
was provided rather than several dedicated algorithms. Third, RTP is
protocol-neutral, allowing it to integrate with various lower-layer
protocols, such as UDP/IP, IPX, and ATM-AALx. Fourth, the elasticity of
the Contributing Source Identifier (CSRC) field provides a scalability
mechanism for communicating with a large number of sources. Meanwhile,
partitioning of the control and transport functions into separate protocols
simplifies the operation. Finally, RTP also provides secure transport
through support of encryption and authentication protocols.
The RTP packet header provides information on packet sequence
numbering, type of payload, timestamping, and delivery monitoring. The
sequence number can be used to identify missing packets and to reorder out
of sequence packets. The payload type is indicated using a profile that
identifies all the types of data that may be used by the application during
the session. The timestamp corresponds to the time of packetization of the
first data sample in the packet. Timestamping permits inter- and intra-
media synchronization, such as time-alignment of audio and video signals
in a film (lip synch), since the audio and video components may be
transported as separate RTP streams. RTCP handles the delivery
monitoring function, sending reports to inform the RTP layer about the
status of network, such as reporting lost packets, interarrival jitter, and
other statistics. The feedback allows coding and transmission settings to be
adjusted to optimize the quality of the application or service.
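The header fields just described occupy a fixed 12-byte layout defined in RFC 3550. A sketch of building that header with Python's struct module (the field values passed in are illustrative):

```python
import struct

# RTP fixed header: V(2) P(1) X(1) CC(4) | M(1) PT(7) | sequence number
# (16) | timestamp (32) | SSRC (32) -- twelve bytes in network byte order.
def rtp_header(payload_type, seq, timestamp, ssrc, marker=0, cc=0):
    byte0 = (2 << 6) | cc                 # version 2, no padding/extension
    byte1 = (marker << 7) | (payload_type & 0x7F)
    return struct.pack("!BBHII", byte0, byte1, seq, timestamp, ssrc)

hdr = rtp_header(payload_type=0, seq=1, timestamp=160, ssrc=42)  # PT 0 = PCMU
print(len(hdr))        # 12 bytes of fixed header
```

A receiver reverses the pack to recover the sequence number (loss/reordering detection), timestamp (playout timing), and SSRC (stream identity) on every packet.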
Timestamps of packets
The timestamp captures the sampling time of the first octet (sample) of the
packet. The timestamp increases monotonically according to the clock of
the source and the size of payload. The receivers use the information to
check for gaps in the data and for out-of-order packets. Timestamps can be
used to synchronize flows from different sources, such as different end
systems or different sessions (audio and video) within a multimedia
application. However, RTP itself does not provide the mechanisms to do
this; the applications must contain the appropriate functionality to use this
information (and global time information within RTCP messages) to
synchronize streams.
For audio, the packet size (the number of bits in the packet, based on the
interval of packetization and the sampling rate) determines the increment
of the timestamps. For instance, an audio stream using a sampling rate of
32 kHz and a 20 ms packet will have a timestamp increment of 640 for two
consecutive packets. If silence suppression is used and no packet is sent,
the timestamp increments nevertheless, so the timestamp on the next
speech packet that is sent will include any silent interval.
For video, timestamps vary for different conditions. In general, timestamps
increase with the nominal frame rate. For instance, timestamps increase by
3,000 for each frame with a 30-frame/sec video, while timestamps step by
3,600 for a 25-frame/sec video. When a video frame is segmented across
several RTP packets, all the packets are marked with the same timestamp.
Where an atypical system is used, such as a special codec, or the
application cannot determine the frame number, the timestamps might need
to be computed from the system clock.
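The increments quoted above follow directly from the sampling rate (audio) or from the frame rate against the 90 kHz RTP video clock implied by the 3,000 and 3,600 figures:

```python
# Timestamp increment per packet (audio) or per frame (video).
def audio_increment(sample_rate_hz, packet_ms):
    return int(sample_rate_hz * packet_ms / 1000)   # samples per packet

def video_increment(clock_hz, frame_rate):
    return int(clock_hz / frame_rate)               # ticks per frame

print(audio_increment(32000, 20))   # 640, the example from the text
print(video_increment(90000, 30))   # 3000 per frame at 30 frame/s
print(video_increment(90000, 25))   # 3600 per frame at 25 frame/s
```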
Timestamps and sequence numbers allow the application to play out the
audio and video packets with the correct timing, even where silence
suppression is used (timestamp), and to detect and compensate for missing
packets (sequence number). When RTP packets have the same timestamp
(that is, a video frame segments into several RTP packets), the sequence
numbers are used to determine the appropriate order for decoding and play-
out.
RTP relay
RTP Relay agents are frequently used to translate payload formats for
flows where the two end systems cannot exchange packets directly. There
are two classes of RTP Relay agents: RTP translator and RTP mixer.
RTP mixer
Several RTP flows can be merged into a single flow with the help of RTP
mixers. For example, where the original sessions require more bandwidth
than the network can provide, two audio streams can be combined into a
single, more efficient flow. Synchronization is recalculated from the
contributing flows according to the content, and the RTP mixer assigns a
new source identifier (SSRC) to the combined stream. An RTP mixer can
therefore greatly reduce bandwidth consumption, which is especially
helpful on low-speed dial-up access networks. Figure 8-2 shows the mixer
combining two individual sessions (SSRC = 7 and SSRC = 36) into one
combined session with SSRC = 42.
RTP translator
An RTP translator performs a similar function, but maintains the individual
RTP flows instead of combining the sessions into a single one. The source
identifiers are preserved so the flows can be separated again downstream.
An RTP translator can be used to convert media encodings, duplicate
multicast streams into unicast streams, and filter RTP flows at the
application level, for example, to provide firewall services. Figure 8-3 shows two translators placed
on each end of a tunnel for secured media distribution. In this diagram, the
two end systems use separate sessions (SSRC=7 and SSRC=36) end-to-
end, including the encrypted portion of the channel.
RTP Limitations
RTP is often criticized for its heavy overhead. For each media session, the
combination of the IP, UDP, and RTP control information adds up to forty
bytes (twenty bytes IP, eight bytes UDP, and twelve bytes RTP).
RTCP reports carry statistics such as the number of packets sent, the
number of packets dropped in the network, and packet jitter. The recipient
of these quality reports uses the information to
adjust the system or to diagnose problems. For instance, a sender can use
the statistics to modify its transmission settings. Receivers can determine
whether problems are local or remote. Network and service providers can
use the RTCP information to monitor network performance, and in
particular, performance with multicast connections.
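The fixed forty-byte header cost mentioned above can be put in perspective with a quick calculation; the payload sizes below (160 bytes for G.711 and 20 bytes for G.729 at 20 ms packetization) follow from the codec bit rates and are not taken from the text:

```python
IP_HDR, UDP_HDR, RTP_HDR = 20, 8, 12  # bytes

def header_overhead_ratio(payload_bytes):
    """Fraction of each packet consumed by IP/UDP/RTP headers."""
    headers = IP_HDR + UDP_HDR + RTP_HDR  # 40 bytes in total
    return headers / (headers + payload_bytes)

# G.711 (64 kbit/s) at 20 ms -> 160 payload bytes: headers are 20%.
# G.729 (8 kbit/s) at 20 ms  ->  20 payload bytes: headers are ~67%.
print(round(header_overhead_ratio(160), 2))  # 0.2
print(round(header_overhead_ratio(20), 2))   # 0.67
```

The ratio is worst for low-rate codecs, which is why header compression schemes matter most on narrowband links.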
Ensuring scalability
For a conferencing or multicast session, there is an inevitable tradeoff
between admitting more and more participants and keeping control traffic
from overwhelming the available network bandwidth. This suggests that
RTCP control messages be limited to a small fraction, say five percent, of
the overall session traffic. During video conferencing, each endpoint sends
control packets to the other endpoints; thus, every endpoint can keep track
of the number of participants. From the number of participants and the
proportion of traffic allowed for RTCP control messages, each member can
calculate the frequency with which to send RTCP packets. In addition, it is
suggested that at least 25% of the RTCP bandwidth be reserved for sender
reports, to permit new receivers timely recognition of the senders'
canonical names.
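The interval calculation can be sketched as follows. This is a deliberate simplification of the RFC 3550 rules (it ignores the sender/receiver split and the five-second minimum interval), and the average RTCP packet size is an illustrative assumption:

```python
def rtcp_interval_seconds(session_bw_kbps, members,
                          rtcp_fraction=0.05,
                          avg_rtcp_size_bytes=120):
    """Roughly how often one member may send an RTCP packet.

    The RTCP bandwidth is a fixed fraction (typically 5%) of the
    session bandwidth, shared by all members.
    """
    rtcp_bytes_per_sec = session_bw_kbps * 1000 / 8 * rtcp_fraction
    per_member = rtcp_bytes_per_sec / members
    return avg_rtcp_size_bytes / per_member

# A 1 Mbit/s session with 50 members: under this simplified model,
# each member reports just under once per second.
interval = rtcp_interval_seconds(1000, 50)
```

As membership grows, the per-member share shrinks and reports space out automatically, which is what keeps RTCP scalable.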
Table 8-1 compares the services offered by TCP and RTP: session protocol,
reliable connection, flow/congestion control, error recovery, multicast,
and timing synchronization. TCP provides a reliable connection,
flow/congestion control, and error recovery, but supports neither multicast
nor timing synchronization; RTP supports multicast and timing
synchronization, but leaves reliability, flow control, and error recovery
to other mechanisms.
Table 8-1: Service comparison of TCP and RTP
Functions
RTSP is more a framework than a protocol. Its control mechanisms include
session establishment, session termination, and authentication. RTSP is
designed to carry “VCR-style” commands and coordinates with RTP to
control and deliver media data. RTSP can therefore take advantage of RTP
features, such as the selection of different delivery channels, including
UDP, TCP and IP multicast, and it can control multiple delivery sessions
simultaneously.
RTSP methods
An RTSP request takes the same form as an HTTP request. However, while
HTTP requests are essentially limited to retrieving or submitting content,
RTSP can request any of several actions, referred to as methods, which are
specified in the header. The four main commands for real-time services are:
SETUP: Requests that the server establish a session with the
requesting client. The SETUP request for a URI specifies the
transport mechanism to be used for the streamed media. The
available transport parameters of the client are given in the transport
header. Upon receiving the options, the server responds with the
selected transport parameters.
PLAY: Requests that the server begin transmitting streaming data
over the specified transport channel.
PAUSE: Requests that the server suspend transmission, but keep
the session open and wait for another command.
TEARDOWN: Requests that the server stop sending data and
terminate the session. All resources used by the media stream are
released.
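As a sketch of how these methods appear on the wire, the helper below serializes minimal RTSP/1.0 requests (the path, ports, and session identifier are made-up examples):

```python
def rtsp_request(method, uri, cseq, headers=None):
    """Serialize a minimal RTSP/1.0 request line plus headers."""
    lines = [f"{method} {uri} RTSP/1.0", f"CSeq: {cseq}"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"

# SETUP carries the client's transport parameters in the Transport header.
setup = rtsp_request(
    "SETUP", "rtsp://stream.server.net/foo/audio", 1,
    {"Transport": "RTP/AVP;unicast;client_port=5200-5201"})

# PLAY refers to the session identifier returned by the server.
play = rtsp_request(
    "PLAY", "rtsp://stream.server.net/foo", 2, {"Session": "12345678"})
```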
Operation
RTSP, the so-called “Internet VCR remote control protocol,” provides a
means of emulating VCR commands. It does not handle the transport of
streaming data between server and client, but relies on a transport protocol
to deliver it. RTSP works with other transport protocols, but is generally
combined with RTP and RTCP. Coordination of RTSP with RTP/RTCP
provides complete functionality for transport and control of streaming
media. Figure 8-4 shows a typical example in which RTSP and RTP are
used to stream stored content from a media server. The relationships
among HTTP, RTSP and RTP are shown as well.
[Figure 8-4: Streaming media retrieval with RTSP. The web browser fetches
a meta file (rtsp://stream.server.net/foo/presentation.abc) from the web
server over HTTP and hands it to the media player; the media player
exchanges RTSP streaming commands with the content server, which delivers
the audio/video content over RTP.]
To play a media presentation, the client (web browser) sends a request
URL with the required configuration parameters to the web server. The web
server responds with a presentation description containing information
about the media server and other required parameters. Meanwhile, the web
browser starts the media player on the client machine. Upon receipt of the
presentation description, the player sends a SETUP request to the media
server with its available transport parameters. The media server chooses
the transport settings and responds to the request. The player then sends
the PLAY command to request the start of media stream delivery. During
playout, reports about the reception of the streaming data may be sent
back periodically to the media server. The PAUSE or TEARDOWN
commands can be sent to the media server during playout or at the end of
the presentation, to temporarily interrupt or to terminate the
presentation, respectively. A walkthrough of the operation for a multimedia
presentation is illustrated in Figure 8-5. Note that the Web Browser and the
Media Player are both on the client endpoint.
[Figure 8-5: RTSP operation walkthrough. The web browser issues an HTTP
GET and receives the presentation description, which it passes to the
media player through internal communication. The media player then sends
SETUP and PLAY to the media server; RTP audio/video and RTCP flow during
playout; PAUSE and TEARDOWN suspend and terminate the session.]
Performance
As discussed in Chapter 2, streaming media to a client is not considered a
real-time process. Because the media path is one-way, the QoE of the
streaming session is not limited by response time, and the delay can be as
high as several seconds without affecting user performance. The addition
of the remote commands of RTSP changes all this. The use of interactive
commands demands a certain level of responsiveness from the system, and
so the session control becomes a real-time application.
Since RTSP control requests can be transported over UDP, reliable delivery
of the requests is not guaranteed. Where a session has been paused and a
subsequent PLAY command is lost, the media server may appear to be
frozen. For UDP operation, it is often left to the user to notice the loss
of a control request and repeat the command. This can affect the QoE of
the session.
RTSP is not robust to hardware or software failures. For example, consider
an unplanned reboot of a home PC on which a user is running a streaming
session. The session state is lost on the PC side, and the media player
cannot request that the server terminate the stream. The media server
continues to send streaming data to the client, occupying bandwidth in
the downstream direction. Even after the PC reboots, no mechanism is
specified that allows the media player to recapture or terminate the
session; RTSP does not specify how to recover a lost session state from
the session identifier.
[Table: properties compared — text base, MIME headers, status code,
security, URL format, content negotiation, state maintenance.]
References
RFC 1889, “RTP: A transport protocol for Real-Time applications,” IETF,
1996 (replaced by RFC 3550).
RFC 3550, “RTP: A transport protocol for Real-Time applications,” IETF,
2003.
RFC 1890, “RTP profile for audio and video conferences with minimal
control,” IETF, 1996 (replaced by RFC 3551).
RFC 3551, “RTP profile for audio and video conferences with minimal
control,” IETF, 2003.
RFC 2326, “Real-Time streaming protocol (RTSP),” IETF, 1998.
RFC 3711, “The Secure Real-time Transport Protocol (SRTP),” IETF, 2004.
RFC 2250, “RTP payload format for MPEG1/MPEG2 video,” IETF, 1998.
RFC 2586, “The Audio/L16 MIME content type,” IETF, 1999.
Chapter 9
Call Setup Protocols: SIP, H.323,
H.248
François Audet
[Figure: The view from a real-time application perspective — media-related
protocols (audio, video, and voice codecs, RTP, RTCP, RTSP), session
control (H.323, SIP), and gateway control (H.248/MGCP, NCS) shown above
the underlying transport and network technologies (QoS, MPLS, packet
resiliency, AAL1/2, AAL5, ATM, FR, Ethernet, Cable/DOCSIS, SONET/TDM).]
Concepts covered
H.323
SIP
H.248/MEGACO, MGCP, NCS/J.162
Introduction
Communications over the global IP network use three different types of
real-time and control protocols: peer-to-peer session control protocols,
master/slave gateway control protocols, and real-time media transport
protocols.
Session control protocols establish, modify, and tear down multimedia
calls between endpoints. Both H.323 (an ITU-T protocol) and SIP (an IETF
protocol) were created to address those requirements. There are
consequently a lot of similarities between the protocols. Furthermore,
H.323 and SIP rely on other protocols
defined by the IETF and the ITU-T. For example, both protocols use
transport protocols such as RTP/RTCP, UDP, TCP and IP (IETF protocols),
and both protocols use voice codecs such as G.711 and G.729 (ITU-T
protocols). There are of course differences between the two, as they
evolved from two very different backgrounds.
A third type of protocol consists of gateway control protocols. These are
master/slave protocols that a gateway controller uses to control slave
gateway devices. The main gateway control protocols used today are
MEGACO/H.248, MGCP and NCS/J.162.
The main difference between peer-to-peer and master/slave protocols is in
the way intelligence is distributed between the network edge devices and
network-based servers.
The master/slave approach, exemplified by Megaco/H.248, MGCP, NCS/
J.162, allows network gateway functions to be distributed or decomposed
into intelligent (master) and nonintelligent (slave) parts. Application
intelligence, such as call control, is contained in the functional control
servers (master), which also implement the peer-level protocols to interact
with other functional elements in the system and manage all feature
interactions. These control servers then drive a large number of dumb slave
devices that are optimized for their specific interface function and devoid
of application complexity; hence, they are lower in cost and not subject to
change as new services and features are introduced at the control servers.
A communication network can be comprised of both peer-to-peer and
master/slave elements, along with the real-time transport protocol.
Figure 9-2 illustrates the three types of protocols.
[Figure 9-2: The three protocol types — peer-to-peer protocols (SIP,
H.323) run between peer entities; master/slave protocols (H.248/Megaco,
MGCP, etc.) run between a controlling entity and a slave entity; media
flows over RTP/RTCP.]
H.323
Architecture
H.323 has a well-defined architecture, with well-defined components,
functions and protocols.
H.323 defines the following physical components:
Gatekeeper The gatekeeper provides routing and call control
services to H.323 endpoints. It cannot generate or receive calls.
Endpoint An endpoint can make or receive calls. Endpoints are of
one of the following three subtypes:
Terminal A terminal can be an IP phone, a PC, a PDA, a set-top
box providing video-conferencing, a voice-mail system or any other
device offering H.323 services to the end user.
Gateway Gateways provide an interface to non-H.323 networks,
such as the GSTN, or a SIP network.
Multipoint Control Unit (MCU) MCUs support multipoint
conferences and must contain an MC. They can also contain an MP.
Physical components can be collocated. For example, it is very common to
have devices that are a Gatekeeper, a Multipoint Control Unit, and a
Gateway simultaneously. All Endpoints behave the same way at the
protocol level. It is not terribly important whether a particular device is
classified as a terminal, a gateway or an MCU (or a combination of these);
however, it is important that they are “Endpoints” and not Gatekeepers.
Sometimes the line can be a little blurry. For example, an IP PBX or TDM switch could
[Figure 9-3: H.323 components. The Gatekeeper performs address translation
(IP, telephone) and admission control, and cannot generate or terminate
calls. Endpoints can make or receive calls; they include Terminals (IP
phones, PCs, PDAs, set-top boxes), Gateways (interworking with other
multimedia terminals and the GSTN), and Multipoint Control Units (support
for multipoint conferences).]
In addition, H.323 defines the following logical components:
Multipoint Controller (MC): MCs control multipoint conference
connections
Multipoint Processor (MP): MPs process and mix multiple audio/
video channels
MCs and MPs are logical components and not stand-alone entities. They
must reside within a physical component, such as a Terminal, a Gateway, a
Multipoint Control Unit or a Gatekeeper.
H.323 introduces the concept of a “Zone.” A Zone consists of one (and
only one) Gatekeeper, along with all the endpoints that are registered to
that Gatekeeper (see Figure 9-4). A Zone is independent of geography; it
can span multiple LAN segments and can include Endpoints anywhere on
the Internet.
[Figure 9-4: A Zone consists of exactly one Gatekeeper and all the
endpoints registered to it.]
Scope of H.323
[Figure: The scope of H.323 — video codecs (H.261, H.263) with video I/O
equipment, audio codecs (G.711, G.723.1, G.729, etc.) with audio I/O
equipment, and user data applications (T.120, etc.), together with H.225.0
call signaling and H.225.0 RAS control, all above the transport layer
(RTP/RTCP over UDP, TCP, or H.323 Annex E over UDP, on IP or other
networks such as IPX and ATM), the network interface, and the physical
layer.]
H.225.0 (RAS)
H.225.0 (RAS) is the Registration, Admission and Status protocol.
RAS is used only in environments containing a Gatekeeper, but since most
environments these days have a Gatekeeper, it is almost always present.
RAS is the protocol used between an Endpoint and its Gatekeeper, and
between Gatekeepers, to perform the tasks that are not, strictly speaking,
part of call establishment.
RAS includes messages that are independent of individual calls.
Gatekeeper Discovery messages are normally sent to a well-known
multicast address, allowing endpoints to discover their Gatekeeper
automatically. This is useful in LAN environments for plug-and-play
operation. Registration allows an Endpoint to make its presence known to
its Gatekeeper, allowing others to route calls to that particular endpoint.
Other messages are call related. An Admission Request (ARQ) is used prior
to a call, to get permission from the Gatekeeper to make the call; the
Gatekeeper confirms the ARQ by telling the endpoint where to place the
call.
There are other, less-used messages, such as bandwidth request messages
for environments where the gatekeeper manages bandwidth utilization in
the network, as well as messages for status requests.
H.225.0 (Q.931)
H.225.0 (Q.931) is used for Call Control. It allows for the establishment of
a call between endpoints.
Once an endpoint has obtained permission to make a call from its
gatekeeper, as per the RAS Admission procedures, it will attempt to
establish a call using H.225.0 (Q.931) procedures.
H.225.0 (Q.931) is transported over TCP. The TCP port to be used is
exchanged as part of the RAS procedures. Typically, the well-known TCP
Port 1720 is used, although it is possible to use a different port.
[Figure: H.225.0 (Q.931) call establishment — the calling endpoint sends
SETUP; the called endpoint responds with CALL PROCEEDING, ALERTING and
CONNECT; an H.245 session follows; RELEASE COMPLETE ends the call.]
conflicting with the UUIE information, in which case arcane rules are
defined to make sense out of it.
[Figure: Gatekeeper signaling models — in the direct-routed model, the
endpoints exchange H.225.0 (Q.931) directly with each other and use
H.225.0 (RAS) with their Gatekeepers; in the gatekeeper-routed model, the
H.225.0 (Q.931) signaling also passes through the Gatekeepers.]
H.245
After the H.225.0 (Q.931) call has been set up, it is possible to establish the
H.245 control channel in order to establish media sessions and control
sessions.
An H.245 IP address and port number are exchanged as part of the H.225.0
(Q.931) protocol for carrying H.245 over TCP. Unlike H.225.0 (RAS) and
H.225.0 (Q.931), a dynamic port is used; there is no default H.245 port.
[Figure: H.245 procedures — the endpoints exchange TerminalCapabilitySet
messages (e.g. G.711, G.729) with acknowledgements and perform
MasterSlaveDetermination, after which one endpoint is Master and the
other Slave.]
Using OpenLogicalChannel, each endpoint tells the other endpoint what IP
address and port will be used for sending media on a dynamic RTP/UDP/IP
port.
[Figure: OpenLogicalChannel exchange — one endpoint opens a G.711 channel
(RTP/RTCP=192.168.0.1:5200/5201), the other acknowledges and opens its
own G.711 channel (RTP/RTCP=10.10.0.2:4312/4313); RTP/RTCP media (G.711)
then flows in both directions.]
[Figure: Third-party rerouting with H.245 — B sends an empty
TerminalCapabilitySet and closes the logical channels to end the audio
connection with A, then makes a new call to C: the endpoints exchange
TerminalCapabilitySet messages, perform master/slave negotiation, and
open logical channels. B later transfers the call back to A by closing
the audio connection with C and repeating the capability exchange,
master/slave determination, and channel-opening procedures with A.]
[Figure: Fast Connect — the called endpoint responds with CALL
PROCEEDING, ALERTING and CONNECT, the CONNECT carrying fastStart(G.711,
RTP/RTCP=10.10.0.2:4312/4313); RTP/RTCP media (G.711) then flows in both
directions.]
Call walk-through
Figure 9-16 illustrates a complete call walk-through including H.225.0
(RAS), H.225.0 (Q.931) and H.245, with the normal (“slow start”)
procedures with the gatekeepers operating in direct routed mode.
[Figure 9-16: Slow-start call walk-through, direct routed mode. Endpoint A
(192.168.0.1) sends ARQ (+1 408 555 1212) to Gatekeeper A, which locates
the called party through an LRQ/LCF exchange with Gatekeeper B and
answers with ACF (Q.931=10.10.0.2:1720). A sends SETUP (+1 408 555 1212)
to Endpoint B (10.10.0.2); B performs its own ARQ/ACF exchange with
Gatekeeper B and responds with CALL PROCEEDING, ALERTING and CONNECT
(H.245=10.10.0.2:8878). The endpoints then exchange TerminalCapabilitySet
messages (G.711, G.729), perform MasterSlaveDetermination, and open
logical channels (RTP/RTCP=192.168.0.1:5200/5201 and
RTP/RTCP=10.10.0.2:4312/4313), after which RTP/RTCP media (G.711) flows
in both directions.]
[Figure: Fast Connect call walk-through with H.245 tunneling. After the
same ARQ/LRQ/LCF/ACF admission exchange, Endpoint A's SETUP (+1 408 555
1212) carries h245Tunnelling and fastStart(G.711, G.729,
RTP/RTCP=192.168.0.1:5200/5201). Endpoint B responds with CALL
PROCEEDING, ALERTING and CONNECT carrying fastStart(G.711,
RTP/RTCP=10.10.0.2:4312/4313), and RTP/RTCP media (G.711) flows
immediately. H.245 messages (TerminalCapabilitySet,
MasterSlaveDetermination and their acknowledgements) are then tunneled
inside FACILITY messages.]
SIP
Architecture
The purpose of SIP is to initiate multimedia sessions. SIP includes user
location, user availability and capability negotiation, session establishment,
and session modification.
SIP allows a user to invite others to a session. For example, Alice would
invite Bob to an IP voice call by sending an INVITE message describing
the voice codec to be used and the IP address and port where the media
stream should be sent. The INVITE message is routed through the SIP
network (through proxies, redirect servers and other network elements)
using a location service, and is presented to Bob. Bob accepts the
invitation and provides his own IP address and port for the media stream.
Figure 9-18 illustrates a SIP session through two SIP Proxy servers.
[Figure 9-18: SIP session through two Proxy servers. Alice's INVITE
sip:bob@biloxi.com is forwarded through Proxy Server A and Proxy Server B
to Bob; 100 Trying, 180 Ringing and 200 OK travel back along the same
path; Alice confirms with ACK, and RTP/RTCP media (G.711) flows directly
between Alice and Bob. Either party ends the session with a BYE, answered
by 200 OK.]
[Figure: Canceling a pending INVITE — after receiving 180 Ringing, the
caller sends CANCEL; the CANCEL is answered with 200 OK, the INVITE
transaction ends with 487 Request Terminated, and the caller acknowledges
with ACK. An intermediary that times out can generate the CANCEL itself;
and if the CANCEL loses the race with a 200 OK answer, the session is
established anyway.]
to define their own. This allows for much greater flexibility in the number
of features that can be deployed.
The core SIP specification is RFC 3261. Many other RFCs are necessary
for a functional commercial SIP implementation. For example, the Session
Description Protocol (SDP, RFC 2327) is used for media session
description, and the Real-time Transport Protocol (RTP, RFC 3550) with
its Audio and Video profile (RFC 3551) is used for transporting media.
SIP is based on an HTTP-like request/response transaction model. The
originator of a request is a Client and the response is provided by a
Server. Because of the peer-to-peer nature of communication, each party
in a communication may issue requests. This means that the SIP protocol
entity acting on behalf of a user (called a User Agent) operates both as
a Client and as a Server, depending on which side initiates the request.
The protocol itself frequently uses the terms User Agent Client (UAC) and
User Agent Server (UAS). These terms are only useful from a protocol
point of view, as physical devices always include both and are simply
called User Agents (UA).
Requests from a SIP UAC invoke a particular Method. A Method will
generate at least one response. Methods and Responses are called SIP
Messages.
SIP Messages are encoded in text format, with a grammar specified in
augmented Backus-Naur Form (BNF). SIP is defined as a layered protocol,
although most people don't typically view it that way. The BNF-defined
syntax and encoding form the first layer. The second layer is the
transport layer. SIP typically uses TCP or UDP as the transport for SIP
messages, but other transports such as TLS and SCTP are also allowed.
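A minimal sketch of the text encoding, reusing the biloxi.com example from the figures (the Via address, branch, tag, and Call-ID values here are made up, and many mandatory details of RFC 3261 are omitted):

```python
def sip_invite(from_uri, to_uri, call_id, branch, body=""):
    """Assemble a minimal, illustrative SIP INVITE message."""
    lines = [
        f"INVITE {to_uri} SIP/2.0",
        f"Via: SIP/2.0/UDP 192.168.0.1;branch={branch}",
        f"From: <{from_uri}>;tag=1928301774",
        f"To: <{to_uri}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Content-Length: {len(body)}",
    ]
    return "\r\n".join(lines) + "\r\n\r\n" + body

msg = sip_invite("sip:alice@atlanta.com", "sip:bob@biloxi.com",
                 "a84b4c76e66710", "z9hG4bK776asdhds")
```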
Registrar
A Registrar allows a UA to make its presence known to the network. It
associates an address-of-record URI with one or more contact addresses
(normally IP addresses). This binding can be established manually or
through a dynamic mechanism called “Registration.” SIP itself does not
specify how network elements, such as proxies, use a location service to
locate users or services based on their URIs. There is an implication that
the registrar somehow stores the binding of address-of-record URI and
contact addresses in a “location server” and that the proxy somehow uses
that location service. In practice, the registrar and proxy are very often
collocated, or have access to the same database.
Redirect server
A Redirect server is a very simple server that responds to a session
invitation with an indication of the current location of the requested
address-of-record. It is up to the UA to then reestablish the session
directly with the new URI. The Redirect server is not involved in that
second session, as it does not stay in the signaling path. A redirect
server is transaction stateful, but only for that particular simple
transaction. Figure 9-21 illustrates a SIP session establishment through
a SIP Redirect server.
[Figure 9-21: Session establishment through a Redirect server. The INVITE
sip:bob@biloxi.com is answered with 302 Moved Temporarily (Contact:
sip:bob@10.10.0.2) and acknowledged with ACK; the UA then sends a new
INVITE directly to sip:bob@10.10.0.2, which answers with 100 Trying, 180
Ringing and 200 OK (Contact: sip:bob@10.10.0.2), completed with ACK.]
Proxy server
SIP proxies are elements that route SIP requests to user agent servers and
SIP responses to user agent clients. A proxy routes SIP messages and
therefore stays in the signaling path. Proxy servers have a lot of
flexibility in the amount of “state” information they keep, depending on
their function. One key point about proxy servers is that they are only
allowed to modify very specific parts of SIP messages, mainly related to
routing. SIP forbids proxies from modifying most of the message content
(for example, the SDP is not allowed to be modified); SIP was very much
written with end-to-end transparency in mind. Figure 9-22 illustrates a
SIP session establishment through a SIP Proxy server.
[Figure 9-22: Session establishment through a Proxy server. The proxy
forwards the INVITE sip:bob@biloxi.com to the callee and relays 100
Trying, 180 Ringing, 200 OK and the ACK, remaining in the signaling
path.]
[Figure: Session establishment through a Back-to-Back User Agent (B2BUA).
The B2BUA terminates the INVITE from UA A as a user agent server and
originates its own INVITE to UA B as a user agent client, relaying 100
Trying, 180 Ringing, 200 OK and ACK between the two call legs.]
SIP messages
SIP Messages are defined using a syntax inspired by HTTP/1.1; however,
SIP is not an extension of HTTP.
Requests contain a method name and a Request-URI (Uniform Resource
Identifier). The Request-URI indicates the user or service to which the
request is being addressed (see Table 9-1).
SIP Responses
1xx: Provisional. Most common are:
100 “Trying”
180 “Ringing”
183 “Call Progress”
2xx: Success. Most common is:
200 “OK”
3xx: Redirection. Most common is:
302 “Moved Temporarily”
4xx: Client Error. Most common are:
404 “User not found”
486 “User Busy”
5xx: Server Error
6xx: Global Failure
Since SIP does not mandate a reliable transport, reliability is handled at
the SIP level itself. INVITE requests are retransmitted using an
exponential back-off until there is a response, or until there is a
timeout. Final responses to INVITE are also repeated until acknowledged
by an ACK, or until there is a timeout.
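The exponential back-off can be sketched as follows; the 500 ms T1 default and the 64*T1 cutoff come from RFC 3261, not from the text above:

```python
def invite_retransmit_offsets(t1=0.5, cutoff_factor=64):
    """Seconds after the first transmission at which an unanswered
    INVITE is retransmitted over UDP: the interval starts at T1 and
    doubles until the transaction times out at cutoff_factor * T1."""
    deadline = cutoff_factor * t1
    offsets, wait, elapsed = [], t1, 0.0
    while elapsed + wait < deadline:
        elapsed += wait
        offsets.append(elapsed)
        wait *= 2
    return offsets

print(invite_retransmit_offsets())  # [0.5, 1.5, 3.5, 7.5, 15.5, 31.5]
```

The doubling interval keeps retransmissions responsive at first while avoiding a flood of repeats on a badly congested path.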
[Figure: Offer in the INVITE. Alice (192.168.0.1) sends INVITE
sip:bob@biloxi.com carrying Offer=SDP(A) (G.711, G.729;
RTP/RTCP=192.168.0.1:5200/5201); Bob (10.10.0.2) responds with 100
Trying, 180 Ringing and 200 OK carrying Answer=SDP(B) (G.711;
RTP/RTCP=10.10.0.2:4312/4313); after the ACK, RTP/RTCP media (G.711)
flows in both directions.]
[Figure: Offer in the 200 OK. When the INVITE carries no SDP, the offer
travels in the 200 OK and the answer in the ACK; here the offer lists
G.711 and G.729 and the answer selects G.711, after which RTP/RTCP media
(G.711) flows.]
Answer. However, this can only be done after the initial offer has been
answered.
The MMUSIC Working Group is currently defining a next-generation protocol
for session description and capability negotiation called SDPng. Unlike
the current SDP, it makes a clear distinction between capabilities and
session parameters, and instead of defining its own syntax, SDPng uses XML.
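The classic SDP bodies in the offer/answer exchanges above can be sketched as follows (payload type 0 is G.711 mu-law and 18 is G.729 in the RTP/AVP profile; the origin-line values are illustrative):

```python
def sdp_audio_body(ip, rtp_port, payload_types):
    """Build a minimal SDP audio description (illustrative only)."""
    fmt = " ".join(str(pt) for pt in payload_types)
    return "\r\n".join([
        "v=0",
        f"o=user 2890844526 2890844526 IN IP4 {ip}",
        "s=-",
        f"c=IN IP4 {ip}",         # where media should be sent
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP {fmt}",
    ]) + "\r\n"

offer = sdp_audio_body("192.168.0.1", 5200, [0, 18])  # G.711 and G.729
answer = sdp_audio_body("10.10.0.2", 4312, [0])       # G.711 only
```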
[Figure: Reliable provisional responses and early media. Alice's INVITE
carries Offer=SDP(A) (G.711, G.729; RTP/RTCP=192.168.0.1:5200/5201) and
Supported: 100rel; Bob responds with 183 Session Progress (Require:
100rel) carrying Answer=SDP(B) (G.711; RTP/RTCP=10.10.0.2:4312/4313).
Alice acknowledges with PRACK (answered by 200 OK), early media (G.711)
flows, and the call completes with 200 OK and ACK.]
Unless in-band early media is used, the callee may very well ensure that
the 180 Ringing with no SDP is delivered reliably, in order to ensure
that proper alerting treatment is provided to the caller (that is, to let
the caller know the “phone is ringing”). Another usage for PRACK is to
allow other preconditions to be met before setting up the media session;
for example, resources may need to be reserved (through RSVP, or a
circuit-based transport like ATM).
Another twist on early media is that it is sometimes necessary to modify
an Offer before the call is answered (for example, when a preanswer
announcement is provided). Since modifying an INVITE (through a
re-INVITE) is not allowed by SIP before the first INVITE is accepted with
a 200 OK, the UPDATE method was introduced. It allows a client to update
the parameters of a session (such as the Offer or Answer) without
affecting the state of the dialogue. In that sense it is like a
re-INVITE, but unlike a re-INVITE it can be sent before the initial
INVITE has been completed. For backward-compatibility reasons, the IETF
preferred introducing this new method over modifying the INVITE method
to allow this behavior.
REFER
The REFER method allows a UA to “refer” another UA to the resource
provided in the REFER request, and to be informed of the result of the
referral. It is a very generic and powerful primitive with a very large
number of usages.
For example, REFER can be used to implement a call transfer feature. If
Alice is in a call with Bob and decides Bob needs to talk to Carol, Alice
can tell her SIP UA to send a REFER to Bob's UA with Carol's contact
information. Bob's UA will attempt to call Carol, and will then report to
Alice's UA with a notification of whether it succeeded in reaching the
contact. Figure 9-27 details this scenario.
[Figure 9-27: Call transfer with REFER. Alice calls Bob (INVITE
sip:bob@example.com, 100 Trying, 180 Ringing, 200 OK, ACK), then places
the call on hold with a re-INVITE. Alice sends REFER with Refer-To:
sip:carol@example.com; Bob's UA responds 202 Accepted and calls Carol
(INVITE, 100 Trying, 180 Ringing, 200 OK, ACK). Bob's UA reports the
outcome with a NOTIFY, and Alice ends her call with Bob with a BYE.]
[Figure: Presence with SUBSCRIBE/NOTIFY. Alice subscribes to Bob's
presence status (SUBSCRIBE, answered by 200 OK). When Bob sets his status
to "unavailable", a NOTIFY with status=unavailable is sent to Alice, who
answers with 200 OK.]
SIP-T
SIP for Telephones (SIP-T) is a set of practices describing the use of SIP
for interoperability with SS7 PSTN gateways. The main idea behind SIP-T
is that SS7 PSTN gateways can use SIP to set up “calls” while maintaining
total PSTN transparency, by allowing ISUP to be encapsulated in SIP
messages as a MIME body. SIP-T is thus SIP and ISUP at the same time; it
is an “Inter Call Server” protocol. Tunneling ISUP inside SIP messages
has an advantage over backhauling ISUP independently, as it intrinsically
maintains the association between the SIP session and the PSTN call.
ISUP messages are tunneled in corresponding SIP messages when possible
(for example, an IAM in an INVITE), and in the INFO method, defined for
this purpose, when no other SIP message is appropriate. SIP-T is well
suited to carrier equipment migrating from a TDM SS7 architecture to a
SIP IP architecture. As such, SIP-T is largely limited to carriers, as
ISUP is not widely deployed in enterprises.
Figure 9-29 illustrates the concept of tunneling ISUP in SIP.
[Figure 9-29: Tunneling ISUP in SIP. Two PSTN gateways, each containing
an ISUP termination and a SIP UA, exchange SIP across the IP network,
while ISUP runs between each gateway and the PSTN on either side.]
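The encapsulation can be sketched as a multipart MIME body carrying both SDP and the ISUP octets (the boundary and contents below are placeholders; a real application/ISUP part also requires a version parameter, per RFC 3204):

```python
def sip_t_body(sdp, isup_octets_hex, boundary="unique-boundary-1"):
    """Sketch of a multipart SIP-T body: SDP plus encapsulated ISUP."""
    parts = [("application/sdp", sdp),
             ("application/ISUP", isup_octets_hex)]
    body = ""
    for ctype, content in parts:
        # Each part gets its own boundary line and Content-Type header.
        body += f"--{boundary}\r\nContent-Type: {ctype}\r\n\r\n{content}\r\n"
    return body + f"--{boundary}--\r\n"

body = sip_t_body("v=0\r\n", "01 00 49 ...")  # ISUP IAM octets elided
```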
H.323                                  SIP
H.225.0                                SIP
H.245                                  SDP
Call                                   Session
SETUP                                  INVITE
CONNECT                                200 OK
H.323 Annex M (QSIG & ISUP tunneling)  SIP-T (ISUP & QSIG tunneling)
Table 9-3: Parallels between the protocols
H.323 is a well-defined protocol suite with roots in both video conferencing
(H.320) and telecommunications (Q.931). These roots make it palatable to
software writers who are familiar with one or the other sector.
SIP is a simpler and more flexible protocol that has its roots in Internet
protocols and, as such, is much more palatable to software writers who
write for the Internet. SIP also makes integration with other software
applications easier than H.323 does. SIP is a better protocol for simple
applications, which can conceivably use a smaller subset of SIP than they
could of H.323.
H.323 includes well-defined procedures for video conferencing, while SIP
is only starting to include the protocol machinery necessary for
commercial video conferencing. H.245 is a much more feature-rich protocol
than SDP.
H.323 has a larger installed base today. SIP does not have a large
installed base yet, but most of the development dollars are being spent
on SIP, not H.323.
Today, the overwhelming majority of protocol development taking place at
the Standards level is on SIP and SIP-related protocols. H.323 is essentially
in a maintenance mode, and only minor additions are actively worked on.
The amount of work devoted to SIP is blossoming. The original IETF
MMUSIC Working Group was so swamped with SIP work that it spun off the
SIP Working Group, which in turn spun off the SIPPING Working Group
(SIP focuses on the core SIP protocol, while SIPPING focuses on noncore
aspects). There is now a SIMPLE Working Group for SIP Presence and
Instant Messaging, and an XCON Working Group for Centralized
Conferencing. Certain groups are also defining extensions or writing
protocols with SIP in mind, and not H.323. Some examples are firewall
and NAT traversal solutions in the MIDCOM, MMUSIC and NSIS Working
Groups, or telephone-number-related work in the ENUM and IPTEL Working
Groups. Other features, such as Presence and Instant Messaging as
defined by the SIMPLE Working Group, can have "equivalents" in H.323,
but are much more appropriate in a SIP environment. IPv6 transition
mechanisms are also more usable by SIP than by H.323.
At the development level, the situation is similar. R&D investments are
pouring into SIP, while H.323 development is more in the established
phase. SIP will certainly become the most pervasive protocol for most
people and most applications, but there will still be vast numbers of
H.323 systems out there, and H.323 may well remain dominant in certain
markets for the foreseeable future.
Megaco/H.248 overview
As shown in Figure 9-31, the architecture of Megaco/H.248 is based on the
media gateway control (MGC) layer, the media gateway (MG) layer, and
the Megaco/H.248 Protocol itself.
Service Data Point information needed to establish the call and will signal
the Media gateway that audible ringing has been established. When the far
end answers, the MGC will be notified and will pass this on to the Media
Gateway to establish the two way talk path.
Figure 9-32: Megaco/H.248 interaction between the Media Gateway and the Media Gateway Controller (lift handset: Notify "off hook"; dial digits: Notify "digits", then Create Connection; hang up: Notify "on hook", then Delete Connection)
Terminations
Terminations identify media flows or resources, implement signals,
generate events, have properties and maintain statistics. They can be
permanent (provisioned) or transient (ephemeral). All signals, events,
properties, and statistics are defined in packages that are associated with
the individual terminations.
Contexts
As shown in Figure 9-33, context (C) refers to associations between
collections of terminations (T), defines the communication between the
terminations, and acts as a mixing bridge. A context can contain more than
one termination and can be layered to support multimedia.
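As an illustration only, contexts and terminations can be modeled as simple data structures. This Python sketch uses hypothetical termination IDs and is not tied to any actual H.248 protocol stack:

```python
from dataclasses import dataclass, field

@dataclass
class Termination:
    """Simplified model of a Megaco/H.248 termination; IDs are hypothetical."""
    term_id: str
    permanent: bool                      # provisioned (TDM trunk) vs ephemeral (RTP flow)
    properties: dict = field(default_factory=dict)
    statistics: dict = field(default_factory=dict)

@dataclass
class Context:
    """A context associates terminations and acts as the mixing bridge."""
    context_id: int
    terminations: list = field(default_factory=list)

    def add(self, term: Termination) -> None:
        self.terminations.append(term)

# A two-party call: a provisioned TDM termination bridged to an ephemeral RTP one
ctx = Context(context_id=1)
ctx.add(Termination("ds0/1/1", permanent=True))
ctx.add(Termination("rtp/0001", permanent=False))
```

Bridging a PSTN leg to an IP leg thus amounts to placing one permanent and one transient termination in the same context.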
Sidebar: Terminations
All signals and events are assumed to occur at a specific termination and
they provide a mechanism for interacting with the remote entity
represented by that termination. Specific signals and events are defined
in packages. Examples of signals include tone generation, playing of
announcements, and the display of caller identity. Examples of events
include line off-hook, DTMF digit received, and fax tone detected.
Properties are defined within the Megaco/H.248 Protocol in two ways.
The term can be assigned to any piece of information that may be placed
into a descriptor in either a request or a response. The term can also
apply to package definition where properties act as state, configuration,
or other semistatic information regarding the termination to which the
package is attached.
Statistics can be accumulated at particular terminations and returned
from the Media Gateway (MG) to the Media Gateway Controller (MGC)
to provide information relevant to monitoring of the MG, network
performance or user activity. Statistics are also defined in packages.
Examples of statistics include, number of bytes sent and received while
in a context, duration of a termination in a context, packet loss rate and
other operational measurements.
References
ITU-T Recommendation H.225.0v4, Call Signalling Protocols and Media Stream Packetization for Packet Based Multimedia Communications Systems, International Telecommunication Union Telecommunication Standardization Sector (ITU-T), 2000
ITU-T Recommendation H.245v7, Control Protocol for Multimedia Communication, ITU-T, 2000
ITU-T Recommendation H.323v4, Packet Based Multimedia Communications Systems, ITU-T, 2000
RFC 2833, "RTP Payload for DTMF Digits, Telephony Tones and Telephony Signals," IETF, http://www.ietf.org/rfc/rfc2833.txt
RFC 3551, "RTP Profile for Audio and Video Conferences with Minimal Control," IETF, ftp://ftp.rfc-editor.org/in-notes/rfc3551.txt
RFC 3550, "RTP: A Transport Protocol for Real-Time Applications," IETF, ftp://ftp.rfc-editor.org/in-notes/rfc3550.txt
RFC 2326, "Real Time Streaming Protocol (RTSP)," IETF, ftp://ftp.rfc-editor.org/in-notes/rfc2326.txt
The tel URI for Telephone Calls, IETF, http://www.ietf.org/internet-drafts/draft-ietf-iptel-rfc2806bis-02.txt
RFC 3261, "SIP: Session Initiation Protocol," IETF, http://www.ietf.org/rfc/rfc3261.txt
RFC 3515, "The Session Initiation Protocol (SIP) Refer Method," IETF, http://www.ietf.org/rfc/rfc3515.txt
RFC 3264, "An Offer/Answer Model with the Session Description Protocol (SDP)," IETF, http://www.ietf.org/rfc/rfc3264.txt
RFC 3265, "Session Initiation Protocol (SIP) - Specific Event Notification," IETF, http://www.ietf.org/rfc/rfc3265.txt
RFC 2976, "The SIP INFO Method," IETF, http://www.ietf.org/rfc/rfc2976.txt
RFC 3262, "Reliability of Provisional Responses in the Session Initiation Protocol (SIP)," IETF, http://www.ietf.org/rfc/rfc3262.txt
RFC 3311, "The Session Initiation Protocol (SIP) UPDATE Method," IETF, http://www.ietf.org/rfc/rfc3311.txt
IETF MMUSIC Working Group, IETF, http://www.ietf.org/html.charters/mmusic-charter.html
IETF SIP Working Group, http://www.ietf.org/html.charters/sip-charter.html
Chapter 10
QoS Mechanisms
Ralph Santitoro
(Chapter roadmap figure: the real-time protocol stack viewed from the application perspective — audio/video/voice codecs, RTP/RTCP and RTSP media transport, SIP, H.323 and H.248/MGCP/NCS session and gateway control — over QoS, resiliency, and packet technologies such as MPLS, ATM, FR, Ethernet, cable/DOCSIS and SONET/TDM)
Concepts Covered
Network convergence
Comparison of voice application over TDM and IP packet networks
Convergence drives the need for QoS mechanisms in packet
networks
Implementing QoS mechanisms versus adding bandwidth
Overview of QoS mechanisms – classifier, meter, marker, dropper,
shaper, scheduler
DiffServ QoS architecture
TOS and DiffServ Field Definitions and their importance in
determining the IP QoS a packet should receive.
Introduction
Quality of Service (QoS1) is a broad term used to describe the treatment an
application's traffic receives from the network. Quality of Service involves
a broad range of technologies, architecture, and protocols. Network
operators achieve end-to-end QoS by ensuring that network elements apply
consistent treatment to traffic flows as they traverse the network.
Today, network traffic is highly diverse and each traffic type has unique
requirements in terms of bandwidth, delay, loss and availability. With the
explosive growth of the Internet, most network traffic today is IP-based.
Having a single end-to-end transport protocol is beneficial because
networking equipment becomes less complex to maintain, resulting in
lower operational costs. This benefit, however, is countered by the fact that
IP is a connectionless protocol, that is, IP packets may take different paths
as they traverse the network from source to destination. This can result in
variable and unpredictable delay in a best-effort IP network.
The IP protocol was originally designed to reliably get a packet to its
destination with less consideration to the amount of time it takes to get
there. IP networks must now support many different types of applications.
Real-time applications, such as voice and video, require low latency
(delay) and loss. Otherwise, the end-user quality may be significantly
affected or in some cases, the application simply does not function at all.
Consider a voice application. Voice applications originated on public
telephone networks using Time Division Multiplexing (TDM) technology,
which has a very deterministic behavior. On TDM networks, the voice
1. QoS typically deals with the measurement of parameters associated with a specific treatment. For a long
time, Quality of Service was also used to indicate the overall experience of the user or application.
However, because the objective assurance of meeting a specific parameter can sometimes result in
different levels of overall quality, the term Quality of Experience (QoE) is now used to indicate the
overall experience. For example, meeting a specific delay or jitter objective on a network might be
thought of as QoS, while QoE would deal with the user's perception of voice quality on a network
with that amount of delay and jitter.
With the converged network, different types of traffic are mixed, and each
application often has very different performance requirements. These
different traffic types often react unfavorably together. For example, a voice
application expects to experience essentially no packet loss and a minimal,
constant amount of packet delay. The voice application operates in a
steady-state fashion with voice channels (or packets) being transmitted in
fixed time intervals. The voice application receives this performance level
when it operates over a TDM network. Now take the voice application and
run it over a best-effort IP network as Voice over IP (VoIP). The best-effort
IP network has varying amounts of packet loss and delay caused by the
amount of network congestion at any given point in time. The best-effort IP
network provides almost exactly the opposite performance required by the
voice application. Therefore, QoS technologies play a crucial role to ensure
that diverse applications can be properly supported in a converged IP
network.
Classifier
Packets entering an interface are classified based on some filtering criteria
specified by local or network-wide QoS policy. This is done to properly
identify the application for subsequent marking with the appropriate class
of service identifier (CoS marking), after which, the packets are sent to a
rate enforcer (policer). Classifiers may filter based on OSI Layers 2-7
information, although routers most commonly support classification based
on OSI Layers 2-4 criteria.
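Classification at Layers 3-4 amounts to an ordered rule match on the packet's header fields. A minimal sketch; all rule values (addresses, ports, class names) are hypothetical:

```python
def classify(pkt: dict, rules: list) -> str:
    """Return the class of service for the first matching rule; None fields
    in a rule act as wildcards. All rule values here are illustrative."""
    fields = ("src_ip", "dst_ip", "proto", "src_port", "dst_port")
    for rule in rules:
        if all(rule[f] is None or rule[f] == pkt.get(f) for f in fields):
            return rule["cos"]
    return "best-effort"   # no match: default class

rules = [
    # Voice media from a known gateway address, UDP, to a fixed RTP port
    {"src_ip": "10.0.0.5", "dst_ip": None, "proto": "udp",
     "src_port": None, "dst_port": 5004, "cos": "EF"},
]
cos = classify({"src_ip": "10.0.0.5", "proto": "udp", "dst_port": 5004}, rules)
```

Combining several fields in one rule, as here, is exactly the multi-layer filtering the next paragraph recommends for reducing misclassification.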
The classifier is also useful for security purposes and by applying multiple
filters in combination, for example Layers 2, 3 and 4 filters, one can
improve the likelihood that the application classified is not an unauthorized
application or user attempting to get better QoS than permitted by the
network’s application or user QoS policies. Real-time applications initiated
from fixed function hosts, such as voice gateways, are the simplest to
classify because their IP addresses are static and rarely changed. Mobile or
easily movable devices, for example IP phones, require more complex
classification and authentication techniques since their addresses are
typically dynamically assigned.
Finally, many real-time applications use dynamically assigned port
numbers, so care must be taken to select the right combination of filters to
properly identify the application. It is important to properly classify real-
time packets so they are marked properly. This ensures that the routers in
the network provide them with the appropriate QoS treatment.
Marker
Once IP packets are classified, they are marked to indicate the class of
service to which they belong. This marking is done in the DiffServ/TOS
field in the IPv4 packet header and the Traffic Class Octet in the IPv6
header. This marking is important because it will indicate to routers how
the packets should be treated across the network. If a real-time packet is
marked improperly, a router may introduce higher delay or jitter, or may
drop the packet (packet loss), under different network operating conditions.
The original definition of this field was referred to as the Type of Service
(TOS) field. In 1999, the Internet Engineering Task Force (IETF) created a
new QoS architecture called IP Differentiated Services (DiffServ) and
redefined the TOS field, now called the DiffServ field. Since the TOS
field definition has changed several times over the years, there is much
confusion surrounding this field's definition, so a bit of history is warranted.
defined for local or experimental use. Figure 10-4 shows the new TOS
field.
Policer (Meter/Remarker/Dropper)
Once the incoming packets are classified, the policer uses a configured rate
and burst size to determine which packets are conformant and which are
not. Depending upon the implementation, a router may have a single rate
policer, a dual rate policer, or a combination of both. In a single rate
implementation, there is a committed information rate (CIR), whereby
CIR-conformant packets are assured delivery. In a dual rate implementation,
there is a CIR and an excess information rate (EIR) that determines the
amount of excess (CIR non-conformant) traffic allowed into the network.
The dual rate policer is used for traffic that varies in packet size or
arrival time, that is, bursty traffic. The single rate policer is often
used for applications that transmit at regular intervals, e.g. VoIP.
If incoming packets are not CIR conformant, they can be either remarked
to indicate higher drop precedence for a dual rate policer with a non-zero
EIR or dropped outright for a single rate policer or dual rate policer with
EIR set to zero. Since voice traffic is sent at a constant rate, a single rate
policer is sufficient. Video traffic can use either a single or dual rate policer,
since some video applications send packets at a constant rate while others
send packets at a variable rate.
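A single rate policer is commonly described as a token bucket. The following Python sketch illustrates the idea with EIR set to zero (non-conformant packets are dropped outright); the CIR and burst values are illustrative, and real implementations add a separate drop-precedence remarking stage:

```python
class SingleRatePolicer:
    """Token-bucket sketch of a single rate policer: packets within the CIR
    and burst allowance are conformant; all excess is dropped (EIR = 0).
    Rates and sizes are illustrative, not a vendor implementation."""

    def __init__(self, cir_bps: float, burst_bytes: int):
        self.rate = cir_bps / 8.0        # token refill rate in bytes/second
        self.depth = burst_bytes         # bucket depth = committed burst size
        self.tokens = float(burst_bytes)
        self.last = 0.0

    def police(self, size_bytes: int, now: float) -> str:
        # Refill tokens for elapsed time, capped at the bucket depth
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return "conformant"
        return "drop"

p = SingleRatePolicer(cir_bps=64000, burst_bytes=1500)
```

A dual rate policer would simply add a second bucket filled at the EIR, remarking (rather than dropping) packets that overflow the first bucket but fit in the second.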
Shaper
A shaper provides a smoothing effect on bursty traffic so it can be delivered
more efficiently over lower speed interfaces. The policer provides some
shaping (buffering), and some routers implement a secondary shaper,
especially on WAN interfaces. Shaping effectively buffers (delays)
packets before they are sent. Therefore, the delay added by shaping must be
accounted for in the end-to-end delay budget for the real-time application.
Shaping is generally not recommended for real-time applications.
Scheduler
The scheduler determines how packets are queued out an interface. There
are two classes of schedulers, namely, priority schedulers (priority
queuing) and weighted schedulers. Priority schedulers simply continue transmitting
packets until their queues are empty, resulting in the least amount of packet
delay. Weighted schedulers transmit packets based on an assigned weight.
For example, the weight could indicate a percentage of time a queue is
emptied before the next queue is serviced. There are many forms of
weighted schedulers, for example, Weighted Fair Queuing (WFQ) and
Weighted Round Robin (WRR), as well as variants of these, for example,
Deficit WRR (DWRR) and Class Based WFQ (CBWFQ).
Voice applications should use a priority scheduler. Video applications
could use a priority scheduler. However, some weighted schedulers may
also be able to support video applications. Schedulers have a direct impact
on packet delay and jitter.
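The weighted behavior can be sketched in a few lines: each round, a queue may send up to its weight in packets. Queue names and weights below are illustrative, and deficit variants such as DWRR account for packet bytes rather than packet counts:

```python
from collections import deque

def wrr(queues: dict, weights: dict, rounds: int) -> list:
    """Weighted round-robin sketch: per round, each queue may transmit up to
    its weight in packets. Queue names and weights are illustrative."""
    sent = []
    for _ in range(rounds):
        for name, q in queues.items():
            for _ in range(weights[name]):
                if q:                       # transmit only if the queue is non-empty
                    sent.append(q.popleft())
    return sent

queues = {"video": deque(["v1", "v2", "v3"]), "data": deque(["d1", "d2", "d3"])}
order = wrr(queues, {"video": 2, "data": 1}, rounds=3)
```

With weights 2:1 the video queue drains roughly twice as fast as the data queue, which is why a weighted scheduler can bound, but not minimize, the delay of a real-time class.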
Queue Management
Queue management determines how queued packets are handled as the
number of packets in a queue increases. The queue becomes fuller as more
packets from multiple sources traverse the same interface. There are two
basic forms of queue management, namely, tail drop and active queue
management (AQM). Tail drop simply drops arriving packets when the
buffer is full (or when a provisioned buffer depth is exceeded). AQM
randomly drops discard eligible packets when one or more buffer depths
are exceeded. Examples of AQM are random early discard (RED),
weighted RED (WRED) and multilevel RED (MRED).
Queue management has an effect on packet loss. AQM methods, in
general, are not recommended for real-time applications unless the
application can detect packet loss and readjust its transmission rate. For
example, some video applications can detect packet loss and switch to a
lower bit rate codec. When packet loss is no longer detected, the video
application can then switch back to the higher bit rate (higher quality)
codec. AQM must never be used with voice applications.
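The drop decision in RED, the canonical AQM method, is driven by the average queue depth; a sketch with illustrative thresholds:

```python
import random

def red_drop(avg_qlen: float, min_th: float, max_th: float, max_p: float) -> bool:
    """RED sketch: never drop below min_th, always drop at or above max_th,
    and drop with linearly increasing probability in between.
    Threshold values are illustrative."""
    if avg_qlen < min_th:
        return False
    if avg_qlen >= max_th:
        return True
    drop_p = max_p * (avg_qlen - min_th) / (max_th - min_th)
    return random.random() < drop_p
```

Weighted variants (WRED, MRED) simply run this curve with different thresholds per drop precedence, which is how they respect the markings applied by a dual rate policer.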
As you can see, there are many QoS mechanisms that can be used, and each
has a different impact on delay, jitter and loss. Each QoS mechanism must
be tailored to the real-time application. The following sections will cover
this in more detail.
  CS Code Point Name    CS Code Point Value (in binary)
  CS7                   '111000'
  CS6                   '110000'
  CS5                   '101000'
  CS4                   '100000'
  CS3                   '011000'
  CS2                   '010000'
  CS1                   '001000'
  CS0                   '000000'
Table 10-1: Class Selector PHB group DSCP values
decimal), which is quite different from the eight bit value created by the
router. While the Windows approach may be masked by the application, the
application developer must keep these differences in mind to ensure that
Windows-based real-time applications are properly marked with the correct
DSCP value.
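The numeric relationship that trips applications up is that the 6-bit DSCP occupies the upper six bits of the old TOS octet, so the DSCP value and the byte on the wire differ by a two-bit shift. A small Python sketch:

```python
def dscp_to_tos_byte(dscp: int) -> int:
    """The 6-bit DSCP occupies the upper six bits of the old TOS octet, so
    the byte seen on the wire is dscp << 2 (with the two ECN bits zero)."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value")
    return dscp << 2

# Expedited Forwarding (DSCP 46, '101110') appears on the wire as 184;
# Class Selector 5 (DSCP 40, '101000') appears as 160
ef_byte = dscp_to_tos_byte(46)
cs5_byte = dscp_to_tos_byte(40)
```

An application that writes the DSCP value where the full octet is expected (or vice versa) will therefore mark its packets into the wrong class.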
VLAN ID field
The VLAN ID is used to group certain types of traffic based on common
requirements. Packets marked with a particular VLAN can be classified
and the appropriate QoS mechanisms applied, for example, all packets in
VLAN 10 are IP telephony packets and are given DiffServ Expedited
Forwarding treatment.
Three bit priority field (802.1p): value from 0-7 representing user priority
Port Prioritization
When a VoIP gateway or IP PBX is installed in the network, it typically is
assigned a static IP address and connects to a specific port on the Ethernet
Layer 2 switch that rarely, if ever, gets changed. In this application, the L2
switch is configured to receive and transmit all incoming traffic from this
port, ahead of all other traffic entering other switch ports. If the next hop
device is a router, then it can classify and mark the voice packets with the
appropriate DSCP value for use by other routers across the network.
If the device attached to the Ethernet Layer 2 switch port is an IP phone,
then port prioritization is not recommended since IP phones may be
unplugged and moved and another user may attach to the port and receive
inappropriate or unauthorized QoS. For example, if a PC were connected to
a port configured to use port prioritization, then all of the PC's traffic
would be given high priority treatment in the switch. See Figure 10-6 for a
port prioritization example.
IP Address Prioritization
VoIP traffic can also be prioritized by its IP address. This approach is ideal
for devices with statically assigned IP addresses that rarely, if ever, change.
IP PBXs, VoIP gateways and call servers are VoIP devices that would have
their IP addresses statically assigned. A network administrator can
configure the routers to filter (classify) and prioritize all packets originating
from or destined to these IP addresses. See Figure 10-8 for an example of
IP address prioritization.
References
RFC 3246, "An Expedited Forwarding PHB," IETF, http://www.ietf.org/rfc/rfc3246.txt
RFC 3168, "The Addition of Explicit Congestion Notification (ECN) to IP," IETF, http://www.ietf.org/rfc/rfc3168.txt
RFC 2597, "Assured Forwarding PHB Group," IETF, http://www.ietf.org/rfc/rfc2597.txt
RFC 1349, "Type of Service in the Internet Protocol Suite," IETF, http://www.ietf.org/rfc/rfc1349.txt
RFC 2475, "An Architecture for Differentiated Services," IETF, http://www.ietf.org/rfc/rfc2475.txt
RFC 2474, "Definition of Differentiated Services Field (DS Field) in IPv4 and IPv6 Headers," IETF, http://www.ietf.org/rfc/rfc2474.txt
RFC 2686, "Multi-Class Extensions to Multilink PPP," IETF, http://www.ietf.org/rfc/rfc2686.txt
RFC 1990, "PPP Multilink Protocol (MP)," IETF, http://www.ietf.org/rfc/rfc1990.txt
ATM Forum Traffic Management Specification v4.1, ftp://ftp.atmforum.com/pub/approved-specs/af-tm-0121.000.pdf
IEEE 802.1Q, Virtual Bridged Local Area Networks, http://standards.ieee.org/getieee802/download/802.1Q-2003.pdf
Introduction to Quality of Service (QoS), http://www.nortelnetworks.com/products/02/bstk/switches/bps/collateral/56058.25_022403.pdf
RFC 3260, "New Terminology and Clarifications for DiffServ," IETF, http://www.ietf.org/rfc/rfc3260.txt
Section IV:
Packet Network Technologies
Earlier sections dealt with the requirements of real-time applications and
the ways that TDM and SONET handle these demands. You also learned
about the protocols that handle the transport, call setup, and flow
priority management needed for real-time packet networking. Section IV
covers various transport and access technologies that can be used to
provide differentiated service to your network traffic. It is important
in a converged network environment that the network operator understands
the media and protocols that underlie network operation and that help
transfer and maintain Quality of Service (QoS). The selection of
technologies and protocols should provide a seamless fabric from one end
of the network to the other.
The section begins by describing the incumbent technologies
Asynchronous Transfer Mode (ATM) and Frame Relay. These technologies
have a large installed base and are thus very important in providing real-
time services over converged networks. ATM was designed to provide a
high degree of QoS and inherently ensures packets arrive within their QoS
bounds. Frame relay, which came into popularity just prior to ATM, has
continued to extend its functionality through the MPLS/Frame Relay
Alliance (formerly known as the Frame Relay Forum) to aid in providing
real-time differentiated services. While the specifics of these technologies
fill volumes, we will offer a brief description of the basic characteristics of
ATM and Frame Relay, and will then focus our attention on their real-time
capabilities.
While ATM and FR technologies continue to be important, today's
networks are migrating to IP/Multiprotocol Label Switching (MPLS)
cores. Chapter 12 addresses MPLS, providing the basic concepts and
functions that help MPLS provide real-time service. The chapter introduces
MPLS concepts associated with Label Switch Path (LSP) creation, how these
paths are set up using MPLS signaling protocols, label stacking, and the
integration of DiffServ with the MPLS EXP (experimental) bits.
Chapter 13 is about Optical Ethernet (OE), a relatively new technology that
is growing in prominence. OE combines a Layer 2 protocol (Ethernet) with
Layer 1 protocols like SONET and Dense Wave Division Multiplexing
(DWDM). OE has the ability to ride over long-haul fiber, as well as a
number of added extensions that allow it to emulate ordinary Ethernet
LANs. It also incorporates some of the redundancy functionality associated
with optical networks, such as Resilient Packet Ring.
Chapter 11
ATM and Frame Relay
Sinchai Kamolphiwong
Shardul Joshi
Timothy Mendonca
Concepts Covered
Layered protocol
ATM interfaces
ATM architecture
ATM adaptation layer
QoS and services in ATM networks
Introduction
The ITU (International Telecommunication Union) has chosen ATM
(Asynchronous Transfer Mode) as the switching and multiplexing
technology for carrying all signals in a high speed network. To support
such goals, network architecture was moved away from circuit switching to
a range of packet switching systems. ATM uses fixed-length packets, called
cells, and is a virtual connection-oriented system. The cell length of 53
bytes (a five byte header plus a 48 byte information field) is an engineering
compromise to accommodate the conflicting requirements of a whole range of
traffic types, be it computer data or real-time traffic such as voice or video.
An ATM network consists of a set of ATM switches to multiplex/
demultiplex traffic streams. Each ATM switch is connected by point-to-
point ATM links. Each ATM link can accommodate several VPs (virtual
paths) and each VP may comprise a number of VCs (virtual channels). This
allows the aggregation of dissimilar types of traffic streams to be
accomplished in one ATM link.
Frame relay is a connection-oriented protocol that is a precursor to ATM.
The sections in this chapter that address frame relay will not go into the
basics of the technology, but will examine the evolution of the frame relay
protocol to ensure its viability in tomorrow's networks. More specifically,
the chapter will evaluate the use of the MPLS/Frame Relay Alliance
(formerly known as the Frame Relay Forum) Implementation Agreements:
FRF.11.1, the Voice over Frame Relay Agreement, and FRF.12, the Frame
Relay Fragmentation Agreement. The chapter evaluates how the two
specifications work in isolation and how efficiently they work together.
The main advantages of ATM are as follows:
ATM is a connection-oriented network, with each connection setup
associated with its QoS (quality of service) requirements, for
example, delay, loss and cell delay variation. With a high
guaranteed QoS offered by ATM, real-time communications, for
example, voice and video traffic, are suitable for carriage over ATM
networks, as shown in Figure 11-2.
With ATM, the incoming traffic channels are aggregated using
statistical multiplexing into one communication link. High system
utilization is easily obtained.
ATM provides multipriority services for multiple traffic types.
ATM offers the opportunity for all traffic sources to use resources
fairly, regardless of distance and the number of connections.
Figure 11-2: Dissimilar traffic streams (voice, data, video and telephone channels) multiplexed onto one link through an ATM switch
Layered protocol
To compare ATM to other protocols, OSI (Open Systems Interconnection)
is an appropriate model to use as a reference. The ATM layer is above the
physical layer (Layer 1), and provides transport functions required for the
switching and flow control of ATM cells. In this context, “transport” refers
to the use of ATM switching and multiplexing techniques at the data link
layer (Layer 2 of the OSI model), as shown in Figure 11-3, to
convey end-user traffic from source to destination within a network.
Figure 11-3: ATM relative to the seven-layer OSI reference model (application, presentation, session, transport, network, data link and physical layers)
popular operating systems. Recently, ATM has often been used as a link-
layer technology for both local and international regions of the Internet. A
special AAL (ATM adaptation layer) type, called AAL-5, has been
developed to allow TCP/IP to interface with ATM, as shown in Figure 11-
5. Fundamentally, the network layer sees ATM as a data link protocol. At
the IP-ATM interface, AAL-5 prepares ATM transport for IP datagrams.
Figure 11-5: TCP/IP over ATM (application layer protocols such as HTTP and FTP, over TCP or UDP, with AAL-5 adapting IP to the ATM and physical layers)
ATM interfaces
The ATM standard defines two main types of interface in ATM networks:
User-to-Network Interface (UNI)
Network-to-Network Interface (NNI)
User-to-network interface
ATM can be used within both a private network and a public network,
referred to as private User-to-network interface (UNI) and public UNI,
respectively, as shown in Figure 11-6. The public UNI is used to
interconnect an ATM user (or ATM terminal) with an ATM switch
deployed in a public service provider's network, while the private UNI is
used to interconnect an ATM user with an ATM switch that is managed by
a private organization, for example, a computer center. Both UNIs share an
ATM layer specification, but may utilize different physical media. The
primary distinction between these two UNIs is physical reach. There are
also some functional differences between the public and private UNIs, due
to the requirements associated with each interface. For example, the
administrative function of private UNI for all domains in an organization,
may follow a local management scheme, which may not seriously consider
interconnection issues.
Network-to-network interface
The Network-to-network (NNI) is the interface between ATM switches.
There are two types of NNI, public NNI and private NNI. The public NNI,
also known as the Broadband Intercarrier Interface (BICI), defines an
interconnect interface between public ATM switches. The private NNI
Figure 11-6: ATM over private and public UNI and NNI
ATM architecture
The B-ISDN protocol reference model has been defined in ITU-T
Recommendation I.121, as shown in Figure 11-7. The model contains both
horizontal and vertical structures. The horizontal layer consists of four
main layers:
Higher layer that specifies functions for applications.
ATM Adaptation Layer (AAL). The AAL is concerned with a number of
processes necessary to transform the user data stream into a format
suitable for ATM, such as the segmentation/reassembly of higher
layer Protocol Data Units (PDUs) into ATM cells. The AAL is
divided into two sublayers, the convergence sublayer (CS) and the
segmentation and reassembly sublayer (SAR).
ATM layer specifies the ATM structure and functions at the cell
level (more details of AAL and ATM layers will be given in the
next section).
Physical layer specifies media technology dependent issues.
Figure 11-7: B-ISDN protocol reference model (higher layer, AAL, ATM layer, and a physical layer comprising a transmission convergence sublayer and a physical media-dependent sublayer)
ATM layer
There are two different formats of ATM cell header, one for use at the
User-to-Network Interface (UNI), and the other one for use at the Network-
to-Network Interface (NNI), as shown in Figure 11-8.
Figure 11-8: ATM cell headers: (a) UNI (User-Network Interface) and (b) NNI (Network-Network Interface). GFC = generic flow control; VPI = virtual path identifier; VCI = virtual circuit identifier; PT = payload type; CLP = cell loss priority; HEC = header error control.
At the UNI, the header contains a four bit generic flow control (GFC) field,
a 24 bit label field containing virtual path identifier (VPI) and virtual
channel identifier (VCI) subfields (eight bits for the VPI and sixteen bits
for the VCI), a three bit payload type (PT) field, a one bit cell loss
priority (CLP) field, and an eight bit header error check (HEC) field. The
cell header for an NNI cell is identical to that for the UNI cell, except
that it lacks the GFC field; these four bits are used for an additional four
VPI bits in the NNI cell header, as shown in Figure 11-8 (b).
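These field positions can be made concrete by packing a header in code. A sketch assuming the standard 40-bit UNI layout (4-bit GFC, 8-bit VPI, 16-bit VCI, 3-bit PT, 1-bit CLP, 8-bit HEC); the field values used are illustrative:

```python
def pack_uni_header(gfc: int, vpi: int, vci: int, pt: int, clp: int) -> bytes:
    """Pack a 5-byte ATM UNI cell header. The HEC byte is left at zero here;
    a real implementation computes a CRC-8 over the first four bytes."""
    word = (gfc << 28) | (vpi << 20) | (vci << 4) | (pt << 1) | clp
    return word.to_bytes(4, "big") + b"\x00"   # 4 header bytes + HEC placeholder

hdr = pack_uni_header(gfc=0, vpi=7, vci=3, pt=0, clp=0)
```

The NNI variant would simply widen the VPI shift to 12 bits in place of the GFC.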
Figure 11-9: Virtual channels (VCs) multiplexed within virtual paths (VPs) on an ATM link
cell headers are examined by the switch to determine which output port should
be used to forward the cell. In the process, the switch translates the VPI and
VCI values of the original cell (received on the input port) into new
outgoing VPI and VCI values, which are used in turn by the next ATM
switch to send the cell toward its intended destination. The table used to
perform this translation is initialized during the establishment of the call.
An ATM switch may either be a VP switch, in which case it translates only
the VPI values contained in cell headers (as shown in Figure 11-10), or it
may be a VP/VC switch, in which case it translates the incoming VPI/VCI
pair into an outgoing VPI/VCI pair (as shown in Figure 11-10). In a VP
switch, the VCI values carried within a particular VP are left unchanged
even though the VPI value of the VP itself may change. Since VPI and VCI
values do not represent a unique end-to-end virtual connection, they can be
reused at different switches through the network. The VPI and VCI are local
labels between each switch pair for a given connection. This is important,
because the VPI and VCI fields are limited in length and would be quickly
exhausted if they were used simply as destination addresses.
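The translation described above amounts to a table lookup keyed on the incoming port and labels. A minimal Python sketch, with made-up table entries, illustrates the per-cell operation of a VP/VC switch:

```python
# Illustrative switching table, populated at call setup:
# (in_port, in_vpi, in_vci) -> (out_port, out_vpi, out_vci)
switch_table = {
    (1, 7, 3): (3, 5, 9),
    (1, 7, 4): (3, 5, 10),
}

def forward_cell(in_port: int, vpi: int, vci: int):
    """Translate the incoming labels into the outgoing port and labels."""
    entry = switch_table.get((in_port, vpi, vci))
    if entry is None:
        return None  # no connection established for these labels: drop the cell
    return entry
```

A VP switch would work the same way, but key the table on (port, VPI) only and leave the VCI untouched.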
[Figure: ATM VP switching — translation tables at each switch map VPI-in 7 to VPI-out 5, VPI-in 9 to VPI-out 7, and VPI-in 7 to VPI-out 3; the VCI values (3, 4) within each VP are carried unchanged]
[Figure: AAL structure — the convergence sublayer (CS), with service-specific (SSCS) and common part (CPCS) components, above the SAR sublayer and the ATM layer]
AAL structure
The user data (for example, user-PDU) from a higher layer is first
encapsulated in a common part convergence sublayer protocol data unit
(CS-PDU) in the convergence sublayer (CS), as shown in Figure 11-13. In
this sublayer, the CS-PDU header and trailer are added. Typically, the
CS-PDU size is much too large to fit into the payload of a single ATM cell.
[Figure 11-13: AAL processing — a user-PDU is encapsulated with a CS-PDU header and trailer in the convergence sublayer, segmented into 48-byte SAR-PDUs in the SAR sublayer, and carried as ATM cell payloads (cell header plus payload) in the ATM layer, above the physical layer]
AAL-0
AAL-0 is the null function (CS and SAR are each an empty function). Cells
from the higher layer are transferred, through the AAL-0 service interface,
directly to the ATM layer service.
AAL-1
AAL-1 has been standardized by both the ITU-T and ANSI since 1993, and
is incorporated in the ATM Forum specifications for circuit emulation
services (CES). AAL-1 supports constant bit rate (CBR) services with a
fixed timing relation between source and destination, that is, synchronous
traffic (for example, uncompressed voice). The AAL-1 service is offered by
most ATM equipment manufacturers. AAL-1 provides the following
services to the AAL user:
Transfer of service data units at a constant source bit rate and their
delivery at the same bit rate
Transfer of timing information between source and destination;
explicit timing indication is provided by inserting a timestamp in the
CS-PDU
Source clock recovery at the receiver by monitoring the buffer
filling (if needed)
Detection of lost or misinserted cells
The SAR sublayer defines a 48-byte protocol data unit (SAR-PDU). The
SAR-PDU carries 47 bytes of user data (of which one byte can be used
for a pointer), four bits for a sequence number (SN), and four bits for
sequence number protection (SNP). The SNP field carries a CRC
(cyclic redundancy check) value to detect errors in the SN field.
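A sketch of the AAL-1 SAR-PDU construction follows. Per ITU-T I.363.1, the SNP nibble is a 3-bit CRC (generator x³ + x + 1) over the SN field plus an even parity bit; this Python version is illustrative only:

```python
def crc3(nibble: int) -> int:
    """CRC-3 with generator x^3 + x + 1 (0b1011) over the 4-bit SN field."""
    reg = nibble << 3              # append three zero bits
    for bit in range(6, 2, -1):    # long division, MSB first
        if reg & (1 << bit):
            reg ^= 0b1011 << (bit - 3)
    return reg & 0b111

def aal1_sar_pdu(sn: int, payload: bytes) -> bytes:
    """Build a 48-byte AAL-1 SAR-PDU: 1-byte SN/SNP header + 47 bytes of data."""
    assert 0 <= sn < 16 and len(payload) == 47
    snp = crc3(sn) << 1
    # even parity bit over the seven preceding bits (SN + CRC)
    snp |= bin((sn << 4) | snp).count("1") & 1
    return bytes([(sn << 4) | snp]) + payload
```

The receiver recomputes the CRC and parity over the SN field to detect lost or misinserted cells via sequence-number errors.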
AAL-2
AAL-2 is designed for a service to handle variable bit rate (VBR). AAL-2
needs a different mechanism from AAL-1, since VBR traffic behavior
differs from CBR. Due to a variable bit rate of VBR traffic, cell interval
time is not a constant value. Maximum delay time must be defined in the
AAL-2 packetization mechanism.
As stated above, VBR traffic is not a constant bit rate. There is a chance
that the SAR-PDU may not be filled during a particular time interval. The
delay may be large if the SAR-PDU waits until the payload is completed.
So, the SAR-PDU contains the following information, six bits of SAR-
PDU Header, sixteen bits of SAR-PDU Trailer, and 362 bits of SAR-PDU
Payload. Based on a new service requirement in the AAL layer, the ATM
Forum and ITU -T Study Group (SG) 13 discusses a new AAL-2 to provide
efficient transport of low bit rate voice that allows a very small transfer
delay across the network.
AAL-3/4
Originally, AAL-3 and AAL-4 were separate. AAL-3 was intended to
support connection-oriented service over ATM, while AAL-4 supported
connectionless operation. However, the rest of their functions were similar,
so AAL-3 and AAL-4 were combined. As a result, AAL-3/4 supports both
connection-oriented and connectionless services. The SAR-PDU consists of
sixteen bits of header, sixteen bits of trailer, and 44 bytes of payload.
AAL-5
AAL-5 is a low-overhead AAL that is mainly used to transport IP
datagrams over ATM networks. With AAL-5, the CS-PDU has no header;
only a trailer is added. As with other AALs, the SAR function segments
the CS-PDU into blocks of 48 bytes, but no SAR overhead is added in
AAL-5.
The PAD field ensures that the CS-PDU is an integer multiple of 48 bytes.
The length field identifies the size of the actual CS-PDU payload, so that
the user data can be retrieved at the destination.
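The AAL-5 padding, length, and trailer handling can be sketched as follows. The standard library CRC-32 is used as a stand-in (the exact AAL-5 CRC-32 differs in bit-ordering details), and the one-byte UU and CPI trailer fields are set to zero:

```python
import zlib

def aal5_cs_pdu(user_pdu: bytes, uu: int = 0, cpi: int = 0) -> bytes:
    """Build an AAL-5 CS-PDU: payload + PAD + 8-byte trailer (UU, CPI, Length, CRC-32).
    The PAD makes the whole PDU an integer multiple of 48 bytes."""
    pad_len = (-(len(user_pdu) + 8)) % 48
    body = (user_pdu + bytes(pad_len) + bytes([uu, cpi])
            + len(user_pdu).to_bytes(2, "big"))
    # zlib.crc32 is a stand-in; the AAL-5 CRC-32 bit ordering differs in detail.
    return body + zlib.crc32(body).to_bytes(4, "big")

def segment(cs_pdu: bytes):
    """SAR: cut the CS-PDU into 48-byte cell payloads (no SAR overhead in AAL-5)."""
    return [cs_pdu[i:i + 48] for i in range(0, len(cs_pdu), 48)]
```

At the destination, the length field lets the receiver strip the PAD and trailer and recover exactly the original user data.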
From a service point of view, AAL-3/4 and AAL-5 offer the same layer
functionality. The main differences between these two service types are as
follows:
AAL-5 performs minimal error control compared with AAL-3/4
AAL-5 does not offer a multiplexing capability
AAL-5 is the most widely used AAL. Currently, AAL-5 offers a service for
the transport of IP traffic, and a frame relay service. AAL-5 is also being
considered for possible use to transport real-time multimedia information
over ATM.
Figure 11-14: Bandwidth usage of the four service classes in ATM networks
The VBR and CBR classes are higher-priority classes used to transport
real-time or high-quality audio and video data. The CBR and VBR services
guarantee the negotiated throughput (and therefore the necessary
bandwidth), the maximum cell delay, and the delay variance. Hence, a switch first
allocates link bandwidth to these classes. The remaining bandwidth, if any,
is given to ABR and UBR traffic. To enable the ABR service to function
effectively, a suitable closed-loop flow control mechanism must be
implemented. To that end, the ATM Forum proposed a rate-based flow
control scheme, which is a closed-loop control mechanism. With the rate-
based scheme, the network controls the transmission rate of the sources to
maximize the network performance. Thus, at times when resources are
plentiful, the network will allow a source to increase its rate of
transmission, but at other times, when the traffic is heavy, the source rate
will be throttled to a safe value. In contrast, the UBR service has no flow
control mechanism (it is open loop) and does not specify traffic-related
service guarantees, but may be subject to local policy in individual
switches and end systems. It is a “best effort” service.
The characteristics of the four ATM services are summarized in Table 11-
1.
Table 11-2: The traffic and QoS parameters for the ATM service classes
However, some QoS parameters are not negotiated during connection
setup, for example, Cell Error Ratio (CER), Severely Errored Cell Block
Ratio (SECBR), and Cell Misinsertion Rate (CMR).
The QoS parameter CDV should not be confused with the connection
traffic parameter CDVT. Even though CDV is a QoS parameter, it is not
used for negotiation. CDV is introduced by cell multiplexing when cells
from two or more connections are multiplexed (to the same output
channel). Cells of a given channel may be delayed while cells of other
channels are being inserted at the output of the multiplexer. In practice, the
upper bound of CDV is expressed by the CDVT. The value of CDVT is
chosen such that the output cell flow conforms to a bandwidth enforcement
mechanism.
A user of an ATM connection (a VCC or a VPC) is provided with one of a
number of QoS classes supported by the network. At connection
establishment, virtual channels and virtual paths are assigned. If any of
these steps cannot be completed, an alternative path selection process is
performed. If no alternative path is found, the call is rejected.
After the CAC process is complete (see Figure 11-15) and the requested
path has been created, data may be sent over this communication channel.
To enforce the user contract agreed upon during the CAC, the cell-level
process monitors and polices traffic according to the contract parameters.
A violating user may cause the established channel to be rejected. To that
end, the Generic Cell Rate Algorithm (GCRA) is used to classify each
arriving cell as either conforming or nonconforming. A conforming cell is
admitted, while a nonconforming cell may be discarded. A widely used
mechanism to shape the traffic is the 'leaky bucket,' as shown in
Figure 11-16.
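The GCRA can be stated compactly in its virtual-scheduling form: a cell arriving earlier than the theoretical arrival time minus the tolerance is nonconforming. A Python sketch (time units are arbitrary ticks):

```python
class GCRA:
    """Virtual-scheduling form of the Generic Cell Rate Algorithm.
    T is the cell emission interval (1/rate); tau is the tolerance (CDVT)."""
    def __init__(self, T: float, tau: float):
        self.T, self.tau = T, tau
        self.tat = 0.0          # theoretical arrival time of the next cell

    def conforms(self, t: float) -> bool:
        if t < self.tat - self.tau:
            return False        # cell arrived too early: nonconforming
        self.tat = max(t, self.tat) + self.T
        return True
```

With T = 10 and tau = 2, cells arriving every 10 ticks all conform, while a cell arriving more than 2 ticks ahead of schedule is flagged for discard or tagging.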
[Figure 11-15: Connection establishment flow — at the network level, a call setup request triggers path bandwidth and alternate path selection (with call rejection on failure); at the call level, the CAC (Call Admission Control) checks agreement with network load conditions and the load condition of trunks in the path, then performs the path generation, link allocation, and VC/VP assignment phases, leading to connection establishment; cell-level traffic policing follows]
[Figure 11-16: Leaky-bucket traffic shaping at the UNI — traffic sources pass through the AAL and ATM layers into a multiplexer; cells wait in a cell buffer for tokens from a token generator (continuous-state leaky bucket) before crossing the UNI to the ATM switch]
Figure 11-18: Voice and telephony multiplexing over ATM using AAL-2
FRF.11.1
As mentioned above, FRF.11.1 is the Voice over Frame Relay
Implementation Agreement. This specification deals with a number of
different concepts. This section concentrates on the primary pieces
concerning transport of voice within the frame relay payload, support of
various codecs, and the effective utilization of low-bandwidth connections.
Codecs
The supported codecs are G.729, G.728, G.723.1, G.726/G.727, and G.711.
Applications of fragmentation
There are three applications for fragmentation defined in the specification:
Locally across a frame relay UNI interface between the DTE-DCE
peers
Locally across a frame relay NNI interface between the DCE peers
End-to-end between two frame relay DTE peers
Frame relay User-to-Network Interface (UNI) and frame relay Network-to-
Network Interface (NNI) fragmentation share the same data fragment
format, so the header information for the first two types of fragmentation
matches. It consists of a two-octet header that precedes the frame relay
header information.
The header information for end-to-end fragmentation is somewhat more
complex. It contains the same information as the UNI and NNI
fragmentation schemes, but adds further information to the header. This
includes the network layer protocol ID (NLPID), which is assigned to
identify this fragmentation header format and the data content, and the
unnumbered information (UI) bits, which are associated with multiprotocol
encapsulation.
Note: Some customers may have a requirement to communicate via
frame relay to a web site that is not their own and does not support
fragmentation. In these cases the customer may not be willing to
implement fragmentation.
UNI fragmentation takes place locally between the customer edge device
and the provider edge device. Since fragmentation is local to the
interface, the network can take advantage of the higher internal trunk
speeds by transporting the complete frames, which is more efficient than
transporting a larger number of smaller fragments. UNI fragmentation is
also useful when there is a speed mismatch between the two DTEs at the
ends of a VC (virtual circuit).
[Figure 11-19 shows branch offices in Boston, St Louis, Seattle, and Houston connected over a frame relay network to a new campus site in Los Angeles; the branches use fractional T-1 access (512 Kb, 384 Kb, 256 Kb, and 128 Kb) with an SRG at each site, while Los Angeles has full T-1 (1.5 Mb) access and a Succession 1000E]
Figure 11-19: Typical frame relay network with centralized switching
Policing is a similar function performed at the ingress point of the frame
relay network. Its function is to determine whether traffic is within, or
exceeds, the specified contract. If traffic exceeds the contract, the network
has the option of either marking it Discard Eligible (DE) or discarding it.
If the customer premises equipment does not successfully shape the traffic,
it may be discarded, affecting the voice QoE.
The fourth best practice is referred to as pacing. Pacing spreads packets
evenly over the time interval. This addresses the problem where the high-
speed end of a circuit (in this case Los Angeles) has a large time interval
in which a number of data packets may be transmitted ahead of voice
packets. These data packets will then get ahead of voice packets at the
egress point of the network (for instance, at Houston), which will in turn
create jitter.
The fifth and sixth potential problems in frame relay have to do with the
number of PVCs you allocate in the network, and how they are architected.
Most customers who implement frame relay for VoIP and multimedia
applications have already implemented it for data. In data networks, it is
common to build the network with centralized switching (the fifth potential
problem), which has a single PVC (the sixth potential problem) from each
site into the main site for all applications (see Figure 11-19). This is done
to minimize the number of PVCs, which in turn reduces the cost
substantially. Note that in a data network, this has very little impact.
Figure 11-20: Frame relay network with full mesh, separate PVCs for voice
and T-1 access
When implementing real-time protocols, especially VoIP, it is
recommended that the frame relay network be built as a full mesh, as
depicted in Figure 11-20. A full mesh has two characteristics that make it a
better solution for VoIP applications. First, it provides more bandwidth in
the network, so that traffic from different locations does not have to
contend for bandwidth and priority with traffic from other sites. Second, it
eliminates the additional serialization and queuing delays of the centralized
switching approach.
The sixth potential problem in a frame relay network occurs when a single
DLCI is allocated for all applications. Since frame relay has no real QoS
mechanisms, this increases the chance of poor performance for real-time
protocols. It is generally recommended that at a minimum, VoIP be
allocated its own DLCI. Although this will cost more, it is the only way to
truly assure that voice will get its bandwidth with the quality that is
expected. (see Figure 11-20.)
The seventh potential problem has to do with the access speed of the
circuit. Customers will often get a frame relay circuit from the carrier with
a CIR of 256 Kb and assume that the CIR is the only thing that affects
performance. In a data network, this is largely true. However, in a real-time
network, the access speed also comes into play. For instance, a 256 Kb CIR
circuit will perform better on a full T-1 access link than on a fractional T-1
(FT-1) link. The quicker the egress point of the network can offload
packets, the less chance there is of jitter problems due to the additional
serialization delay introduced by a fractional T-1. In other words, on a full
T-1 the clock speed is 1.5 Mbps, whereas on an FT-1 with four DS0s
(256 K) allocated to the 256 K CIR, the clock speed is 256 K.
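The serialization-delay argument is easy to quantify. Assuming a worst-case 1500-byte data frame queued ahead of a voice packet, and a T-1 payload rate of 1.536 Mbps (24 DS0s) versus four DS0s at 256 kbps:

```python
def serialization_delay_ms(frame_bytes: int, link_bps: float) -> float:
    """Time to clock one frame onto the wire at the access link speed."""
    return frame_bytes * 8 / link_bps * 1000

# A 1500-byte data frame ahead of a voice packet in the egress queue:
full_t1 = serialization_delay_ms(1500, 1_536_000)   # full T-1 access
ft1 = serialization_delay_ms(1500, 256_000)         # four DS0s (256 K)
```

The full T-1 clocks the frame out in roughly 8 ms, while the fractional link takes roughly 47 ms, a difference that shows up directly as jitter for the voice packet behind it.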
References
K. Asatani and S. Nogami, “Standardization of Network Technologies and
Services,” IEEE Communications Magazine, pp.82-90, August 1995.
S.L. Sutherland and J. Burgin, “B-ISDN Internetworking,” IEEE
Communications Magazine, pp.60-63, August 1993.
F.A. Tobagi, “Fast Packet Switch Architectures for Broadband Integrated
Services Digital Networks,” Proceedings of the IEEE, V.78, N.1, pp.133-166,
January 1990.
ITU-T Recommendation I.150, B-ISDN Asynchronous Transfer Mode
Functional Characteristics, International Telecommunication Union
Telecommunication Standardization Sector (ITU-T), 1993.
R. Händel, M. N. Huber, and S. Schröder, ATM Networks: Concepts,
Protocols, Application, 2nd Edition, Addison-Wesley Pub. Co., Inc., 1994.
K. Sriram, “Methodologies for Bandwidth Allocation, Transmission
Scheduling, and Congestion Avoidance in Broadband ATM Networks,”
Computer Networks and ISDN Systems, V.26, pp.43-59, 1993.
K. Genda, N. Yamanaka, Y. Arai, and H. Kataoka, “A High-Speed-Retry
Banyan Switch Architecture for Giga-Bit-Rate BISDN Networks,”
Communication System, V.7, pp.223-229, 1994.
G. Gallassi, G. Rigolio, and L. Fratta, “Broadband Assignment in Prioritized
ATM Networks,” IEEE GLOBECOM, pp.852-856, 1990.
ITU-T Recommendation I.363.1, ITU-T, 1993.
ITU-T Recommendation I.363.2, B-ISDN ATM Adaptation Layer Type 2
Specification, ITU-T, 1997.
Jan Höller, “Voice and Telephony Networking over ATM,” Ericsson
Review, No.1, 1998.
M. S. Chambers, H. Kaur, T. G. Lyons, and B. P. Murphy, “Voice over
ATM,” Bell Labs Technical Journal, October-December, 1998, pp.176-190.
D.W. Petr et al., “Efficiency of AAL2 for Voice Transport: Simulation
Comparison with AAL1 and AAL5,” IEEE INFOCOM'99, pp.896-901.
ITU-T Recommendation G.729, Coding of Speech at 8 kbps using
Conjugate-Structure Algebraic-Code-Excited Linear Prediction (CS-
ACELP), ITU-T, November 1995.
H. Saito, “Performance Evaluation of AAL-2 Switch Networks,” IEICE
Trans on Communications, V.E82-B, No.9, September 1999, pp.1411-
1423.
G. Mercankosk, J. E. Siliquini, and Z. L. Budrikis, “Provision of Real-time
Services over ATM using AAL type 2,” Mobicom'99, pp.83-90.
Chapter 12
MPLS Networks
Ali Labed
[Chapter roadmap figure: position of MPLS in the real-time protocol stack — below the packet layer and alongside ATM (AAL1/2, AAL5), frame relay, Ethernet, and cable (DOCSIS) over SONET/TDM, beneath the media, real-time transport (RTP, RTCP, RTSP), and session/gateway control (SIP, H.323, H.248/MGCP/NCS) layers]
Concepts covered
MPLS architecture
Label switching
The label format
MPLS signaling protocols
LSP setup
Integrating MPLS and DiffServ
Label merging
Label stacking
Introduction
Multiprotocol Label Switching (MPLS) represents an overlay network that
operates on top of the existing IP network, but depends greatly on the
underlying IP network for its operation. In conventional IP forwarding,
each hop performs a forwarding table lookup on each packet to determine
the appropriate next hop for the destination address. A longest-prefix match
is applied to perform the lookup; hence, forwarding each packet is
expensive. By using a short, fixed-length label instead, a lookup can be
performed quickly using hardware acceleration.
An MPLS network (see Figure 12-2) is composed of two types of nodes,
Label Edge Routers (LERs) and Label Switch Routers (LSRs). Label Edge
Routers, located at the edge of the MPLS network, are responsible for
classifying traffic. A class of traffic with the same destination that needs to
be treated the same way is called a Forwarding Equivalency Class (FEC).
A label is added to the packet that identifies how the next router should
forward it. Each Label Switch Router (LSR) along the path uses the label
instead of the IP address to make forwarding decisions. Just like the ATM
Virtual Path Identifier / Virtual Channel Identifier (VPI/VCI), the label has
local significance. Each router maintains a table that maps incoming labels
to outgoing labels, to provide the packet with the correct label for the next
router. At the egress edge of the MPLS network, the LER removes the label
and forwards the packet using normal IP.
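The contrast between the two lookups can be sketched in Python: conventional forwarding must find the longest matching prefix, while an LSR does a single exact-match lookup on the label (table contents are hypothetical):

```python
import ipaddress

# Conventional IP forwarding: longest-prefix match over the routing table.
routes = {ipaddress.ip_network("10.0.0.0/8"): "if0",
          ipaddress.ip_network("10.1.0.0/16"): "if1"}

def ip_lookup(dst: str) -> str:
    addr = ipaddress.ip_address(dst)
    matches = [n for n in routes if addr in n]
    return routes[max(matches, key=lambda n: n.prefixlen)]

# MPLS forwarding: one exact-match lookup on the fixed-length label.
lfib = {100: (17, "if1"), 200: (42, "if0")}   # in-label -> (out-label, out-if)

def mpls_lookup(label: int):
    return lfib[label]
```

The exact-match table is what makes hardware acceleration straightforward: the label indexes directly into forwarding state, with no prefix comparison at all.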
[Figure 12-2: An MPLS network — label edge routers LER1 and LER2 at the edge, label switch routers LSR1 through LSR5 in the core, with label switched paths LSP1 and LSP2 crossing the network]
A label distribution protocol is used to distribute labels to MPLS nodes
along the path. Once the labels are in place at each router, the path through
the network is the same for all packets of the same Forwarding
Equivalency Class. Such a path is called a Label Switched Path (LSP).
The label
A label is an identifier used to identify a Forwarding Equivalency Class
(FEC). At the Ingress LER, an arriving packet is augmented (encoded) with
a label that identifies the FEC to which that packet belongs. Usually, the
[Figure: the MPLS control plane (routing protocols such as OSPF-TE and IS-IS-TE) and data plane (the shim header carried over Layer 2 — ATM, Frame Relay, Ethernet — above the optical layer)]
1. ATM and MPLS services can operate on the same ATM switch without interaction. This is sometimes
called Ships in the Night.
[Figure: LSP setup between LER1 and LER2 — RESV messages carry the assigned labels (Lr2, Ls2, Ls1) hop by hop back toward the ingress, with each LSR recording the label it received]
Label merging
Several LSPs can be merged at a common LSR. This takes place by
assigning the same outgoing label and outgoing port to packets arriving on
any of the merged LSPs. Beyond that LSR, the information about which of
the merged LSPs a packet belonged to is lost.
Label stacking
One of the important scalability features of MPLS is the ability to put an
LSP inside another LSP. It is implemented by allowing packets to carry
more than one label at a time. The labels form a stack, and the router
always forwards packets based on the outermost, or “top,” label. The stack
creates a set of nested tunnels. At the tunnel exit, the outermost label is
popped off the stack and the packet proceeds based on the next label,
without further lookup. The protocol permits nesting to an arbitrary depth.
This can be used to create a hierarchical network, where LSPs from lower
levels are encapsulated into an LSP for transit through the backbone at that
level. See Figure 12-6.
[Figure 12-6: Label stacking — LER3 pushes a second label (representing LSP_2) onto packets from LSP_1.1 and LSP_1.2; transit routers forward based on the outer (2nd) label, and the inner LSPs emerge toward LER4 and LER2]
References
RFC 3031, E. Rosen et al., “Multiprotocol Label Switching Architecture,”
IETF, January 2001, http://ietf.org/rfc/rfc3031.txt
RFC 3564, Le Faucheur, W. Lai, “Requirements for Support of
Differentiated Services-aware MPLS Traffic Engineering,” IETF
RFC 3032, “MPLS Label Stack Encoding,” IETF
RFC 2702, MPLS-TE, D. Awduche et al., “Requirements for Traffic
Engineering Over MPLS,” IETF, September 1999, http://www.ietf.org/rfc/
rfc2702.txt
RFC 3034, “Use of Label Switching on Frame Relay Networks
Specification,” IETF
RFC 3272, Internet-TE, D. Awduche et al., “Overview and Principles of
Internet Traffic Engineering,” IETF, May 2002, http://www.ietf.org/rfc/
rfc3272.txt
MPLS-RC, MPLS Resource Center, http://www.mplsrc.com/
Chapter 13
Optical Ethernet
Peter Kealy
[Chapter roadmap figure: position of Ethernet in the real-time protocol stack — alongside ATM, frame relay, and cable (DOCSIS) over SONET/TDM, beneath the media, real-time transport (RTP, RTCP, RTSP), and session/gateway control (SIP, H.323, H.248/MGCP/NCS) layers]
Concepts covered
Optical Ethernet basics
Resilient packet ring
Optical Ethernet services
Introduction
Ethernet is an easy-to-understand technology and is extremely cost-
effective. For these reasons, 98% of local area network (LAN) connections
are now Ethernet based.
Link-layer protection
Ethernet over Resilient Packet Ring takes advantage of SONET's link-layer
protection, providing failover of less than 50 ms in the event of a fiber cut
and maintained ring function when a node is disabled.
RPR technology
RPR packets are mapped into a synchronous payload envelope (SPE)
before being added to the SONET WAN ring. Individual packets are
delineated within the STS payload (SPE). TDM traffic, sharing the ring
with RPR traffic, is mapped into separate STS connections. Each RPR
within the OC-N SONET ring is provisionable in sizes of STS-1
(51 Mbps), STS-3c (155 Mbps), and STS-12c (622 Mbps).
Layer 1 SONET protection is disabled for the STS-Nc RPR, allowing both
directions around the ring to carry unique, independent traffic. This feature
doubles the effective bandwidth available for data traffic, allowing true line
rate such as 100 Mbps on an STS-1 RPR or 1 Gbps on an STS-12c RPR.
Layer 2 RPR protection techniques provide carrier-grade transport
protection of less than 50 ms. Spatial reuse transports data packets on the
ring in the shortest direction and discards packets at the destination node,
utilizing only the bandwidth in the segment between the source and
destination nodes. This feature increases the effective bandwidth available
on the ring, allowing service providers to oversubscribe the STS-Nc RPR
by several times.
Point-to-point, multicast, and broadcast applications are supported
efficiently through the connectionless architecture of all modules
provisioned with access to a common STS-Nc RPR. Data is transported
around the ring in a drop-and-continue technique, avoiding multiple copies
of the same packet consuming ring bandwidth. Automatic topology
discovery updates all nodes when a new node is added to the ring. Class of
Service queuing supports 802.1p prioritization, enabling service providers
to support time-sensitive data applications, such as Voice over IP and
streaming video, along with best-effort Internet or e-mail data
applications.
Transparent domains
802.1Q Q-tagged VLANs are used in customer Enterprise networks to
virtually separate LANs (4096 VLAN IDs are supported in the L2 header).
Service providers support multiple Enterprise customers who may use the
same VLAN identifiers. RPRs must be able to separate customer traffic
even if the customers utilize the same VLAN identifiers. Transparent
Domain identifiers (TDIs) provide this mapping to maintain privacy as
well as transparency. See Figure 13-4.
RPRs support two kinds of traffic management, Transparent and Mapped:
Transparent–All frames are mapped to a TDI regardless of VLAN
on ingress, and are switched transparently through the RPR
regardless of whether the received frames have 802.1Q VLAN tags
or not. End-customer VLAN tags are carried unaffected through the
network and will exit the network the same way. Because Packet
Edge performs packet encapsulation, each TD is capable of
supporting 4096 unique VLAN IDs.
Mapped–Frames are filtered based on VLAN on ingress and can be
retagged at RPR egress. When configured in “Mapped” mode, the
carrier has the flexibility to configure a mapping of end-customer
802.1Q VLAN tags. This provides options to filter on incoming
end-customer VLANs. That is, only certain packets may be mapped
to a specific VPN, depending on the incoming VLAN tag. As well,
Packet Edge provides the ability to re-tag a customer packet as it
exits the network. For example, a particular customer’s Q-tagged
traffic is received within its defined TD (for example, Q-TAG 6
within TD 451). Leaving the network, the service provider can
retag for the end-customer (for example, the packet now leaves with
Q-TAG 12 within TD 451).
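In mapped mode, the egress behavior reduces to a lookup keyed on the transparent domain and the customer Q-tag. A hypothetical sketch, using the example above (Q-TAG 6 within TD 451 retagged to Q-TAG 12):

```python
# Hypothetical mapped-mode table: (transparent domain, customer Q-tag) -> egress Q-tag
retag_map = {(451, 6): 12}

def egress_frame(td: int, qtag: int):
    """Retag a customer frame at RPR egress; unmapped tags are filtered out."""
    out = retag_map.get((td, qtag))
    return ("forward", out) if out is not None else ("filter", None)
```

Because the TD is part of the key, two customers using the same Q-tag never collide: the same tag value maps independently within each transparent domain.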
[Figure 13-4: Service provider transparent domains — Customer 1 and Customer 2 Ethernet LANs (each using VLANs 1-4096) are kept separate across RPR 1, RPR 2, and RPR 3 in the service provider network]
LAN extension
LAN extension provides seamless, end-to-end connectivity for an
Enterprise to extend their LAN between metro areas. The Enterprise's
traffic is collected in the metropolitan area network (MAN), and the
network's Optical Ethernet functionality is extended across a high-
performance long-haul network, which is completely transparent to the
Enterprise user. Even though the service now spans a much longer distance,
it still functions as a LAN for the Enterprise.
Chapter 14
Network Access: Wireless, DSL,
Cable
Rob Dalgleish
Dave Anderson
Robert Cirillo, Jr.
Peter Chapman
[Chapter roadmap figure: position of the access network in the real-time protocol stack — beneath the media, real-time transport (RTP, RTCP, RTSP), and session/gateway control (SIP, H.323, H.248/MGCP/NCS) layers, over SONET/TDM]
Concepts covered
Access mechanisms and their characteristics
An overview of cellular coding techniques
Nomadicity and mobility
Comparison of wireless access technologies
Introduction
The process of providing a high-bandwidth physical layer connection over
distance, through physical channels whose properties are time-varying
and nondeterministic, is an age-old problem.
Pushing the limits of maximum bandwidth while minimizing cost
inevitably results in compromised quality in the form of random loss of
information. Information loss can be detected and corrected by
retransmission at Layer 2 or above, but this trades one type of problem for
another. Error detection and retransmission cause random variation of
transmission delay and can degrade the performance of real-time services.
This chapter looks at the transmission characteristics of wireless and
wireline physical media and how the different ways of implementing them
can affect the end to end performance of real-time services. Table 14-1
provides a comparison of access technologies.
on one path for short periods, when the other is too weak to use. For added
reliability, some models use time varying delay characteristics.
1. Erlang B is a trunk sizing tool for voice switch to voice switch traffic.
Low bit-rate codecs also achieve their lower bit rates by using more
complex algorithms that make certain assumptions, such as those about the
media and the packet loss rate. Other codecs may not make those same
assumptions. When a user with a low bit-rate codec talks to a user with
another codec, additional distortion is introduced by each transcoding.
Having to deal with a lossy channel, such as an RF interface, impacts Real-
Time performance, both directly in terms of the packet loss rate and
indirectly in the techniques used to mitigate packet loss. Packet loss
concealment algorithms all require a buffer of packets to allow
interpolation to occur. If the RF interface is designed to accommodate re-
transmission on error, additional buffering will occur, at the cost of real-
time performance.
Because of the impact of delay on voice quality (see Chapter 3), this must
be accommodated in an overall network design to ensure a given level of
voice performance. Low bit-rate codecs require larger packet buffers and
more processing and so they introduce larger delays. Buffering an RF
channel to accommodate retransmission adds delay, and more complex
packet loss concealment algorithms mean more delay. As an example of
how significant these delays can become before accounting for the impact
of transcoding or long distance transmission, the accommodations made
for minimizing bandwidth and accommodating packet loss on a cellular
connection can easily result in a one way delay of 100 ms.
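A back-of-the-envelope delay budget shows how quickly the components add up; the individual values below are illustrative assumptions, not measurements:

```python
# Illustrative one-way delay budget for a cellular VoIP connection.
# Each figure is an assumed, rounded contribution in milliseconds.
budget_ms = {
    "codec frame + look-ahead": 25,                 # low bit-rate codec processing
    "packetization buffer": 20,
    "air-interface interleaving/retransmission": 40,
    "jitter buffer / loss concealment": 20,
}
one_way_ms = sum(budget_ms.values())  # exceeds 100 ms before any transport delay
```

Even with these modest assumptions, the budget is spent before transcoding or long-distance transmission adds anything, which is why end-to-end design must account for every stage.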
Early cellular networks were built around a central entity (the Base Station
Controller) that managed the fine details of radio resource management
and switching calls between cell sites. This approach is still effective for
voice networks with dedicated narrowband bearers, and is well suited to
the requirements of low, consistent delay and always-on connectivity
during the call. Separate frequencies are used to separate uplink and
downlink traffic and to allow continuous and simultaneous two-way
connections.
The need for ubiquitous coverage and always-on connectivity at reasonable
cost per user, requires that the coverage per cell be maximized, in order to
minimize the number of cell sites required. This encourages very
sophisticated signal processing at the base station radio and user terminal,
to stretch radio capacity and coverage as far as possible. A side effect of
this increased processing power is the increased ‘user perceived’ delay in
the system. The signal processor at each end must store each frame,
implement the encoding and error correction processes (interleaving,
convolutional coding and baseband equalization), and forward the signal.
First and second generation wireless networks have no direct connection
with a packet network. Voice communications with the Public Switched
Telephone Network (PSTN) are achieved by transcoding the wireless codec
used for the RF portion of the path to G.711, which is used in all of the
switching equipment in the PSTN. Normal PSTN switching is used to
deliver the voice data stream to the appropriate central office switch that
connects it to the wireline phone connected to that switch.
6. R1, R2 and ISUP are examples of trunk signaling protocols in use in the public switched telephone
network
7. Code Division Multiple Access
8. Wideband Code Division Multiple Access (third generation)
9. Unique code issued by the base station to separate the signal from each user.
In soft hand off, the mobile communicates with several base stations at a
time. The list of base stations changes as the mobile moves, creating a
seamless hand off. Regardless of the mechanism and technology, cellular
systems are well optimized to support hand off of real-time applications,
given their origins in circuit-switched voice.
[Figure: 2G circuit-switched architecture. The Base Station Controller
(vocoder) connects through the circuit core to the Mobile Switching
Centre, which carries voice to the PSTN/PBX; the Home Location Register
provides SS7 signaling, with a Home Agent and RADIUS AAA server for
packet data.]
transmitted to maintain very low error rates without the need for
retransmission.
Bandwidth allocation in the time domain is also a challenge for wireless
networks. For example, IS-2000 CDMA 1xRTT allocates a 9.6 kbps
fundamental channel to a user once the network determines that data will
be transmitted. The packet control function (PCF) in the base station
controller maintains a per-user buffer. Thresholds are monitored, and once
the active threshold is passed, 9.6 kbps of radio resources are assigned.
The user retains these resources and the air interface link while data is
being actively transmitted. Once the user's activity slows, PCF timers
determine that the user has gone idle and revoke the 9.6 kbps channel.
A supplemental channel (SCH) is applied when a user is active with a
fundamental channel (FCH) and the PCF buffer indicates that more
bandwidth is required. The resource manager can then assign supplemental
resources and radio link capacity to support up to 153.6 kbps. This
supplemental resource is managed on behalf of all subscribers in a sector,
and it is a shared resource. The result is that if many simultaneous users
require high bandwidth in addition to their fundamental channel, the
supplemental channel resource is shared. This means that the bandwidth
can be granted and then after some time, taken away from a particular user.
This centralized management of bandwidth resources is unique to public
cellular systems and attempts to trade off end user performance with
overall network efficiency. This mechanism works relatively well for
bursty services, such as browsing, e-mail and instant messaging, but does
not lend itself well to real-time services.
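The threshold-driven allocation described above can be sketched as a toy model. The buffer thresholds, idle timeout and trigger logic below are illustrative assumptions, not values from the IS-2000 specification; only the 9.6 and 153.6 kbps rates come from the text:

```python
# Toy model of 1xRTT channel assignment by the packet control function (PCF).
# Threshold values and the idle timeout are illustrative assumptions.

FCH_KBPS = 9.6        # fundamental channel rate
SCH_MAX_KBPS = 153.6  # maximum rate with supplemental channel resources

class PacketControlFunction:
    def __init__(self, active_threshold=512, sch_threshold=4096, idle_timeout=5.0):
        self.active_threshold = active_threshold  # bytes queued before the FCH is assigned
        self.sch_threshold = sch_threshold        # bytes queued before the SCH is requested
        self.idle_timeout = idle_timeout          # seconds of inactivity before revocation

    def assign_rate(self, buffered_bytes, idle_seconds, sch_available):
        """Return the rate (kbps) granted to a user given its buffer state."""
        if idle_seconds >= self.idle_timeout:
            return 0.0            # PCF timers decide the user has gone idle
        if buffered_bytes < self.active_threshold:
            return 0.0            # not enough queued data to justify a channel
        if buffered_bytes >= self.sch_threshold and sch_available:
            return SCH_MAX_KBPS   # burst on the shared supplemental channel
        return FCH_KBPS           # dedicated fundamental channel

pcf = PacketControlFunction()
print(pcf.assign_rate(8000, 0.0, True))    # 153.6: heavy backlog, SCH free
print(pcf.assign_rate(8000, 0.0, False))   # 9.6: SCH already shared out
print(pcf.assign_rate(8000, 6.0, True))    # 0.0: idle timer has fired
```

The third case shows the trade-off described in the text: a user can be granted the shared supplemental resource and later lose it, which suits bursty traffic but not real-time flows.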
3G networks
Full 3G cellular technologies, such as UMTS, are designed to address
these real-time application challenges. The UMTS 3GPP standard has air
interface services defined for voice and for data, but UMTS further divides
data into packet data services and circuit data services: more specifically,
PS (packet switched, unspecified delay) and CS (circuit switched, low
delay) services, each available in 64, 144 and 384 kbps allotments.
WWANs (Wireless Wide Area Networks) are evolving further by
implementing very fast downlink scheduling and time multiplexing of
high-rate data bursts, a few milliseconds long, to users when it is known
that their channel conditions are
optimal. This requires continuous monitoring and reporting of the channel
conditions seen by every user, to allow the central radio controller to
determine who should get the next burst, after also considering how much
data each user needs to send and their relative priority. Technologies such
as 1xEV-DO and 1xEV-DV19 are built on CDMA2000*, and High Speed
Downlink Packet Access (HSDPA) is a Release 5 addition to the WCDMA
standard. These techniques are capable of achieving up to ten times higher
throughput per user, and three times higher capacity per sector, under
ideal conditions in which all users require continuous throughput and the
users with the best channel conditions get the most data. If user priority
or fairness policies are enforced to divert capacity to users in suboptimal
conditions, or there are a very large number of active sessions with low
utilization, then this bearer type is less effective.
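The burst-scheduling decision described above is commonly realized as a proportional-fair scheduler. The sketch below is a simplified illustration; the exact metric, the priority weighting and the smoothing constant are assumptions, not the scheduler of any particular product:

```python
def pick_next_user(users, smoothing=0.1):
    """Choose the user for the next downlink burst by a proportional-fair
    metric: priority * feasible_rate / smoothed_average_throughput.

    `users` maps an id to a dict with 'rate' (kbps the current channel
    conditions would support), 'avg' (smoothed throughput received so far),
    'queued' (bytes waiting) and 'priority' (relative weight).
    """
    eligible = {u: s for u, s in users.items() if s['queued'] > 0}
    if not eligible:
        return None
    best = max(eligible, key=lambda u: eligible[u]['priority'] *
               eligible[u]['rate'] / max(eligible[u]['avg'], 1e-9))
    # Update the smoothed averages: the winner's rises, the others' decay,
    # so a user starved for a while sees its metric climb.
    for u, s in users.items():
        served = s['rate'] if u == best else 0.0
        s['avg'] = (1 - smoothing) * s['avg'] + smoothing * served
    return best

users = {
    'a': {'rate': 2400.0, 'avg': 300.0, 'queued': 9000, 'priority': 1.0},
    'b': {'rate':  600.0, 'avg': 300.0, 'queued': 9000, 'priority': 1.0},
}
print(pick_next_user(users))  # 'a': the better channel wins the first burst
```

Calling the scheduler repeatedly shows the behavior the text describes: the user with good channel conditions gets most bursts, but as its smoothed average rises, the fairness term eventually diverts capacity to the user in suboptimal conditions.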
OFDM (Orthogonal Frequency Division Multiplexing) has the potential to
further improve the spectral efficiency of WWANs. By avoiding the need
for tight power control, OFDM can be more suited to data traffic with long
sessions, but short activity bursts. This is proposed for 3GPP Release 6.
19. Single carrier (1x, 1.25 MHz) Radio Transmission Technology Evolution - Data Only and Data &
Voice
tuned to work with this, it is clear that mobile IP is not inherently designed
for real-time hand offs.
A perfect example of a nomadic wireless network is 802.11 wireless LAN.
Due to the power constraints imposed on license exempt spectrum, the
coverage area of a single access point is small, roughly 50 m. Given this, a
cluster of access points will cover an area, but contiguous outdoor coverage
of wireless LAN is impractical. For this reason, wireless LAN users
typically connect at each locale as needed, and when finished, pack up and
move to another and reconnect. Seamless hand off, either in idle or traffic
mode, may occur within a cluster of access points but not between clusters.
As a result, wireless LAN users are considered to be nomadic, reattaching,
authenticating, establishing a security association and potentially a billing
association, at each use. This makes real-time services
difficult for a few reasons. First, there is no consistent Layer 3 address at
which the user can be reached across all connections. Second, there is no
certainty of bandwidth or QoS at each attach point, since it is a best effort
network.
Conclusion
Traditional circuit switched cellular networks designed for voice are
optimized for real-time services, including mechanisms for hand off,
billing and user authentication. IP-centric wireless solutions, such as
wireless LAN, present the same challenges to real-time applications as
wired IP networks do, with the additional requirement to support
authentication and billing for nomadic users. 2.5G and 3G cellular
networks have designed solutions to offer the performance of 2G networks
for real-time services, while providing IP evolution in the access and core
network components, offering data services and taking advantage of IP
network economics.
xDSL technology
Name   Description         Downstream   Upstream     Max. Loop    Notes
                           Speed        Speed        Length
IDSL   ISDN DSL            144 kbps     144 kbps     18,000 ft.   Symmetric access. Two
                                                                  wire (one pair)
                                                                  operation. ISDN BRI.
HDSL   High bit rate DSL   1.544 Mbps   1.544 Mbps   15,000 ft.   Symmetric access.
                                                                  Requires four wire
                                                                  (two pair) operation.
[Figure: ADSL reference architecture. In the central office, the local
telephone switch (access to the PSTN) and the DSLAM (DSL Access
Multiplexer, access to the Internet) connect through a POTS splitter and
the Main Distribution Frame to a single copper loop; at the customer
premises, a POTS splitter separates telephony from the house wiring to
the ADSL modem.]
ADSL modem
The ADSL modem is the customer premises equipment used to terminate
the DSL connection. Physically, it features an RJ-11 socket for connection
to the local loop, and an RJ-45 socket for the 10Base-T connection to the
subscriber’s personal computer (PC). For subscriber convenience, it may
also feature a second RJ-11 socket for POTS connection to a standard
telephone set. Since ADSL uses ATM to carry traffic at Layer 2, the
modem is also responsible for encapsulating the IP packets into ATM
AAL5 cells. Depending on the type of connection, the modem can be
configured to use an RFC 1483 (multiprotocol encapsulation over AAL5)
bridged or routed connection, or the more versatile RFC 2364/RFC 2516
combination (PPP over AAL5/PPP over Ethernet) to carry traffic into the core
network. As a CPE (customer premises equipment) device, a network
compatible ADSL modem can either be purchased by the subscriber or
obtained directly from the service provider as a purchase or rental.
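This encapsulation chain has a measurable cost: the IP packet picks up framing overhead and is then padded out to whole 48-byte ATM cell payloads. A rough calculation follows; the 32-byte figure assumes PPPoE over bridged Ethernet over AAL5 with LLC/SNAP, and would differ for other encapsulation variants:

```python
import math

AAL5_TRAILER = 8   # CPCS-PDU trailer (length, CRC-32, etc.)
CELL_PAYLOAD = 48
CELL_SIZE = 53     # 5-byte ATM cell header + 48-byte payload

def atm_wire_bytes(ip_len, encap_overhead):
    """Bytes on the ATM wire for one IP packet of `ip_len` bytes, given
    `encap_overhead` bytes of framing added before the AAL5 trailer."""
    pdu = ip_len + encap_overhead + AAL5_TRAILER
    cells = math.ceil(pdu / CELL_PAYLOAD)   # padded up to whole cells
    return cells * CELL_SIZE

# Illustrative per-packet overhead for PPPoE over bridged Ethernet over AAL5:
# 2 (PPP) + 6 (PPPoE) + 14 (Ethernet) + 10 (RFC 1483 LLC/SNAP bridged) = 32.
PPPOE_OVERHEAD = 32

for ip_len in (40, 576, 1500):
    wire = atm_wire_bytes(ip_len, PPPOE_OVERHEAD)
    print(ip_len, wire, f"{100 * (wire - ip_len) / wire:.0f}% overhead")
# 40 -> 106 bytes (62% overhead), 576 -> 689 (16%), 1500 -> 1749 (14%)
```

The small-packet case matters for real-time traffic: a 40-byte packet occupies two full cells, so more than half the wire bytes are overhead.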
DSLAM
The DSLAM (DSL Access Multiplexer) provides backhaul services for
packet, cell and circuit-based applications, through concentration of the
DSL lines onto 10Base-T,
100Base-T, T1/E1, T3/E3 or ATM outputs. For ADSL, the connection is
typically ATM cell-based from the subscriber modem through the network
core. DSLAMs are found on the edge of the provider network, located in
either the central office or a remote cabinet. Physically, DSLAM sizes
range from large, multicard shelves (five to seventeen U spaces) to small
1U boxes found in remote access cabinets.
Subscriber aggregation
With ATM being a point-to-point, connection-oriented link, there exists the
requirement to terminate the subscriber endpoint and reassemble the ATM
cells into IP data packets. While it is possible for an ATM-enabled core
router to terminate these ATM VPI/VCI connections, as a practical matter,
this function is usually performed by a dedicated subscriber aggregation
platform.
[Figure: subscriber aggregation. ADSL subscribers connect through the
copper loop to a DSLAM; per-subscriber ATM VCCs are carried over
ATM/Frame links (DS-1/DS-3/OC-3) to a DSL aggregation platform, which
terminates the PPPoE/ATM sessions and routes IP traffic onward to ISPs,
the Internet, or a private network, with a RADIUS server for
authentication.]
Cable network
The cable network was originally designed for one way communication of
broadcast information, primarily television signals for the purpose of
information and entertainment. In this context, broadcast traffic means that
the same material is sent to every user, usually residences connected to the
cable.
The technology was designed to provide a large number of television
channels in one direction, from the cable operator to the residence. In
addition to carrying television signals, it usually also carries sound signals.
With the advent of the Internet, and to enable cable service providers to
offer services above and beyond broadcast mode entertainment, the
requirements on the cable technology and infrastructure significantly
changed. The requirement is now to provide all of the foreseeable two way
communication services to and from residences and in some cases,
businesses.
21. Founded in 1988 by members of the cable television industry, Cable Television Laboratories, Inc.
(CableLabs*) is a nonprofit research and development consortium dedicated to pursuing technical
advancements that serve its members' business objectives.
Rights management
In order to offer different selections of program channels to different users,
and to be able to offer users individual programs on a billable basis, known
as pay-per-view, it is necessary to protect the program material and to
provide access to it for individual users. In addition, because it is not
always possible to control physical access to the cable, it is necessary to
provide some form of control of access. This is done by applying high-pass
filters to the tapped circuits of the primary distribution circuit with analog
scrambling techniques. With the advent of digital transmission, encryption
is enabled with digital keys.
[Figure: cable plant. The head end, containing the CMTS, feeds a
bi-directional hybrid fibre/coax network; bridger amplifiers, line
extenders and taps distribute the signal to residences.]
Downstream technology
Typically, a few hundred users can share a 6 MHz downstream channel and
one or more upstream channels. The downstream digital modulation
system is the same as that used for digital television, using 64 or 256 state
QAM (Quadrature Amplitude Modulation), and it can provide up to 40
Mbps. Information for each user is separated by time division
multiplexing, usually referred to as TDM. Because the signals for all users
are generated by the CMTS (Cable Modem Termination System), they are
all synchronized in relation to each other, so separation of each user’s
traffic is reliable.
Upstream technology
DOCSIS 1.0 and 1.1. In DOCSIS versions 1.0 and 1.1, the upstream
channels can be up to 3.2 MHz wide and can deliver up to 5.12 Mbps per
channel (DOCSIS 1.0). Because a number of users share the upstream RF
channel, a media access control (MAC) layer coordinates shared access to
the upstream bandwidth.
PacketCable
The means of using the cable infrastructure to carry traditional voice traffic
has been facilitated by means of PacketCable. PacketCable 1.0, introduced
in 1999, provided baseline voice capabilities. It did not provide certain
essential services, primarily 911 service and operation during a loss of
consumer power, and was aimed at the residential second line market.
[Figure: PacketCable network. DOCSIS HFC access networks at each end
of a managed IP network containing routers and OSS servers, with a
gateway to the PSTN.]
Cable modem
The cable modem converts the digital signal from the user equipment, into
a signal suitable for sending in both directions on the hybrid fiber coax
network. This is multiplexed onto the HFC network, together with the
broadcast signals.
Access network
This is the network that connects the user to the cable service provider
(MSO). It can be regarded as consisting of three primary components:
Hybrid fiber/coax (HFC) access network–PacketCable-based
services are carried over the hybrid fiber/coax (HFC) access
Media server
The Media Gateway Controller is the logical signaling management
component used to control PSTN media gateways.
Network announcement player–This device holds all the verbal
messages that need to be generated for the telephone and other
voice related services.
PSTN gateway
PacketCable allows MTAs to interoperate with the current PSTN through
the use of PSTN gateways. In order to enable operators to minimize cost
and optimize their PSTN interconnection arrangements, the PSTN gateway
is decomposed into three functional components, a controller (media
gateway controller), one gateway for the bearer path, and a second gateway
for the signaling path.
Media gateway controller (MGC)–The MGC maintains the call
state and controls the overall behavior of the PSTN gateway. It
terminates and generates the call signaling from and to the
PacketCable side of the network.
Media gateway (MG)–The MG terminates the bearer paths and
transcodes media between the PSTN and IP network.
Signaling gateway (SG)–The SG provides a signaling
interconnection function between the PSTN SS7 signaling network
and the IP network.
QoS
Quality of Service has already been mentioned. The Internet protocol suite
specifies a number of delivery mechanisms; the most common are TCP
(Transmission Control Protocol) and UDP (User Datagram Protocol).
These mechanisms do not, in themselves, provide either QoS or guaranteed
delivery, although TCP provides for acknowledgement and retransmission
of lost packets.
PacketCable carried over DOCSIS 1.1 provides means for ensuring that
packets are delivered in such a way as to guarantee delivery and sequencing.
The CMTS has ultimate control of the QoS mechanisms. Clients make
requests to the CMTS, but it is only the CMTS, or a policy server
controlling the allocation made by the CMTS that has the authority to grant
or deny those requests.
PacketCable security
The standard defines methods for protection of information generated by
the user, protection of confidential information, such as passwords, and for
protection of copyrighted material, made available by the service provider.
The security architecture provides facilities to easily detect and identify
attempted breaches.
Best-effort
A standard contention-based resource management strategy, in which
transmit opportunities are granted in the order in which requests are
received by the CMTS, as coordinated by the CMTS scheduler. This
scheduling type may be supplemented with QoS characteristics in which,
for example, maximum rate limits are applied to a particular service flow.
Non–real-time polling
A reservation-based resource management strategy in which the cable
modem is polled at fixed time intervals. The interval is sufficiently large
to favor utilization efficiency at the cost of real-time performance. When
queued traffic is identified on a
particular service flow, a transmission opportunity, or grant, for that service
flow, is provided by the scheduler.
Real-time polling
Real-time polling is analogous to the non–real-time polling scheduling
type, except that the fixed polling interval is typically very short (<500 ms).
Polling scheduling types are most suitable for variable bit rate traffic that
has inflexible delay and throughput requirements. Video streaming is
typical of this type of traffic.
Unsolicited grant
A reservation-based resource management strategy in which a fixed-size
grant is provided to a particular service flow at approximately fixed
intervals. This scheduling type is most suitable for constant bit rate traffic
and eliminates much of the protocol overhead associated with the polling
types. This is suitable for voice traffic.
Downstream Service Flows are defined using the same set of QoS
parameters that are associated with the best-effort scheduling type on the
upstream.
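The fit between unsolicited grants and constant-bit-rate voice can be shown with a back-of-the-envelope sizing for G.711 with 20 ms packetization. The RTP/UDP/IP header sizes are standard; the Layer 2 overhead figure is an assumption, since it varies with DOCSIS configuration:

```python
G711_RATE_BPS = 64_000
PACKET_INTERVAL_S = 0.020       # 20 ms packetization
RTP_UDP_IP = 12 + 8 + 20        # per-packet RTP + UDP + IPv4 header bytes
LAYER2_OVERHEAD = 18 + 6        # Ethernet framing + DOCSIS MAC header (assumed)

payload = round(G711_RATE_BPS * PACKET_INTERVAL_S / 8)  # voice bytes per packet
grant_size = payload + RTP_UDP_IP + LAYER2_OVERHEAD     # bytes per UGS grant
grants_per_second = round(1 / PACKET_INTERVAL_S)        # one grant per voice packet
upstream_bps = grant_size * 8 * grants_per_second

print(payload, grant_size, upstream_bps)  # 160 224 89600
```

Because every packet is the same size and arrives on a fixed clock, the CMTS can issue the 224-byte grant every 20 ms without any polling or contention, which is exactly the protocol overhead the unsolicited grant type eliminates.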
References
Cable Television Laboratories, Inc., http://www.cablelabs.org
Chapter 15
The Future Internet Protocol: IPv6
Elwyn B. Davies
[Figure: chapter perspective. Real-time voice, audio and video
applications run over RTP/RTCP with session and gateway control (SIP,
H.323, RTSP, H.248/MGCP/NCS) and codecs; QoS and resiliency span a
transport stack of MPLS, AAL1/2 and AAL5 over ATM, Frame Relay,
Ethernet and Packet Cable DOCSIS, on SONET/TDM.]
Concepts covered
IPv6 Basics
QoS in IPv6
IPSec in IPv6
Routing in IPv6
Network Control in IPv6
Application Programming Interfaces in IPv6
IPv6 Transition Strategies
Tunneling
Interworking between IPv4 and IPv6
2^96 (roughly 7.9 x 10^28) times the size of the IPv4 address space (2^32)
In other words, the total number of available IPv6 addresses is
340,282,366,920,938,463,463,374,607,431,768,211,456
Pessimistic estimate:
1,564 addresses per square meter of the Earth’s surface.
Optimistic estimate:
3,911,873,538,269,506,102 addresses per square meter
Maybe enough addresses to assign one to each grain of sand
on the planet!
Even the US Department of Defense should be happy with this!
(The networked battlefield initiative needs many addresses)
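The headline figures are easy to check with integer arithmetic. (The pessimistic and optimistic per-square-meter estimates above additionally assume particular address-allocation efficiencies, so they are not reproduced by naive division.)

```python
ipv6_total = 2 ** 128
ipv4_total = 2 ** 32

print(ipv6_total)                # 340282366920938463463374607431768211456
print(ipv6_total // ipv4_total)  # 2**96, roughly 7.9e28 times the IPv4 space

EARTH_SURFACE_M2 = 5.1e14        # approximate surface area, land and sea
print(ipv6_total / EARTH_SURFACE_M2)  # ~6.7e23 addresses per square meter
```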
Routing and filtering of the 128 bit addresses of IPv6 requires significantly
greater processing than the smaller IPv4 addresses, and the full header
structure of IPv6 is more complex than that of IPv4. On the other hand, no
Layer 3 checksum calculation is required and QoS routing may be quicker
due to the flow label.
IPv6 has some improvements that should make basic management,
especially in Enterprise networks, considerably easier. But this is offset by
the need to manage networks that can transmit both IPv4 and IPv6, as well
as the transition and migration mechanisms that will be used during the
transition period.
During the transition from IPv4 to IPv6, significant amounts of traffic will
either be traveling through tunnels or will be using a protocol translator. In
both cases, the extra processing may add significantly to the network
latency and jitter experienced by packets.
The remainder of this chapter describes the basic points of IPv6 technology
and transition mechanisms sufficient to show how they can affect the
performance of a real-time network. As already mentioned, IPv6 affects a
great deal more than just the network layer. Refer to Appendix D for more
detailed information on IPv6.
[Figure: IPv4 and IPv6 header layouts compared, showing fields modified
for IPv6, fields deleted from IPv4, and extension headers preceding the
data.]
IPv4                                         IPv6

Four octet (32 bit) addresses                Sixteen octet (128 bit) addresses

QoS specification in Service Type octet:     QoS specification in Flow Label and
  Originally: Type of Service (five            Traffic Class octet:
  bits), Priority (three bits)                 Originally: Traffic Class (never used)
  Now: DiffServ Code Point (six bits),         Now (as IPv4): DiffServ Code Point
  Explicit Congestion Notification               (six bits), Explicit Congestion
  (two bits)                                     Notification (two bits)

Protocol field identifies type of            Next Header field in main header and
  Packet Data Unit (PDU)                       each extension identifies next
                                               component
No IP Layer Checksum
The checksum found in IPv4 headers has not been carried over to IPv6.
The logic behind this is that almost all the risk of corruption in packets
comes from the transmission between nodes. All Layer 2 technologies in
common use provide a checksum that can detect corruption during
transmission on each separate link; IP transport protocols provide a
checksum that can detect corruption of the payload end-to-end. The
considerable extra processing load needed to calculate the checksum
initially—verify it at each node, modify it as the hop count is decremented
and recalculate it at each waypoint when a routing header is included—is
not justified and could be considered an added risk for corruption.
To ensure that a transport layer checksum is provided in all cases, the
specification of UDP has been slightly modified to force the use of the
checksum when UDP is carried over IPv6. UDP checksums are optional
in IPv4 networks, on the grounds that there is a Layer 3 checksum.
Fragmentation
IPv6 routers are no longer expected to fragment packets if the Maximum
Transmission Unit (MTU) of the next link is too small for the size of
packet. Instead, all IPv6 nodes and links are required to handle packets up
to 1280 octets long. They may be constructed to handle bigger packets, but
must guarantee to provide this minimum value of the MTU.
Hosts may still have to fragment large packets for transmission and
reassemble them on receipt. Routers have to handle the resulting
fragments, but will drop any that exceed the available MTU and report the
error back to the source with an Internet Control Message Protocol (ICMP)
‘Packet Too Big’ error message.
A host can either decide to work with the minimum guaranteed MTU, for a
message or a session, or use Path MTU Discovery (PMTUD) to find the
minimum MTU of the set of links that a packet will traverse in reaching a
destination. Traditionally, PMTUD has been carried
out by starting transmission with large packets and reducing the size used if
‘Packet Too Big’ errors are received as described in Path MTU Discovery
for IPv6 [RFC1981]. This approach has a number of drawbacks. For
example, network resources and time are wasted transmitting the overly
large packets and the responses. Also, many systems no longer forward
ICMP messages to avoid some denial of service attacks so that the error
responses will never be received. The pmtud working group in the IETF is
currently (mid 2004) designing an improved ‘packetization layer’ system
for PMTUD that will address these concerns. The new scheme starts with
small packet lengths and increases the size used until a packet fails to make
it through the network. See Path MTU discovery draft
[I-D.ietf-pmtud-method].
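The upward-searching scheme can be sketched in a few lines. The probe callback, the search bounds and the binary-search strategy are illustrative assumptions; the actual draft specifies its own probing schedule:

```python
def plpmtud(probe_ok, low=1280, high=9000):
    """Packetization-layer PMTUD sketch: search upward from the guaranteed
    IPv6 minimum MTU, relying only on whether probe packets get through
    (`probe_ok` is a hypothetical stand-in for sending a probe and seeing
    it acknowledged), never on ICMP errors arriving.
    """
    while low < high:
        mid = (low + high + 1) // 2
        if probe_ok(mid):
            low = mid        # probe delivered: the path MTU is at least mid
        else:
            high = mid - 1   # probe lost: the path MTU is smaller
    return low

# A path whose narrowest link has an assumed 4352-byte MTU:
print(plpmtud(lambda size: size <= 4352))  # 4352
```

Note that a lost probe here costs only the probe itself; no oversized application data is sent, and no ICMP error needs to make it back to the source.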
There is a trade-off between the time taken to determine the actual MTU
and the overhead incurred in fragmenting some or all of the packets to be
sent to a destination. A management interface is provided to control
whether PMTUD is used for particular applications and paths. The
application designer needs to consider whether it is possible or desirable to
limit messages to the minimum MTU. If only a very few messages are
likely to exceed this value, it may be worth living with a little
fragmentation; however, for long streams of messages over the minimum
MTU, PMTUD may offer significant value if the path can offer a larger
MTU. Designers also need to be aware that the path MTU may change
during a session if the path is rerouted. Network designers should consider
how the MTU of a network will affect the applications that will run over it.
Extension headers
IPv4 packets allow a fixed maximum amount of space for optional fields
and the majority of routers handle IPv4 options reluctantly, at best. Most
IPv4 packets carrying options are classed as special cases and diverted
from the hardware supported ‘fast path’ in larger routers. The processing of
packets with options in the ‘slow path’ carries a large performance penalty;
as a result, IPv4 options are little used.
By contrast, IPv6 is designed with an extensible header system that is
intended to be flexible and future proof. Several of the capabilities that are
[Figure: fragmentation. IPv6 sources add a fragment header to each part
so that the destination can reconstruct the PDU.]
Extension Header  Functions              Comments

Hop-by-Hop        (Options that need to be processed by every router on path)
                  Jumbo Packet           Allows for packets bigger than 64 KB
                  Router Alert           Flags that the packet needs inspection
                                         by all routers on the path

Routing           Path Spec              Specifies 'waypoints' that the packet
                                         must pass through

Fragment          Packet Fragment        Attached to each portion of a packet
                  Information            that has had to be fragmented, to allow
                                         reassembly. Because the fragment header
                                         is only added when it is needed,
                                         bandwidth is not wasted transmitting
                                         empty fields, as is the case for most
                                         packets in IPv4.

Destination       (Options that need only be examined at the destination)
                  Mobile IP              Information needed to maintain the link
                  Information            between a mobile station and its home
                                         location
                  Tunnel Extension       Limits on the depth of tunnel nesting
                                         for a packet

Security          (IPsec information)
                  Authentication         When a packet is authenticated but not
                                         encrypted
                  Encryption             Encryption specification

Last Header       Dummy                  Used if a packet has no protocol payload
The fields are not affected by encryption, allowing QoS and IPsec
security to work together.
No layer violations are needed (the network layer doesn’t need to
take into account which transport protocol is used or values of
transport parameters).
The basic standards for the use of the flow label have just been completed
at the time of writing [RFC3697]. Applications have not yet been adapted
to take advantage of this new capability, and classifiers using the flow label
field are not generally available at present. The RSVP specification
[RFC2205] and MIB [RFC2206] already allow the flow label to be
included in traffic specifications. Likewise, the DiffServ architecture
[RFC2475] implicitly allows use of the flow label: the MIB [RFC3289]
and PIB [RFC3317] for DiffServ already provide standardized access to
classifiers that use the flow label.
[Figure: an IPv6 packet protected by IPsec. The IPv6 header (version,
class, flow label, payload length, next header, hop limit, source and
destination addresses) is followed by the Security Parameter Index,
sequence number and encryption parameters (e.g. an initialization
vector); the transport header (ports, sequence and acknowledgement
numbers, code bits, window size, checksum, urgent pointer, options) and
padding are encrypted, with optional authentication data at the end.]
used. SAs can be set up manually at each end point; but, this is obviously
not desirable for a network that hopes to make extensive use of IPsec.
One of the main obstacles to widespread deployment of IPsec for end-to-
end communications has been the slow development of an acceptable key
exchange protocol and ubiquitous key distribution infrastructure. The
Internet Key Exchange (IKE) protocol is still under development to replace
two earlier key exchange mechanisms that have not found wide acceptance
[RFC2408], [RFC2409]. The second version of the Internet Key Exchange
protocol (IKEv2) shows considerable promise as a reasonably simple and
robust key distribution mechanism [I-D.ietf-ipsec-ikev2].
IPsec provides two modes of operation:
Tunnel Mode, which has been widely deployed to provide security
protection for Virtual Private Networks and ‘road warriors’ in IPv4-
based networks, is designed for use where hosts are generally IPsec
unaware. The secured tunnel end points are provided by specialized
security gateways or add-on ‘extranet clients’.
Transport Mode is designed for use by IPsec aware hosts and
provides end-to-end security between pairs of such hosts.
The use of Transport Mode IPsec connections to provide end-to-end
security is likely to become much more prevalent in IPv6 networks because of
the requirement that all nodes support IPsec. Apart from the problems of
key distribution, the use of Transport Mode IPsec has been inhibited by the
interactions between IPsec and firewalls, and NATs and other ‘middle
boxes’ that currently need to examine the fields of packets that may be
concealed by IPsec. The introduction of IPv6 should essentially eliminate
NATs, and work is in progress to solve the other problems.
IPsec is not a suitable solution for the security of all real-time data
exchanges. The overhead and the sensitivity to lost packets makes IPsec
less appropriate for media streams and it is likely that IPsec will be
reserved for the protection of signaling protocols. Even if IPsec becomes
commonly available in IPv6 networks, media streams may well make use
of a stream security protocol, such as Secure RTP [RFC3711] and its
associated keying mechanisms, so that lost packets do not result in
termination or loss of synchronization between end points, and encryption
overhead is reduced.
Operating systems typically provide mechanisms to establish an overall
security policy, set up the database of security associations, and to allow
applications, through an Application Programming Interface (API), to
control the use of IPsec within the limits of the overall policy. See
Appendix D for more
details on this subject.
Transport Protocols
IPv6 is primarily a replacement for the IP Network Layer. All the existing
IP transport protocols (including TCP, UDP, SCTP) can be carried as
payloads in an IPv6 datagram just as in IPv4. As mentioned in “Extension
headers” , the protocol number for the transport in use is placed in the Next
Header field of the last IPv6 extension header (or in the basic header if
there are no extensions).
The only modifications that are needed have to do with transport layer
checksums [RFC2460]:
The use of IPv6 affects the pseudo-header, which is used to
calculate the checksum carried in the transport payload and
includes some of the IP header fields.
Because there is no IP layer checksum in IPv6, the use of a UDP
checksum has been made mandatory for IPv6 networks (it was
optional for IPv4).
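These rules can be made concrete with a short sketch of the UDP checksum computed over the IPv6 pseudo-header of [RFC2460]; the example addresses and ports are arbitrary:

```python
import struct

def internet_checksum(data):
    """RFC 1071 one's-complement sum over 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def udp6_checksum(src, dst, udp_segment):
    """Checksum a UDP segment with the IPv6 pseudo-header prepended:
    source and destination addresses (16 bytes each), the upper-layer
    packet length (32 bits), three zero bytes and next header 17 (UDP)."""
    pseudo = src + dst + struct.pack("!I3xB", len(udp_segment), 17)
    cksum = internet_checksum(pseudo + udp_segment)
    return cksum or 0xFFFF   # a computed zero is transmitted as 0xFFFF

# Arbitrary example: all-zeros source, ::1 destination, empty UDP payload.
src, dst = bytes(16), bytes(15) + b"\x01"
segment = struct.pack("!4H", 1, 2, 8, 0)   # ports 1 and 2, length 8, checksum 0
cksum = udp6_checksum(src, dst, segment)
```

The receiver verifies by summing the pseudo-header and segment including the filled-in checksum field; a valid packet folds to zero, which is why the all-zeros value has to be reserved and sent as 0xFFFF.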
supported by IPv6 assume that most of the link layer networks that will be
used are Ethernet-like, and will work most easily with multiple access link
layers that support broadcast transmission at the link layer. Point-to-point
links are no problem; but, using a multiple access network—such as a
multipoint ATM network or X.25 network, which does not support
broadcast—requires extra work.
ICMPv6 now has a multitude of functions:
Returns error messages to the source if a packet could not be
delivered. Four different error messages are specified in
[RFC2463].
Monitors connectivity through echo requests and responses used by
the ping and traceroute utilities. The Echo Request and Echo
Response messages are specified in [RFC2463].
Finds neighbors (both routers and hosts) connected to the same link
and determines their IP and link layer addresses. These messages
are also used to check the uniqueness of any addresses that an
interface proposes to use through Duplicate Address Detection
(DAD)—DAD can be turned off if the network administrator
believes that the configuration method used is bound to generate
unique addresses. Four messages—Neighbor Solicitation (NS),
Neighbor Advertisement (NA), Router Solicitation (RS) and Router
Advertisement (RA)—are specified in [RFC2461].
Ensures that neighbors remain reachable using the same IP and link
layer address by applying Neighbor Unreachability Discovery
(NUD) and notifies neighbors of changes to link layer addresses.
This function uses NS and NA messages as specified in [RFC2461].
Finds routers and determines how to obtain IP addresses to join the
subnets supported by the routers. This function uses RS and RA
messages as specified in [RFC2461].
Communicates prefixes and other configuration information
(including the link MTU and suggested hop count default) from
routers to hosts if stateless autoconfiguration of hosts is enabled
(see “Autoconfiguration for Hosts” ). This function uses RS and
RA messages as specified in [RFC2461].
Redirects packets to a more appropriate router on the local link for
the destination address or points out that a destination is actually on
the local link even if it is not obvious from the IP address (where a
link supports multiple subnets). This facility could be used by a
malicious sender to divert packets. Nodes should provide
configuration options to prevent the messages being sent by routers
and acted on by hosts. The redirect message is specified in
[RFC2461].
the most trivial of networks. Every interface will have a 'link local' address,
which is required for operational control communications with neighbors
and local routers on the same link but can be used for application traffic
between neighbors on the same link. For unicast communication beyond
the local link, the interface will need at least one global unicast address and
it may be a member of several multicast groups.
Applications have to be adapted to use the address possibilities correctly: a
prospective communication partner that has several addresses may only be
reachable on some of them (for example, because of a network failure) and
the application has to be prepared to cycle through the available addresses
to find one that works. More details on this process are given in Appendix
D.
Real-time network designers should be aware that cycling through the
possible destination addresses may take significant time, due to the
network round trips and timeouts involved in determining that an address is
unreachable or unusable. Any additional knowledge that is available should
be used to inform the address selection procedure and override the defaults
where this will speed up the communication process.
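A minimal sketch of such an address-cycling loop, using the standard sockets API with a short per-address timeout to cap the total fallback delay (the hostname and port in any call are placeholders):

```python
import socket

def connect_first_usable(host, port, timeout=2.0):
    """Try each address returned for a name until one accepts a TCP
    connection. getaddrinfo typically lists IPv6 addresses first."""
    last_error = None
    for family, stype, proto, _, addr in socket.getaddrinfo(
            host, port, type=socket.SOCK_STREAM):
        sock = socket.socket(family, stype, proto)
        sock.settimeout(timeout)  # bound the wait on each dead address
        try:
            sock.connect(addr)
            sock.settimeout(None)
            return sock  # first address that answers wins
        except OSError as exc:
            last_error = exc
            sock.close()
    raise last_error or OSError("no usable address for %s" % host)
```

Choosing the timeout is the design trade-off the text describes: too long and a partial network failure stalls the application; too short and a slow but working address is skipped.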
1. An alternative proposal, which introduced A6 records [RFC2874], was intended to make it easier to
renumber networks; but this has now been made experimental because of fears that the
recursive lookups needed might result in loops and security breaches.
between the IPv4 and IPv6 versions of the routing protocols and real-time
applications.
ISP networks
3G Cellular networks
Once this work has been completed, it will naturally suggest the
mechanisms that appear most useful, and natural selection can take its
course.
[Figure: IPv6 transition scenarios within a single administrative domain: an IPv6 host reaching IPv4-only and IPv6-only networks through dual-stack IPv6/IPv4 routers and translators, and IPv6 islands interconnected across an IPv4-only network.]
Point-to-Point Links
Standards for the encapsulation of IPv6 packets in various Layer 2
technologies have been defined. The interconnection routers are similar to
those used to carry IPv4 over point-to-point links with IPv6 interfaces on
the customer side and L2 interfaces on the core network side. As with the
IPv4 case, this solution has a high setup and management cost, but is useful
for situations where stable, secure communications with a well-defined
traffic pattern are required. It can be implemented without needing any IPv6
capabilities from the Layer 2 service provider.
MPLS Networks
MPLS Label Switched Paths (LSPs) can carry IPv6 packets using a
standard encapsulation. Existing MPLS networks with IPv4-based control
planes can be used to carry IPv6 if the Label Edge Routers (LERs) have
dual IPv4/IPv6 stacks. In due course, MPLS networks will be built with
IPv6 control planes and the need for dual stack LERs will gradually
disappear.
References
[I-D.ietf-ipngwg-icmp-v3] Conta, A. and S. Deering, “Internet Control
Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6)
Specification,” draft-ietf-ipngwg-icmp-v3-04 (in preparation), IETF, June
2004.
[I-D.ietf-ipsec-ikev2] Kaufman, C., “Internet Key Exchange (IKEv2)
Protocol,” draft-ietf-ipsec-ikev2-14 (in preparation), IETF, June 2004.
[I-D.ietf-mobileip-ipv6] Johnson, D., Perkins, C. and J. Arkko, “Mobility
Support in IPv6,” draft-ietf-mobileip-ipv6-24 (in preparation), IETF, July
2003.
[I-D.ietf-pmtud-method] Mathis, M., “Path MTU Discovery,” draft-ietf-
pmtud-method-01 (in preparation), IETF, February 2004.
[I-D.ietf-v6ops-3gpp-analysis] Wiljakka, J., “Analysis on IPv6 Transition
in 3GPP Networks,” draft-ietf-v6ops-3gpp-analysis-10 (in preparation),
IETF, May 2004.
[I-D.ietf-v6ops-ent-scenarios] Bound, J., “IPv6 Enterprise Network
Scenarios,” draft-ietf-v6ops-ent-scenarios-04 (in preparation), IETF, July
2004.
[I-D.ietf-v6ops-isp-scenarios-analysis] Lind, M., Ksinant, V., Park, S.,
Baudot, A. and P. Savola, “Scenarios and Analysis for Introducing IPv6
into ISP Networks,” draft-ietf-v6ops-isp-scenarios-analysis-03 (in
preparation), IETF, June 2004.
[I-D.ietf-v6ops-mech-v2] Nordmark, E. and R. Gilligan, “Basic Transition
Mechanisms for IPv6 Hosts and Routers,” draft-ietf-v6ops-mech-v2-03 (in
preparation), IETF, June 2004.
[I-D.ietf-v6ops-unmaneval] Huitema, C., “Evaluation of Transition
Mechanisms for Unmanaged Networks,” draft-ietf-v6ops-unmaneval-03
(in preparation), IETF, June 2004.
[I-D.tsuchiya-mtp] Tsuchiya, K., Higuchi, H., Sawada, S. and S. Nozaki,
“An IPv6/IPv4 Multicast Translator based on IGMP/MLD Proxying
(mtp),” draft-tsuchiya-mtp-01 (in preparation), IETF, February 2003.
[I-D.venaas-mboned-v4v6mcastgw] Venaas, S., “An IPv4 - IPv6 multicast
gateway,” draft-venaas-mboned-v4v6mcastgw-00 (in preparation), IETF,
February 2003.
RFC 1981, McCann, J., Deering, S. and J. Mogul, “Path MTU Discovery
for IP version 6,” IETF, August 1996.
RFC 2205, Braden, B., Zhang, L., Berson, S., Herzog, S. and S. Jamin,
“Resource ReSerVation Protocol (RSVP) -- Version 1 Functional
Specification,” IETF, September 1997.
RFC 3596, Thomson, S., Huitema, C., Ksinant, V. and M. Souissi, “DNS
Extensions to Support IP Version 6,” IETF, October 2003.
RFC 3697, Rajahalme, J., Conta, A., Carpenter, B. and S. Deering, “IPv6
Flow Label Specification,” IETF, March 2004.
RFC 3711, Baugher, M., McGrew, D., Naslund, M., Carrara, E. and K.
Norrman, “The Secure Real-time Transport Protocol (SRTP),” IETF,
March 2004.
RFC 3750, Huitema, C., Austein, R., Satapati, S. and R. van der Pol,
“Unmanaged Networks IPv6 Transition Scenarios,” IETF, April 2004.
RFC 3810, Vida, R. and L. Costa, “Multicast Listener Discovery Version 2
(MLDv2) for IPv6,” IETF, June 2004.
Section V:
Network Design and Implementation
The previous sections have described individual topics and technologies
that form the building blocks of the Real-Time network. This section deals
with the broader aspects of how these building blocks are combined into an
entity called a network. The focus shifts from the details of technology and
protocol definition to the behavior of the elements and devices in
combination: the operation, resiliency, and performance of the network as a
whole. The chapters in this section address ways the various protocols,
technologies, and techniques can be integrated into a complete solution,
based on Quality of Experience performance targets.
Implementation of real-time networking principles requires the
coordination of a number of network deployment options. These options
include the end-to-end QoS mechanisms, network architecture and
topology choices, selection of codec and packet size, and provisioning of
link speed to ensure compatibility with the desired performance.
Successful deployment requires an understanding of how the packet
network behaves under different traffic conditions and how it will
interwork with any existing legacy systems, whether as part of the same
network or where a call is handed off to another network or carrier.
A converged network takes on the reliability requirements of the most
demanding traffic, application, or service that it carries. If the network
carrying only e-mail traffic goes down for a few seconds every five
minutes, users might not even notice. Similar outages on a network
carrying mission-critical services can result in business disruption,
confusion, frustration, or even loss of life.
Chapter 16, Network Address Translation (NAT), will help you gain a
working knowledge of this technology and its implications for real-time
network design.
Chapters 17 and 18 describe protocols and techniques used to improve the
resiliency of a network infrastructure. Methods include built-in redundancy
through duplication and the use of higher-layer protocols that will provide
sub-second reconvergence times. These techniques build on the Layer 2
protocols that were discussed in Section IV.
Chapter 19 provides specific guidance on mapping QoS settings from one
network technology to another, an important factor in ensuring end-to-end
Quality of Service operation.
Chapter 20 looks at engineering of converged networks to deliver high QoE
for both voice and data applications. A planning process is described that is
used to predict voice performance based on known relationships of voice
Chapter 16
Network Address Translation
Elwyn B Davies
Cedric Aoun
[Figure: the protocol stack reference diagram used at each chapter opening (session and gateway control with SIP, H.323, H.248/MGCP and NCS; RTP/RTCP media transport over codecs; QoS and resiliency layers over MPLS, ATM, FR, Ethernet, DOCSIS and SONET/TDM), viewed here from the Network Address Translation perspective.]
Concepts covered
Network Address Translation (NAT)
Introduction to firewalls
Autonomous and signalled operation for NATs and firewalls
Costs and benefits of NATs, firewalls and other Internet
improvements
The middlebox concept
The basics of NAT technology
Taxonomy of NATs
Interactions of NAT with transport protocols and applications
How NATs modify network packets
The issues resulting from the introduction of NATs
Introduction
This chapter covers Network Address Translation and the Network
Address Translators (NATs) that implement the translation. NATs are one
of a group of technologies introduced into the original IP network to
increase its capabilities and update the architecture to preserve its
flexibility for the future. The next generation Internet Protocol, IPv6,
covered in Chapter 15 and Virtual Private Networks (VPN) covered in
Appendix E, are the other technologies that make up this group. These
solutions all significantly alter the way the network operates while
preserving the basic IP paradigm: data is still transmitted in individually
addressed packets across a stateless data plane that makes a separate
forwarding decision at each hop, depending on the addressing information
in the packet.
NATs have been extensively deployed in the IPv4 Internet to eke out the
limited supply of globally routable IPv4 addresses. NATs allow a network
(such as an Enterprise network) to utilize the ‘private’ address space
defined in RFC 1918. The Enterprise can use some or all of the large
amount of space within prefixes 10/8, 172.16/12 and 192.168/16 in the
private networks on one side of the NAT and still retain a fair measure of
access to the global Internet using a small number of globally unique IPv4
addresses on the other side. Some network operators see the reduced
transparency of the Internet when using NATs as an advantageous means of
access control and a security protection, but the ‘security by obscurity’
offered by a standard NAT is not much of an obstacle to a determined
invader. Genuine perimeter security needs a 'firewall' device that will
appropriately filter the packets rather than just changing the address fields.
NATs can also restrict the deployment of some peer-to-peer applications
(such as VoIP). The spread of NATs has been accelerated by the delayed
deployment of IPv6, which would solve the address supply shortage. The
network is paying the price through increased processing and latency
experienced by packets traversing NATs, as well as reduced service
deployment velocity because of the interactions between NATs and many
application protocols.
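The three RFC 1918 private blocks can be checked programmatically; a small sketch using Python's standard ipaddress module (addresses in the test are documentation examples):

```python
import ipaddress

# The three RFC 1918 private prefixes mentioned in the text.
PRIVATE_BLOCKS = [ipaddress.ip_network(p)
                  for p in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")]

def is_rfc1918(address):
    """True if the IPv4 address falls inside one of the private blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in block for block in PRIVATE_BLOCKS)
```

Note that 172.16/12 spans 172.16.0.0 through 172.31.255.255, a detail that is easy to get wrong when filtering by prefix manually.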
flow state information maintained for the various packet flows to which the
packets belong. In both cases, some packets are allowed to pass through the
device because of the policies set by the administrators of the domain.
NATs will also modify the packets to implement the address mappings. A
NAT is often combined with a firewall in one device because of the
similarities and frequent need to provide both functions at the same point in
a network.
Changes to management
Introduction of IPv6, VPN or NAT technology requires changes to the
network management tools and techniques, and may add significantly to
the complexity of the management task.
IPv6 has some improvements that should make basic management,
especially in Enterprise networks, considerably easier; however, this is
offset by the need to manage networks that can transmit both IPv4 and
IPv6. The easier basic management is also offset by the transition and
migration mechanisms that will be used during the transition period.
VPNs add an extra layer of complexity to the management process, and if
configured tunnels are used, there is a significant management load setting
up and maintaining the tunnels.
The configuration loaded into a NAT or firewall has to be carefully
managed to correctly limit access to and from the outside world by both
applications and users in the private network, as well as preserving the
security of the network behind the NAT and providing an adequate supply
of globally routable IPv4 addresses.
Both NATs and IPv6 have direct interactions with real-time applications
and are explored in the main body of the book, but VPNs are relatively
transparent to real-time applications although the QoS and performance
aspects need to be taken into account when designing networks for real-
time applications. NAT technology, limitations and the large number of
variant implementations are explored in detail in the remainder of this
chapter. IPv6 technology is described in Chapter 15 with some additional
material in Appendix D, while VPN technology is described in Appendix
E. Firewall technology is not discussed in detail in this book, but many of
the application interactions and performance limitations that NATs exhibit
also affect firewalls.
To address this problem, the development of what became IPv6 was started
(see Chapter 15). It rapidly became clear that the IPv4 address space would
run out before IPv6 could be deployed and a range of shorter term
measures were put in place to postpone the exhaustion of IPv4 addresses.
Technical changes to routing allowed by the removal of the class
boundaries in addresses. Removing the class boundaries allowed
address allocations for subnetworks to be more closely matched to
the network size, increasing the efficiency of address space
utilization.
Restraining the availability of globally routable IPv4 addresses
through address registry allocation policies: users have to justify the
size of each allocation request and registries will only allow a small
percentage of growth headroom in each allocation.
Widespread deployment of private addressing schemes behind
Network Address Translators (NATs).
These solutions are now reaching the limits of their effectiveness; IPv4
address space exhaustion is now believed, once again, to be getting much
closer (five years or a little longer).
NATs as a middlebox
The original architecture for the Internet (DARPAnet) envisioned that
packets would travel end-to-end essentially unmodified (only the Time-to-
Live field is altered as the packet goes from router to router; the checksum
is adjusted to match).
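Routers typically avoid recomputing the header checksum from scratch when they decrement the TTL; the incremental update technique of RFC 1624 adjusts it in place from the old and new values of the changed 16-bit word. A sketch of that arithmetic (the function name is illustrative):

```python
def incremental_checksum_update(old_csum, old_word, new_word):
    """RFC 1624 incremental update: HC' = ~(~HC + ~m + m'),
    where m is the old 16-bit word and m' the new one, using
    one's-complement addition throughout."""
    def add1c(a, b):
        s = a + b
        return (s & 0xFFFF) + (s >> 16)  # fold the carry back in
    csum = add1c(~old_csum & 0xFFFF, ~old_word & 0xFFFF)
    csum = add1c(csum, new_word)
    return ~csum & 0xFFFF
```

For a TTL decrement, the changed word is the one sharing TTL with the protocol field, so m = (ttl << 8) | proto and m' = ((ttl - 1) << 8) | proto. NATs use the same trick when they rewrite address and port fields.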
A number of developments have introduced various kinds of
‘middlebox’, which intercept the packets in flight and manipulate some
part of the packet, especially the IP and/or the transport headers.
Middleboxes can also apply filters to the traffic passing through them and
apply various administrative policies that may result in packets not being
forwarded (‘dropped’). Depending on the policy, the middlebox may or
may not try to inform the packet source using an ICMP error message.
Network Address Translators are examples of middleboxes; other varieties
include firewalls and IPsec tunnel gateways.
NAT terminology
Middleboxes and NATs, in particular, have acquired a number of pieces of
special terminology.
Private Address. Address according to the RFC1918 scheme for addresses
that are to be used in networks not directly connected to the public Internet.
Private addresses can be reused across multiple sites that are not
administratively connected.
Packets from the inside are routed to one or other of the translators (it has
to be the same one for every packet in a communication session so that the
same binding is used unless the states of the translators are coordinated)
and, after translation, can be routed onwards by normal IP routing in the
public network. Packets from the outside will be routed to the public side
of the translator using the public address and, after reverse translation, onto
the host using the private address and intradomain routing.
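The binding behavior described above can be illustrated with a toy port-translating NAT (NAPT). The class, port range, and addresses below are purely illustrative, and real devices also age bindings out and handle the transport checksums:

```python
class Napt:
    """Toy NAPT: maps (private_ip, private_port) to a port on one
    public address and back again, so replies reach the right host."""
    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self.next_port = first_port
        self.out = {}    # (priv_ip, priv_port) -> public port
        self.back = {}   # public port -> (priv_ip, priv_port)

    def translate_outbound(self, priv_ip, priv_port):
        key = (priv_ip, priv_port)
        if key not in self.out:          # create a binding on first use
            self.out[key] = self.next_port
            self.back[self.next_port] = key
            self.next_port += 1
        return self.public_ip, self.out[key]

    def translate_inbound(self, public_port):
        # KeyError here models the NAT dropping an unsolicited packet.
        return self.back[public_port]
```

The session-affinity requirement in the text follows directly from this state: a second translator without the `out`/`back` tables has no way to reverse the mapping.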
The NAT solution to the scarcity of public IP addresses has several
disadvantages. The most important of these is the removal of the end-to-
end significance of IP addresses, thereby reducing the transparency of the
network and increasing the amount of state in the data plane of the
network. Many protocols, for example, the file transfer protocol (FTP),
took advantage of the end-to-end significance of IP addresses by
embedding numeric IP addresses in protocol payloads. Whether or not this
represents a mistake in the design of these protocols has been extensively
discussed. NAT has to accommodate the existing protocols to remain
application independent and avoid the need for modifications to end hosts.
NAT devices can be expected to contain several application level gateways
(ALGs) that monitor the traffic passing through the NAT and perform
additional modification of the protocol payloads for specific protocols. The
ALG for a protocol ‘understands’ the packet formats and sequences of the
protocol. The NAT and ALGs coordinate to divert protocol packets that
will need additional translation through the ALG where they may be
modified, using the NAT bindings and any additional state maintained in
the ALG, before forwarding.
Restricted Cone devices, external hosts can only send packets from
an IP address and port to which the internal host had previously sent
a packet.
SIP/SDP
RSVP
talk, ntalk
A number of other protocols would need complex ALGs, but it does not
make sense to operate them through NATs, including the following
examples:
All dynamic routing control protocols (including RIP, OSPF, BGP,
IS-IS). Routing in the public and private domains are essentially
independent problems. The NAT devices are placed at the
boundaries. They act as routing proxies for the whole private
domain and forward the traffic between the domains.
BOOTP, DHCP. It makes little sense to deliver configuration
information to hosts in the private domain from servers in the
public domain and vice versa.
SNMP. Translating the information returned by SNMP would be
confusing to managers and might expose details of the private
network that the owners would prefer to keep private.
Finally, a number of common protocols can operate through NATs without
need for ALGs or translation because they do not carry IP addresses or
transport identifiers, including the following examples:
HTTP, HTTPS (unless numeric addresses are used)
TFTP
telnet
archie
finger
NTP, NNTP
NFS
radius
IRC
SSH
POP3
SMTP
rlogin, rsh, rcp (provided Kerberos authentication is not used)
case, the NATs may have to be replaced to deal with a new service or
protocol.
The IETF is currently investigating the use of signaling to control NATs
and other middleboxes. This would remove many of these difficulties but
introduces new needs for policy and security if NATs are to be controlled
by applications.
Security by obscurity
Some network operators, encouraged by marketing hype, have come to see
the hiding of network addresses by NATs as providing some security for
the private network.
Unfortunately, any sense of security that this provides is largely
misplaced. While the NAT reduces the
visibility of the private network topology to a casual observer, a NAT
without additional firewalling provides little or no obstacle to a determined
attacker. All NATs will accept some incoming packets that don't
necessarily come from an approved source and send them on to a machine
in the private network—exactly what the conditions have to be depends on
the type of NAT in use; but, all NATs are vulnerable for at least part of the
time. Without additional filtering, the private network is highly vulnerable
and must be protected.
Management overhead
Managing the address pools and the policies associated with NATs is a
considerable management overhead. The management becomes even more
complex if multiple layers of NATs have to be employed. In areas where IP
addresses are extremely scarce and providers charge a premium for
globally unique addresses, network operators have been forced to use
several layers of NAT to cope with needs of their networks.
References
RFC 1918, Rekhter, Y., Moskowitz, R., Karrenberg, D., Groot, G. and E.
Lear, “Address Allocation for Private Internets,” BCP 5, IETF, February
1996.
RFC 2391, Srisuresh, P. and D. Gan, “Load Sharing using IP Network
Address Translation (LSNAT),” IETF, August 1998.
RFC 2547, Rosen, E. and Y. Rekhter, “BGP/MPLS VPNs,” IETF, March
1999.
RFC 2663, Srisuresh, P. and M. Holdrege, “IP Network Address Translator
(NAT) Terminology and Considerations,” IETF, August 1999.
RFC 3022, Srisuresh, P. and K. Egevang, “Traditional IP Network Address
Translator (Traditional NAT),” IETF, January 2001.
RFC 3489, Rosenberg, J., Weinberger, J., Huitema, C. and R. Mahy,
“STUN - Simple Traversal of User Datagram Protocol (UDP) Through
Network Address Translators (NATs),” IETF, March 2003.
Chapter 17
Network Reconvergence
Shardul Joshi
[Figure: the protocol stack reference diagram repeated at each chapter opening, viewed here from the Network Reconvergence perspective with the resiliency layer in focus.]
Concepts covered
Protection switching versus rerouting
Protection schemes
Spanning tree
Multilink trunking
Distributed Multilink Trunking (MLT)
Split MLT
Virtual Router Redundancy Protocol (VRRP)
Open Shortest Path First (OSPF) Equal Cost Multipath (ECMP)
OSPF failure modes
Introduction
One network for all services, including real-time voice and video, provides
significant cost savings over separate, individual networks; however, if the
“One Network” fails, everything stops. On August 12, 2003, CBS News
provided an example of what happens when the “One Network” fails.
Maryland's Motor Vehicle Administration shut all its offices at noon as
technicians cleaned the agency's network systems. “There's no telephone service
right now. There's no online service right now. There's no kiosk or express office
service.”
“One Network” demands high availability. E-mail, faxes, phones, computer
networks, and video conferencing stop and business ceases if the “One
Network” fails.
Achieving resiliency
There are two methods of providing network resiliency. The first is to
provide redundancy. Redundancy can come through the duplication of
circuits or network elements (for example, ports and routers). The second is
to use protocols to provide quick reconvergence and high availability of
existing circuits after a failure event occurs in the network.
Cost is the most important factor when determining the amount of
redundancy, and where the redundancy is to be integrated into the network.
When evaluating the cost of the network, it is important to look beyond the
simple cost of adding equipment and bandwidth. A cost can also be
associated with network outages, measured in terms of loss of
revenue or loss of service to customers. This will be a factor in determining
to what degree the network engineer uses sophisticated architecture and
fault management strategies. The end result is survivability. The educated
network engineer must evaluate cost versus effect to provide the greatest
return per dollar spent.
never allow the network to gain any return on investment, and double all of
the required maintenance costs unnecessarily.
This is not to say providing a physical or logical redundancy in some areas
of the network is not a good idea. The engineer must decide which areas
are critical and provide cost effective means to provide resiliency. The first
area that full redundancy can be employed is at the network edge. Typical
customer edge devices are low cost, low reliability and primarily support
the use of simple protocols. Implementing full redundancy in this area
allows an easy way to provide resiliency. The trade-off is elements at the
network edge have a higher population in terms of total equipment in the
network. While these elements may be low in cost, their use of full
redundancy throughout all of these elements may be extremely expensive,
and difficult to justify. Minor outages among a large customer base will
have only a minor impact upon total network Quality of Experience (QoE)
or revenue. The redundancy should only be provided for those critical
customers that have higher requirements for redundancy and resiliency.
The alternative at the edge would be to provide a more robust network
element, which may be less cost-effective based on the amount of traffic
the customer is generating. The other place full redundancy may be
implemented is in critical areas of the network. These areas utilize more
expensive robust equipment; however, the need to maintain constant
uptime regardless of issue requires the use of the redundancy. This scenario
manifests itself in the form of circuit and port redundancy versus element
duplication. In some rare cases, the entire element may be duplicated to
ensure full redundancy. It is important to remember that in the core, the use
of high-availability protocols that can reconverge are often used in
conjunction with these types of robust routers and switches.
Protection schemes
The expression protection scheme, or protection model, designates the
strategies noted 1+1, 1:1, 1:N and M:N in SONET and N+1 protection
when using the data reference model. For clarity and consistency
throughout the book and chapter, we will be using the SONET reference;
however, it is important to keep both models in mind.
There are several protection schemes, corresponding to various degrees of
network availability. Either one protection path protects one working path,
namely the models 1+1 and 1:1, or one protection path protects N working
paths, noted 1:N. This latter model may be generalized by allowing M
protection paths to protect N working paths (M<N), noted M:N.
In the 1+1 strategy, the traffic travels on the working path and the
protection path simultaneously. Traffic is carried only on the working path
in the case of the 1:1 strategy. The protection path may carry a lower
priority traffic that may be preempted upon occurrence of a failure on the
protected path to allow for the recovery of that path. The 1:1 strategy is
extended to 1:N, where one protection path protects N working paths.
Obviously, in the 1+1 strategy, the working and the protection paths must
be disjoint (a failure on one path does not fail the other one). Moreover,
the requirement of protecting against single failures leads to the need that
in the cases 1:1, 1:N, and M:N all the paths be disjoint.
In the 1:N protection model, upon occurrence of a failure on one of the N
working paths, the traffic that used to travel on that failed path is switched
to the protection path. At that time, the remaining N–1 working paths are
no longer protected and are required to find a new protection path(s) as
soon as possible. A long delay between the failure of a path and the finding
of a new protection path affects the degree of network availability.
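The 1:N behavior just described can be captured in a few lines: the first working path to fail claims the single protection path, and any further failure loses traffic until a new protection path is found. A sketch (the function name and representation are illustrative, not taken from any standard):

```python
def simulate_1_to_n(n_working, failures):
    """Which working paths are still carried under 1:N protection
    (one shared spare)? `failures` is an ordered list of failed
    working-path indices."""
    protection_used_by = None
    carried = set(range(n_working))
    for path in failures:
        if protection_used_by is None:
            protection_used_by = path  # first failure switches to the spare
        else:
            carried.discard(path)      # no spare left: this traffic is lost
    return carried, protection_used_by
```

Running this with two failures in sequence shows why the text stresses finding a replacement protection path quickly: the window between the first failure and re-protection is exactly when the remaining N-1 paths are exposed.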
You might think that packets can take any number of (seemingly)
random paths to reach their destination. Many of today’s networks use core
routing protocols that are based on shortest path algorithms. That means
that at any given instant in time, there is exactly one shortest path between
any two end points. That path can be thought of as the working path for that
traffic.
In the event of a failure at an intermediate point in the network, the routing
protocols will recalculate the shortest path to create a new path for packets
to follow. Although the resources needed to create that new path are
identified on demand, only a single path is the new shortest path. Using the
terminology described above, with a single (working) path being protected
and a single (protection) path being created, the protection scheme for an
IP-routed network would be best classified as 1:1 protection. The fact that
resources are identified on demand means that we would call it rerouting
rather than protection switching.
Spanning tree
The simplest example is connecting two Ethernet switches together with
two lines for redundancy.
can take 30-45 seconds. A failure in the root-bridge of the STP instance,
failure of the Root Forwarding Instance, or a simple configuration error in
the MST are some of the problems that can easily occur but can have
devastating effects on your network. This becomes incredibly complex for
something that should be quite simple.
Does a switch router need to maintain two network topologies, one for
STP and another for OSPF? No, it does not, particularly at the edge. You do
not need a map to get out of the driveway and an edge switch does not need
a legacy protocol, like STP, blocking traffic for 35-45 seconds every time
there is a link failure somewhere nearby. Thankfully, there is a much
simpler and better way.
Multilink trunking
Multilink trunking (MLT) allows both Ethernet lines to connect logically to
a single Layer 3 interface or IP address. Both lines are active and the load
is shared across them; there is no need for STP and no need for multiple
VLANs. There is no traffic blocking and there are no convergence delays
associated with MLT. In the event of failure, the Layer 3 protocols, such as
OSPF, are not affected since MLT is entirely Layer 2.
[Figure: a Multilink Trunk (MLT): two links between Switch 1 and Switch 2 operating as one logical connection.]
Distributed MLT
Distributed Multilink Trunking (DMLT) simply expands the MLT concept
to spread Multilink Trunks over multiple switches. Typically, this is done
within a single stack where Ethernet switches are stacked and daisy-
chained together, as depicted in the following. MLT only protects against
links failures, whereas DMLT extends protection to cover a switch failure
within the stack.
[Figure: Distributed MLT (DMLT): trunk links spread across the members of Switch Stack 1, connecting to Switch 2.]
Split MLT
Split Multilink Trunking (SMLT) is designed to provide aggregation switch
redundancy. An aggregation switch is generally defined as being a switch
not at the edge of the network, connected to switches at the edge. SMLT
can tolerate any link or any switch failing. The network will restore within
milliseconds. SMLT overcomes the shortcomings of the STP by
eliminating the loops that would cause Spanning Tree ports to be blocked.
[Figure: Split-MLT between edge switches and a pair of aggregation switches running VRRP: no loops, no Spanning Tree, failover in under one second, and load sharing.]
Layer 3 redundancy
Multilink Trunking (MLT) operates at Layer 2—not Layer 3. Apart from
routing IP itself, there is a range of simple failover mechanisms at Layer 3.
These include Virtual Router Redundancy Protocol (VRRP), Equal Cost
Multipath (ECMP), and Border Gateway Protocol (BGP) Multiexit
Discriminators (MEDs). Each has a role to play depending on where
failover is required in the network. For example, VRRP and ECMP can be
used on the same router where VRRP protects from equipment failure,
while ECMP protects from a path failure. The following section examines
each mechanism and provides examples of where they can be best used.
VRRP
[Figure: VRRP between two Passport 8600 switches serving Clinic/Hospital A and Clinic/Hospital B. Both advertise virtual address 10.19.12.1 on VLAN 12 (10.19.12.0/24), one as master and one as backup, and are joined by an Inter-Switch Trunk (IST) on VLAN 10 between 10.19.10.1 and 10.19.10.2.]
OSPF ECMP
Equal Cost Multipath protects against link failure and is best used on the
links between high availability routers where load sharing (not balancing)
and quick recovery from a failure are required. ECMP allows a router
running OSPF to distribute traffic across multiple, equal cost routed paths.
The benefits of ECMP routing include load sharing across multiple paths to
the same destination and rapid convergence to the alternate path if a path
becomes unavailable due to a network event. When ECMP is configured
between switch routers, each ECMP connection should have its own
VLAN and Spanning Tree Group (STG).
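The "load sharing (not balancing)" distinction above usually comes from how routers pick among equal-cost paths: a hash of the flow's 5-tuple selects one next hop, so every packet of a flow stays on the same path and ordering is preserved. The sketch below illustrates that idea under stated assumptions; the interface names and flow values are invented for the example.

```python
# Sketch of per-flow ECMP next-hop selection: hash the flow 5-tuple so all
# packets of a flow follow the same equal-cost path (load sharing, not
# per-packet balancing). Interface names and flow values are illustrative.
import hashlib

def ecmp_next_hop(flow, next_hops):
    """Deterministically map a flow 5-tuple onto one equal-cost next hop."""
    digest = hashlib.sha256(repr(flow).encode()).digest()
    return next_hops[int.from_bytes(digest[:4], "big") % len(next_hops)]

paths = ["1/7", "1/8"]  # two equal-cost links, each in its own VLAN/STG
flow = ("10.19.12.5", "10.19.40.9", 6, 5060, 32001)  # src, dst, proto, ports

# The same flow always hashes to the same link, so packets stay in order.
assert ecmp_next_hop(flow, paths) == ecmp_next_hop(flow, paths)
```

If one link fails, the surviving next hops are simply rehashed over, which is why ECMP recovers quickly from a single path failure.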
IP routing OSPF/BGP
RIP is not a High Availability routing protocol, as it takes too long to
recover from a failure and does not scale easily. For this reason, network
engineers need to evaluate IP protocols that provide better recovery times
and scale easily. Two such protocols are Open Shortest Path First (OSPF)
and Border Gateway Protocol (BGP). This is not to say RIP cannot be used
at the network edge to provide easy connectivity to end users, as RIP
allows for easy connection and introduction of elements into the network.
Network engineers need to build an IP routing hierarchy.
OSPF was the first Dijkstra-based routing protocol introduced into IP
routing. It was completed in 1992, and many of the same authors then went
on to the ATM forum to create Private Network-to-Network Interface
(PNNI). The next section covers OSPF first, and then points out where
OSPF and PNNI share similar characteristics.
The IP routing hierarchy for OSPF is based on having areas contained
within autonomous systems. Each autonomous system maintains its own
separate routing databases. Within each autonomous system, there are
areas defined that permit segregation of the routing database information
from the autonomous system. Within an area, all routers will completely
share all routing table information with all other routers. However, only a
summary of that information is shared between the areas. This form of
route summarization allows the routing table exchange to be smaller,
allowing for faster convergence times.
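The route summarization described above can be reproduced with Python's standard ipaddress module: four contiguous /24 networks collapse into the single /22 that an area border router would advertise.

```python
# Summarizing 192.10.0.0/24 through 192.10.3.0/24 into one /22, as in the
# Area 1 -> Area 2 summarization example.
import ipaddress

area1_routes = [ipaddress.ip_network(f"192.10.{i}.0/24") for i in range(4)]
summary = list(ipaddress.collapse_addresses(area1_routes))

print(summary)  # [IPv4Network('192.10.0.0/22')]
```

Advertising one /22 instead of four /24s is exactly the smaller routing table exchange, and faster convergence, that the paragraph above describes.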
OSPF does not share routing information between autonomous systems;
BGP is used to provide this function. BGP is essentially a distance-vector
protocol that uses autonomous systems, as OSPF does, along with
next-hop metrics. OSPF operates only within an autonomous system and
is called an Interior Gateway Protocol (IGP), whereas BGP usually
operates between autonomous systems and is called an Exterior Gateway
Protocol (EGP), although BGP is sometimes used internally.
[Figure: two OSPF autonomous systems, each with an Area 0 backbone plus subordinate areas (Areas 1 and 2; Areas 1 and 6), interconnected by BGP. Within Autonomous System 1, Area 1 holds eight /24 routes (192.10.0.0/24–192.10.3.0/24 and 192.20.0.0/24–192.20.3.0/24); only the two summarized routes, 192.10.0.0/22 and 192.20.0.0/22, are sent to Area 2, not all 8]
Figure 17-9: Address Summarization
PNNI
Private Network-to-Network Interface (PNNI) provides dynamic routing
and signaling. PNNI-based switching systems monitor network topology
and available resources where calls automatically route around points of
congestion and failure. Nodes in a PNNI network exchange topology and
link state information on an ongoing basis. In this way, nodes maintain an
up-to-date view of the state of the network. PNNI was developed
immediately after OSPF. In PNNI, the Dijkstra SPF algorithm was added
for automatic routing (along with a host of other features that extend far
beyond OSPF capabilities). These capabilities are only available on the
more expensive ATM switches typically used within Carrier systems.
Many of ATM's PNNI traffic engineering capabilities are replicated within
MPLS. Even at a very basic level, label swapping in MPLS is very similar
to header swapping in ATM, and it could be said that MPLS provides ATM-
like characteristics to the broader IP/Ethernet market.
Address matching
An incoming call setup request is received over the signaling channel of a
UNI interface. The call routing table is then scanned for the specified
called address. From the table, the switch selects the best matching address.
This address is the table entry with the most hexadecimal digits identical to
those of the called address. This process is called a maximal address match.
The call setup request is then forwarded to the next-hop UNI interface
associated with the best-match address. If no match exists, the call is
cleared back to the previous node.
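The maximal address match can be sketched as a longest-common-prefix search over hex digit strings. The routing table entries and called address below are invented for illustration.

```python
# Sketch of PNNI maximal address matching: pick the routing table entry that
# shares the most leading hex digits with the called address. Entries and
# addresses are illustrative.

def best_match(called, table):
    """Return (entry, interface) with the longest common hex prefix,
    or None if nothing matches (the call is cleared back)."""
    def common_prefix_len(a, b):
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n
    scored = [(common_prefix_len(called, entry), entry, ifc)
              for entry, ifc in table.items()]
    length, entry, ifc = max(scored)
    return (entry, ifc) if length > 0 else None

routing_table = {"47001122": "PP1", "470011": "PP3", "4720": "PP6"}
print(best_match("4700112299AB", routing_table))  # ('47001122', 'PP1')
```

The eight-digit entry wins over the six-digit one because more of its hex digits are identical to the called address, mirroring the "maximal address match" rule above.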
[Figure: a call from UNI A to UNI B forwarded across PNNI interfaces PP1–PP7, routed around blocked interfaces (PP4, PP2 marked blocked; PNNI interfaces with resources unavailable)]
Figure 17-10: Forwarding
[Figure: PNNI peer groups PG(A.1), PG(A.2), PG(A.3) containing nodes A.1.3, A.1.4, A.1.5, A.2.2, A.2.3, A.3.3, A.3.4; an SPVC from source '4710' set up to destination 4720 is re-optimized around a new node]
QoS Variance
Variance is provisioned per QoS. It allows a call to ‘see’ more available
paths in the network. The green lines in Figure 17-15 represent the ‘paths’
available for a call to be routed on.
[Figure: a setup request from source to destination with variance enabled — the call is load-balanced across the set of acceptable paths]
Figure 17-15: QoS variance under PNNI
For example, assume the middle two paths in Figure 17-15 provide a Cell
Transfer Delay (CTD) of 150 ms. The upper path provides a CTD of 250
ms and the lower path a CTD of 220 ms.
Without variance, if the call setup request requires a Cell Transfer Delay of
200 ms, only the middle two paths would be acceptable. This limits the
choices in routing the connection across the network. If there are many
such connections, the middle two paths will become more heavily loaded
than the only slightly different upper and lower paths.
If the QoS variance is set to 25%, all paths meeting a CTD metric of 250
ms (200 ms*125%) would be acceptable. Thus, the upper and lower paths
could be used and calls would have more paths to choose from. Combined
with one of the load balancing algorithms, this will provide excellent call
and load distribution.
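The variance arithmetic above can be checked with a short sketch. The path names and CTD values are taken from the example; the function itself is illustrative.

```python
# QoS variance check: with a 200 ms CTD requirement and 25% variance, any
# path at or under 200 ms * 125% = 250 ms is acceptable.

def acceptable_paths(paths, required_ctd_ms, variance_pct):
    """Return the paths whose CTD is within the relaxed requirement."""
    limit = required_ctd_ms * (1 + variance_pct / 100)
    return [name for name, ctd in paths.items() if ctd <= limit]

paths = {"upper": 250, "middle-1": 150, "middle-2": 150, "lower": 220}

print(sorted(acceptable_paths(paths, 200, 0)))   # only the two middle paths
print(sorted(acceptable_paths(paths, 200, 25)))  # all four paths qualify
```

With variance at 0% only the two 150 ms paths qualify; at 25% the 220 ms and 250 ms paths become acceptable as well, which is what spreads the load.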
What you are seeing is the ability of the ATM network to evaluate all of the
available links within the network and determine the ability of each
individual link to provide the desired QoS. This can be a very tricky thing;
however, the network engineer can use it to their advantage. By slightly
relaxing the QoS requirements on a service, the network gains a number of
parallel links to choose from, allowing ATM to load-share across those
links. This makes data transfer across the network more efficient. By
setting the connection QoS requirements too stringently, the network
engineer can choke the links, preventing connection completion even
while bandwidth is available.
Chapter 18
MPLS Recovery Mechanisms
Ali Labed
[Chapter-opener graphic: the book's protocol-stack view — media-related (audio, video, voice codecs), real-time control (RTP, RTCP, RTSP), session/application control (SIP, H.323), and gateway control (H.248/MGCP, NCS) over QoS, MPLS, AAL1/2, AAL5, packet cable, ATM, FR, Ethernet, DOCSIS, and SONET/TDM — with MPLS resiliency highlighted]
Concepts covered
The various MPLS protection schemes
The components of an MPLS recovery solution
LSP Setup using RSVP-TE
LSP monitoring, detection, and notification using RSVP-TE and
ITU-T Y.1711
MPLS Scope of Recovery - Global and Local
Introduction
“Network Survivability refers to the capability of the network to maintain
service continuity in the presence of faults within the network. This can be
Protection schemes
The expression protection scheme, or protection model, designates the
strategies noted 1+1, 1:1, 1:N and M:N. In all of these strategies, the
protection path must be disjoint from the corresponding working path on
the working path segment (link, node, or the whole path) that is targeted
for protection by the protection path. Two paths are said to be disjoint on a
given segment of the working path if a failure on one path does not cause a
failure on the other path on the same segment.
There are several protection schemes, corresponding to various degrees of
network availability. Either one protection path protects one working path,
namely the models 1+1 and 1:1, or one protection path protects N working
paths, noted 1:N. This latter model may be generalized by allowing M
protection paths to protect N working paths (M<N), noted M:N.
In the 1+1 strategy, the traffic travels on the working path LSP and on the
protection path LSP simultaneously. Obviously, the working and the
protection paths must be disjoint. Under normal operational conditions,
the LSP's egress Label Edge Router (LER) accepts the packets arriving on
the working path LSP and drops those arriving on the protection path LSP.
However, upon occurrence of a failure on the working path LSP, and after
the egress LER of that working path LSP receives the failure notification
(or detects the failure itself), the egress LER starts accepting packets
arriving on the protection path.
The traffic is carried only on the working path in the case of the 1:1 (1:N)
strategy. The protection path may carry a lower priority traffic that may be
preempted upon occurrence of a failure on the protected path (on one of the
protection paths). The M:N protection model is a variant of the 1:N model
where there are M protection paths instead of 1. In these three protection
models, all paths must be disjoint (that is, physically and logically
separated, except for the endpoints) in order to meet the requirement of
protection against single failures.
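The disjointness requirement above (paths may share nothing but their endpoints) can be expressed as a short check. The node names are illustrative, not from the text.

```python
# Sketch of the disjointness rule for working/protection path pairs:
# the two node lists may share only their ingress and egress LERs.

def disjoint_except_endpoints(working, protection):
    """True if the two paths share nothing but their endpoints."""
    if working[0] != protection[0] or working[-1] != protection[-1]:
        return False
    shared = set(working[1:-1]) & set(protection[1:-1])
    return not shared

wp = ["LER-A", "N1", "N2", "LER-Z"]
pp_good = ["LER-A", "N6", "N7", "LER-Z"]
pp_bad = ["LER-A", "N6", "N2", "LER-Z"]   # reuses N2, so a failure of N2
                                          # would take down both paths
print(disjoint_except_endpoints(wp, pp_good))  # True
print(disjoint_except_endpoints(wp, pp_bad))   # False
```

A planning tool would run such a check when associating protection paths with working paths, since a shared intermediate node defeats protection against single failures.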
Association-Configuration component
There should be a component that keeps track, at the decision nodes, of the
association between the Working Paths (WPs) and their corresponding
Protection Paths (PPs).
Monitoring component
In order to detect the faults (described in the Detection section), the
network must be monitored. An important issue addressed by the
monitoring component is the frequency of monitoring. The higher the
frequency of monitoring, the faster the defect is detected, but the higher the
overhead incurred. The recovery target time dictates the frequency of
monitoring.
Furthermore, monitoring may be available at the network layers that are
below the MPLS layer, such as the physical, and the data-link layers. At
those layers, the monitoring happens at higher frequencies than at higher
layers.
Detection component
The design of the detection functionality needs to address the issues of
which defects to detect, which component detects each of the defects, and
at which network layer. Mechanisms must be provided in order to detect
network connectivity defects. Furthermore, in the context of LSP
protection, there is a need for an MPLS layer OAM mechanism in order to
detect MPLS-fabric defects. There are two dimensions to this: sins of
omission (that is, a missing heartbeat or large gaps in sequence numbers)
and sins of commission (leakage of traffic). Defects may result in multiple
points of detection, not all of which may be able to perform notification. Network
impairments, such as congestion, that result in lower throughput are not
included because they are outside the scope of this chapter.
Notification component
Upon occurrence of a fault and after its detection, the “detecting” node
needs to have means to notify the decision node, and needs to know when it
should send its notification: immediately or after a predefined delay. In
other words, the Notification mechanism addresses the following issues:
Who to notify. The notification message needs to be transported
and forwarded from the detection node to the decision node.
When to notify. In order to allow a lower layer recovery
mechanism, if any, to take place, the MPLS layer must react to a
defect if it persists for a time defined to be long enough to allow the
lower layer to try to recover from the failure.
How to notify. The notification message needs a transport method.
Path Switching
After being notified of a failure on the working path of an LSP, the decision
node needs to switch traffic from the failed working path to the
corresponding protection path.
Routing component
Within the context of MPLS, the path of an LSP can be computed and
established in various ways. This task can be achieved by an Interior
Gateway Protocol (IGP), which chooses the (generally) shortest path.
Other ways to compute and establish an LSP path include manually,
automatically online using constraint-based routing processes, and
automatically offline using constraint-based routing entities implemented
on external support systems [INTERNET-TE].
“Constraint-based routing system refers to a class of routing systems that
compute routes through a network subject to the satisfaction of a set of
constraints and requirements” [INTERNET-TE]. Constraint-based routing
processes can be provided by a Traffic-Engineering system. The latter can
be centralized or distributed. In the centralized design, all decisions are
centralized as well as the necessary information about the network to take
those decisions. In the decentralized case, the decisions are taken by each
router autonomously based on the routers view of the state of the network,
which requires a protocol between these decision entities in order to
exchange information on that state.
In order to set up an LSP, the LSP ingress LER builds an RSVP-TE Path
message and sends it to the LSP egress LER. The Path message includes
the explicit route (ER) that must be followed by that message. The Path
message establishes an RSVP Path state on each node it traverses (and
must be listed in the ER object) towards the egress LER. In order to
respond to that Path message, the egress LER builds a Resv message and
sends it to the Ingress LER. The Resv message crosses, in the reverse
direction, the same path traversed by the corresponding RSVP Path
message and establishes an RSVP Resv state on each node it traverses
towards the Ingress LER (see Figure 18-2). Those states include the
parameters associated with the corresponding LSP.
[Figure: LSP setup across nodes Ingress, N1–N4, Egress — (1) a Path message travels downstream from the ingress to the egress; (2) a Resv message returns upstream along the same path]
Figure 18-2: LSP setup using RSVP-TE signaling protocol
The RSVP (Path and Resv) states are called “soft states”; that is, they need
a periodic refresh in order not to be deleted (and the corresponding LSP
torn down). For a given LSP and a given node traversed by that LSP, the
RSVP-TE refresh operates as follows:
The RSVP-TE process on the node periodically retransmits to its
downstream neighbor, the Path message.
The RSVP-TE process on the node periodically retransmits to its
upstream neighbor, the Resv message.
A node detects a failure when it does not receive the expected refresh
message from a neighbor after a (configurable) delay. Upon detection of a
failure, a node sends an RSVP tear message to the end node.
[Figure: a failure between N2 and N3 on the path Ingress, N1–N4, Egress — a ResvTear propagates toward the ingress and a PathTear toward the egress]
Figure 18-3: RSVP tear message upon detection of a failure
In order to detect faults in a timely fashion, the refresh messages must run
at a high frequency. However, this raises a scalability issue, as the refresh
mechanism is per LSP. This issue has been overcome in RSVP-TE, which
extends RSVP with several features and mechanisms, including a
keep-alive protocol called the RSVP-TE Hello protocol, described next.
RSVP-TE Hello
RSVP-TE Hello protocol provides node-to-node failure detection. It runs
between neighboring nodes (at the control plane). It is a Layer 3 keep-alive
mechanism that enables RSVP nodes to detect when a neighboring node is
not reachable. The neighbors periodically exchange keep-alive (Hello)
messages. The loss of communication with a neighbor is declared after a
configurable number (default=3) of consecutive Hello messages are
missing. A node can detect a loss of communication with a neighbor over a
specific link (when multiple links run between neighbors, a different
instance of RSVP Hello runs on each link). When such a fault is detected,
the detecting node reacts exactly as when a fault is detected through
non-reception of RSVP refresh messages.
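The Hello loss-detection rule above (neighbor declared down after a configurable number of consecutive missed Hellos, default 3) can be sketched as a simple miss counter. The class below is an illustrative model, not an implementation of any vendor's RSVP stack.

```python
# Sketch of RSVP-TE Hello failure detection: a neighbor is declared
# unreachable after `miss_limit` consecutive Hello intervals with no
# Hello message received (default 3, per the text).

class HelloMonitor:
    def __init__(self, miss_limit=3):
        self.miss_limit = miss_limit
        self.missed = 0
        self.neighbor_up = True

    def interval_elapsed(self, hello_received):
        """Call once per Hello interval; returns current neighbor state."""
        if hello_received:
            self.missed = 0          # any Hello resets the counter
        else:
            self.missed += 1
            if self.missed >= self.miss_limit:
                self.neighbor_up = False  # react as on a lost RSVP refresh
        return self.neighbor_up

mon = HelloMonitor()
for seen in [True, False, False]:
    mon.interval_elapsed(seen)
print(mon.neighbor_up)              # True: only two consecutive misses so far
print(mon.interval_elapsed(False))  # False: third consecutive miss
```

One monitor instance would run per link to a neighbor, matching the per-link Hello instances described above.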
In the context of LSP protection, there are two types of faults that need to
be detected:
Network connectivity faults. An interruption in the data path.
MPLS fabric defects. The data continue to be forwarded, but on a
wrong path, not the one that was configured for it to be forwarded
on. In other words, the symptom of MPLS fabric defects is the
sending, by any node on an LSP path, of packets of a certain LSP
on a different LSP. This symptom is called misrouting.
RSVP-TE Hello protocol detects network connectivity defects, but does
not detect MPLS fabric defects. There is a need for an MPLS layer OAM
mechanism in order to detect MPLS-fabric defects. Two mechanisms are
available to fulfill that role: ITU-T Y.1711 and IETF MPLS Ping. From a
functionality point of view (that of detecting MPLS table failures), both
mechanisms are similar. MPLS ping offers a supplementary functionality
that supports debugging. After occurrence of a failure, an operator can use
MPLS ping in order to locate the failure.
ITU-T Y.1711
ITU-T Y.1711 provides a mechanism for path continuity test in order to
detect “path failures.” In this scheme, the Ingress of an LSP periodically
inserts, in the LSP, a specific OAM packet—called Connectivity
Verification (CV)—into the concerned LSP (in-band). The Egress of the
LSP detects a defect on the LSP when it does not receive three consecutive
CV packets for that LSP, at which time it sends a Backward Defect
Indication packet (BDI) to the Ingress of that LSP to notify the Ingress
about the fault. Therefore, the Ingress is the only node that has the ability to
recover from a failure—called the Path Switch LSR (PSL).
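The egress behavior just described — declare a defect after three consecutive missing CV packets and send a BDI back to the ingress — can be sketched as follows. The CV period is nominally one per second in Y.1711; the function below is an illustrative model, not the recommendation's state machine.

```python
# Sketch of ITU-T Y.1711 egress monitoring: a defect is declared when three
# consecutive CV packets are missing, and a BDI is sent to the ingress (the
# PSL), which then switches traffic to the protection path.

def egress_cv_check(cv_seen_per_period, notify_ingress):
    """cv_seen_per_period: booleans, one per CV period. Returns True if a
    defect was declared (and a BDI sent)."""
    misses = 0
    for seen in cv_seen_per_period:
        misses = 0 if seen else misses + 1
        if misses == 3:
            notify_ingress("BDI")   # backward defect indication
            return True
    return False

alarms = []
defect = egress_cv_check([True, True, False, False, False], alarms.append)
print(defect, alarms)  # True ['BDI']
```

Two misses followed by a received CV reset the count, so transient single-packet loss does not raise a defect.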
ITU-T Y.1711 includes a mechanism, whereby, on any node on a path of an
LSP, a layer below the MPLS layer that detects a fault notifies the MPLS
layer. The MPLS layer sends a Forward Defect Indication (FDI) towards
the Egresses of the affected LSPs. As the lower layer defect detection and
[Figure: a failure between nodes 3 and 4 on an LSP traversing nodes 1–5 — the egress LER sends a BDI back to the ingress LER]
MPLS Ping
MPLS Ping is used to detect data-plane failures in MPLS LSPs. This
mechanism is modeled after the ping/traceroute philosophy. It operates
under two modes. The first mode is the fault detection at the data plane,
where MPLS Ping is used for connectivity checks, while the second mode
complements the first one by providing a fault isolation mechanism. In this
mode, Traceroute is used for hop-by-hop fault localization.
MPLS Ping may be used in various ways. The following depicts how it can
be used for LSP path continuity test, which belongs to the first mode. In
this case, MPLS Ping operation is similar to that of ITU-T Y.1711. The
Ingress of an LSP periodically inserts, in the LSP, an Echo packet and
expects to receive a Reply message from the expected (the configured) LSP
egress. A failure is detected when either a no reply is received or a different
egress responds with the inclusion of the corresponding error code.
However, unlike ITU-T Y.1711, MPLS Ping relies on many non-LSP
components, and a fault notification is not a reliable indication of an actual
problem on the LSP. Furthermore, with no alarm suppression mechanisms,
a ping failure is not coordinated with local detection mechanisms. If a link
fails, both the local LSRs and the pinging LSR will detect a problem in an
uncoordinated fashion.
Global recovery
In global recovery (abbreviation: Global), also called end-to-end recovery,
the working path LSP is protected by an end-to-end protection path LSP
(see Figure 18-5). The latter is preestablished and does not depend on use
of either Path-switching or rerouting.
The working path LSP and its corresponding protection path LSP are
disjoint; that is, they have the same end points1 (ingress and egress LERs)
and those are the only network elements that they have in common. In this
case, Global protects against link and node failures on the working path,
except for the ingress (and possibly the egress) LER. However, in certain
cases, Global protects only a segment of the working path LSP. In which
case, the protection path LSP starts at a node downstream from the ingress
LER node, called PSL (Path Switch LSR), and merges back with the
working path LSP at a node upstream from the egress LER (called PML:
Path Merge LSR).
In both cases, upon occurrence of a fault, the PSL or the ingress LER
receives a notification message. It is responsible for switching the traffic
from the working path to the protection path. The time the fault notification
message takes to reach the PSL (or the ingress) is important because it
makes the recovery time unacceptable for certain applications. Local
recovery overcomes this problem.
1. The Egress node of the working path LSP may be different from that of the protection path LSP, if the
destination of the traffic is reachable through another Egress. In this case, the protection path LSP
protects against failures on the working path Egress, as well.
Local recovery
In local recovery, a node traversed by an LSP protects against failures on a
subtending link or on a neighboring node, traversed by that LSP. In other
words, it is the node directly upstream of the failed component that is
responsible for detecting the failures and switching the traffic on an
alternate route (using Protection-switching or rerouting). That node is
called Point of Local Repair (PLR). The local protection path LSP merges
back with the main working path LSP, at a downstream node, called Path
Merge LSR (PML) (see Figure 18-5).
Local recovery achieves better recovery times than Global, since the
notification message does not have to travel upstream to the Path Switch
LSR (PSL) (the PLR is itself the PSL).
[Figure: a working path LSP through nodes 1–5 with possible failure points marked; an end-to-end protection path and local protection paths PP_1, PP_2, PP_3 through nodes 6–9 merge back into the working path]
Figure 18-5: Global (end-to-end) and local protection
Node recovery is required when the downstream node is deemed
unreliable; otherwise, link recovery is sufficient. Therefore, the choice
between these two alternatives is based on the degree of node reliability
and the target level of network availability.
References
RFC 2205, R. Braden et al., “Resource ReSerVation Protocol (RSVP),”
IETF, September 1997, http://www.ietf.org/rfc/rfc2205.txt
RFC 3209, D. Awduche et al., “RSVP-TE: Extensions to RSVP for LSP
Tunnels,” IETF, December 2001, http://ietf.org/rfc/rfc3209.txt
FRR_atlas, A. Atlas et al., “Fast Reroute Extensions to RSVP-TE for
LSP Tunnels,” IETF draft-ietf-mpls-rsvp-lsp-fastreroute-01.txt.
ITU-T Recommendation Y.1711, “OAM Mechanism for MPLS
Networks,” Study Group 13, International Telecommunication Union
Telecommunication Standardization Sector (ITU-T), February 2002.
Informative references
MPLS-ARCH, E. Rosen et al., “Multiprotocol Label Switching
Architecture,” IETF RFC 3031, January 2001, http://ietf.org/rfc/rfc3031.txt
MPLS-RECOV, V. Sharma et al., “Framework for MPLS-based Recovery,”
IETF draft-ietf-mpls-recovery-frmwrk-05.txt, http://search.ietf.org/
internet-drafts/draft-ietf-mpls-recovery-frmwrk-05.txt
NET-SURVIV, K. Owens et al., “Network Survivability Considerations for
Traffic Engineered IP Networks,” IETF draft-owens-te-network-
survivability-03.txt, http://search.ietf.org/internet-drafts/draft-owens-te-
network-survivability-03.txt
MPLS-TE, D. Awduche et al., “Requirements for Traffic Engineering Over
MPLS,” IETF RFC 2702, September 1999, http://www.ietf.org/rfc/rfc2702.txt
INTERNET-TE, D. Awduche et al., “Overview and Principles of Internet
Traffic Engineering,” IETF RFC 3272, May 2002, http://www.ietf.org/rfc/
rfc3272.txt
Chapter 19
Implementing QoS: Achieving
Consistent Application Performance
Ralph Santitoro
[Chapter-opener graphic: the book's protocol-stack view — media-related (audio, video, voice codecs), real-time control (RTP, RTCP, RTSP), session/application control (SIP, H.323), and gateway control (H.248/MGCP, NCS) over QoS, MPLS, AAL1/2, AAL5, packet cable, ATM, FR, Ethernet, DOCSIS, and SONET/TDM — with QoS highlighted]
SONET / TDM
Concepts covered
Mapping DiffServ to Layer 2 QoS mechanisms
Mapping DSCP to and from 802.1p user priorities
Mapping DiffServ to frame relay
Mapping DiffServ to ATM
Mapping DiffServ to PPP classes
Mapping DiffServ to MPLS E-LSPs and L-LSPs
Application performance requirements
Categorizing applications based on end user expectations and
performance objectives
Making QoS simple with network service classes
Introduction
Chapter 10 describes the many mechanisms that are used to provide good
QoS performance for real-time applications. Chapter 10 looked at QoS
from a bottom-up approach. This chapter takes a top-down approach to
implementing end-to-end QoS policies. First, the mapping between
DiffServ (IP QoS) and various Layer 2 QoS mechanisms are discussed.
This is followed by a discussion of the performance requirements and
categorization of applications supported over a converged network. Finally,
an approach to simplify QoS is discussed, which is based on network
service classes that provide common QoS policies for popular real-time
and non-real-time applications with similar QoS performance
requirements.
DSCP                              Maps to 802.1p value
CS7                               7
CS6                               7
EF, CS5                           6
AF41, AF42, AF43, CS4             5
AF31, AF32, AF33, CS3             4
AF21, AF22, AF23, CS2             3
AF11, AF12, AF13, CS1             2
DF, CS0, all undefined DSCPs      0

DSCP                  Discard bit (ATM CLP / frame relay DE)
CS7, CS6              0
EF, CS5               0
AF41, CS4             0
AF42, AF43            1
AF31, CS3             0
AF32, AF33             1
AF21, CS2             0
AF22, AF23            1
AF11, CS1             0
AF12, AF13            1
DF, CS0               1
1. In general, packets marked with the CS7 DSCP should be mapped to rt-VBR. However, critical
protocols that provide constant-rate heartbeats require the lowest loss and delay for optimal
network operation. Such protocol packets should be mapped to CBR.
2. In general, packets marked with the EF DSCP should be mapped to rt-VBR. However, circuit
emulation over IP application packets marked with the EF DSCP should be mapped to CBR.
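The DSCP-to-802.1p mapping above can be applied as a simple lookup, with undefined DSCPs defaulting to user priority 0 as the table specifies. One assumption to flag: the extraction shows the CS6 row without a clear value, so CS6 is assumed here to share priority 7 with CS7.

```python
# DSCP -> 802.1p lookup per the mapping table above. CS6 -> 7 is an
# assumption (its value is ambiguous in the source); undefined DSCPs -> 0.

DSCP_TO_8021P = {
    "CS7": 7, "CS6": 7,
    "EF": 6, "CS5": 6,
    "AF41": 5, "AF42": 5, "AF43": 5, "CS4": 5,
    "AF31": 4, "AF32": 4, "AF33": 4, "CS3": 4,
    "AF21": 3, "AF22": 3, "AF23": 3, "CS2": 3,
    "AF11": 2, "AF12": 2, "AF13": 2, "CS1": 2,
    "DF": 0, "CS0": 0,
}

def dscp_to_8021p(dscp):
    return DSCP_TO_8021P.get(dscp, 0)  # all undefined DSCPs map to 0

print(dscp_to_8021p("EF"))    # 6
print(dscp_to_8021p("AF22"))  # 3
print(dscp_to_8021p("XYZ"))   # 0
```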
The short sequence number format (four classes) is used for connections
operating at less than 1 Mbps. Popular low-speed WAN connections in use
today are 56 kbps, 64 kbps, 128 kbps, 256 kbps and 384 kbps. Additionally,
popular DSL services use PPP over Ethernet (PPPoE) over DSL
connections, which have symmetrical or asymmetrical connection speeds
similar to the aforementioned low-speed WAN connections.
The short sequence number format only allows a subset of the DiffServ
PHBs to be supported. Table 19-5 provides an example mapping. Other
mapping arrangements are possible. Typically, the EF-marked packets are
marked with the PPP class number corresponding to the PPP class that
provides the lowest forwarding delay.
mapping of the EXP bits to PSC can be either explicitly signaled during
label setup or statically configured. Since there are many ways to configure
the EXP bits to support PSCs, a flexible mapping approach is required.
Table 19-6 provides an example DSCP to EXP mapping that maximizes
the number of DiffServ PSCs that can be supported.
Table 19-8: Example DSCP to EXP mapping for L-LSPs using labels 100-110
E-LSPs support many DiffServ PSCs per E-LSP compared to L-LSPs that
support only one DiffServ PSC per L-LSP. Fewer LSPs simplify network
operations, administrations, management and provisioning (OAM&P),
resulting in lower cost of operations. This is why E-LSPs are
predominantly used when implementing QoS in MPLS networks.
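Because the 3-bit EXP field gives eight code points, an E-LSP can carry up to eight PHB scheduling classes on a single LSP. The mapping below is an illustrative packing of common DiffServ PSCs into those eight values — it is not a reproduction of the book's Table 19-6, whose body is not shown here.

```python
# Illustrative DSCP-class -> EXP mapping for an E-LSP: the 3-bit EXP field
# (values 0-7) selects the PHB, so up to eight PSCs fit on one LSP.
# This specific assignment is an assumption, not the book's Table 19-6.

EXP_BITS = 3
DSCP_TO_EXP = {
    "CS7": 7, "CS6": 6,
    "EF": 5,
    "AF4": 4, "AF3": 3, "AF2": 2, "AF1": 1,
    "DF": 0,
}

# Every EXP value must fit in the 3-bit field.
assert all(0 <= exp < 2 ** EXP_BITS for exp in DSCP_TO_EXP.values())
print(DSCP_TO_EXP["EF"])  # 5
```

An L-LSP, by contrast, would carry only one of these classes per label, which is why many more L-LSPs than E-LSPs are needed for the same set of PSCs.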
Performance Dimensions

Example Applications             Bandwidth Needs   Sensitivity to:
                                                   Delay   Jitter   Loss
IP Telephony (VoIP)              Low               High    High     Med-High
Interactive Video Conferencing   Med-High          High    High     High-Med
Streaming Video on Demand        Med-High          Med     Med      Med
Streaming Audio (Webcasts)       Low               Med     Low      Med
Client / Server Transactions     Med               Med     Low      High
Email                            Low               Low     Low      High
File transfer                    Med-High          Low     Low      High
Categorizing Applications
Networked applications can be categorized based on end-user expectations
or application performance requirements. Some applications are between
people while other applications are between a person and a networked host;
for example, a PC (user) and a web server. Finally, some applications are
between networking devices (for example, router to router).
Applications can be divided into four different traffic categories: namely,
Network Control, Interactive, Responsive and Timely. Refer to Table 19-
10. The table includes some representative applications in the different
categories.
Interactive Applications
Some applications are “interactive”; whereby, two or more people actively
participate. The participants expect the networked application to respond in
real time. In this context, real time means that there is minimal delay
(latency) and delay variation (jitter) between the sender and receiver. Some
interactive applications, such as a telephone call, have operated in real time
over the telephone companies' circuit switched networks for decades. The
QoS expectations for real-time voice applications have been set and,
therefore, must also be achieved as voice applications are migrating from
being circuit-based to being packet-based (for example, VoIP).
Other interactive applications include video conferencing and interactive
gaming. Since the interactive applications operate in real time, packet loss
must also be minimized. Imagine a telephone call where whole or partial
words regularly get lost during the conversation. This level of QoS
performance would not only be unsatisfactory but would make the
application (telephone call) not very usable.
Interactive applications typically use the User Datagram Protocol (UDP)
and, hence, cannot retransmit lost or dropped packets as Transmission
Control Protocol (TCP)-based applications can. However, lost packet
retransmission would not be beneficial because interactive applications are
time-based. For example, if a voice packet was lost, it doesn't make sense
for the sender to retransmit it because the conversation has progressed in
time and the lost packet might be from part of the conversation that had
already passed in time.
Responsive Applications
Some applications are between a person and a networked host or
application. End users require these applications to be “responsive”, so a
request sent to the networked host requires a relatively quick response back
to the sender. These applications are sometimes referred to as being “near
real-time” and require relatively low packet delay, jitter and loss. However,
QoS performance requirements for responsive applications are not as
stringent as for the interactive (real-time) applications. This category
includes streaming media and client/server transaction-oriented
applications.
Streaming media applications (for example, movies on demand or
webcasts) require the network to be responsive when they are initiated so
the user doesn't wait too long before the media begins playing. These
applications also require the network to be responsive for certain types of
signaling. For example, with movies on demand, when one changes
channels or “forwards”, “rewinds” or “pauses” the media, one expects the
application to react similarly to the response time of their 'standalone' video
player controls.
Web-based applications involve a user selecting a hyperlink to jump to a
new page or submit information; for example, place an order or submit a
request. These applications also require the network to be responsive, such
that once the hyperlink is selected, a response (for example, a new page
begins loading) occurs typically within one to two seconds. With
broadband Internet access connections, this type of performance is often
achieved over a best-effort network, albeit somewhat inconsistently. Other
Timely Applications
Some applications between a person and networked host or application
require 'timely' and reliable delivery of the information in 'minutes' instead
of 'seconds'. Such applications include e-mail and file transfer. The relative
importance of these applications is based on their business priorities.
These applications require that the packets arrive within a bounded delay.
For example, if an e-mail takes a few minutes to arrive at its destination,
this is acceptable. However, in a business environment, if an e-mail took
ten minutes to arrive at its destination, this may be unacceptable. The same
bounded delay applies to file transfers. Once a file transfer is initiated,
delay and jitter are less critical because file transfers often take many
minutes to complete. Note that these timely applications use TCP-based
transport and, therefore, packet loss is managed by TCP, which retransmits
any lost packets, resulting in no packet loss.
Timely applications expect the network to deliver packets within a bounded
amount of delay. Jitter has a negligible effect on these types of applications,
and packet loss is reduced to zero due to TCP's loss recovery mechanisms.
Network Control Applications
Network control traffic is given minimal loss and delay because the
network must be operating properly in order for it to provide proper QoS
performance for the end-user applications.
Network control applications require a relatively low amount of delay.
Jitter has a negligible effect on these types of applications and packet loss
must be minimized since some of these applications are not transported via
TCP and, hence, do not have packet loss recovery mechanisms.
Performance Characteristics

NSC | Target Applications | Delay Tolerance | Jitter Tolerance | Loss Tolerance | Traffic Profile
Critical | Critical heartbeats between nodes | Very Low | N/A | Low | Very small packets
Network | COPS, RSVP; DNS, DHCP, BootP; high-priority OAM | Low | N/A | Very Low | Variable-sized packets
Premium | VoIP (G.711, G.729 and other codecs); Lawful Intercept; T.38 Fax over IP; Circuit Emulation over IP | — | — | — | Typically, fixed-sized packets
Platinum (Timely) | Billing record transfer; non-critical OAM&P (SNMP, TFTP) | High | N/A | Low | Variable-sized packets
Standard | — | — | — | — | —
Nodal QoS Mechanism: Premium NSC default settings
Classifier: For trusted interfaces, all packets marked with Expedited
Forwarding (EF) DSCP or Class Selector (CS) 5 DSCP are placed into the
Premium NSC.
Marker: Once classified from untrusted interfaces, mark voice media
packets with EF DSCP and voice signaling packets with CS 5 DSCP.
Policer: Meter packets to the configured rate and committed burst size
(for example, CIR and CBS). Drop packets exceeding the configured rate
or burst size.
Link Layer QoS: For Ethernet interfaces, mark all packets with 802.1p
user priority 6. For ATM interfaces, use ATM service category rt-VBR and
mark all packets with CLP=0. For Frame Relay interfaces, mark all
packets with DE=0. For multiclass PPP interfaces, mark all packets with
PPP Class Number 3.
Scheduler: Use a priority scheduler.
Queue Management: Use tail drop queue management. Disable Active Queue
Management (AQM) techniques such as WRED.
Shaping: Disable shaping.
Table 19-12: Premium NSC default nodal settings
Table 19-13 provides a summary of traffic conditioning per NSC.
NSC | DiffServ PHB Group | Standard DSCPs per NSC | Policing action for out-of-profile traffic | Scheduler type | Queue mgmt.
Critical | CS | CS7 | Drop | Priority 1 | Tail drop
Network | CS | CS6 | Drop | Weighted | Tail drop
Premium | EF | EF, CS5 | Drop | Priority 2 | Tail drop
Platinum | AF4 | AF41, AF42, AF43, CS4 | Remark | Weighted | AQM
Gold | AF3 | AF31, AF32, AF33, CS3 | Remark | Weighted | AQM
Silver | AF2 | AF21, AF22, AF23, CS2 | Remark | Weighted | AQM
Bronze | AF1 | AF11, AF12, AF13, CS1 | Remark | Weighted | AQM
Standard | DF | DF, CS0 | None (managed by AQM) | Weighted | AQM
3. The Premium NSC traffic uses a higher PPP class number because it is scheduled over a connection
before Critical and Network NSC traffic to minimize any jitter that would be introduced by the
network control traffic over low bandwidth (<1 Mbps) connections.
IP Telephony Flow Type | Application-to-Application | Network Service Class | Default DSCP | Default 802.1p User Priority
Voice Media (bearer) | Gateway-Gateway | Premium | EF | 6
Voice Media (bearer) | Terminal-Terminal | Premium | EF | 6
Voice Media (bearer) | Terminal-Gateway | Premium | EF | 6
Voice Signaling (Control) | Terminal-Terminal | Premium | CS5 | 6
Voice Signaling (Control) | Terminal-Gateway | Premium | CS5 | 6
Voice Signaling (Control) | Gateway-Gateway Controller | Premium | CS5 | 6
Voice Signaling (Control) | Gateway Controller-Gateway Controller | Premium | CS5 | 6
Fax (T.38) | Gateway-Gateway | Premium | CS5 | 6
Table 19-15: Default markings for telephony gateways, terminals, and gateway controllers
For example, the G.729 codec produces 8 kbps of "raw voice" bandwidth,
which is then encapsulated into an IP payload that results in 24 kbps of
IP bandwidth.
Voice compression is typically not used over high bandwidth, Ethernet
connections. Uncompressed voice is encoded using the ITU G.711 codec
with the voice samples encapsulated into an IP packet. This results in 80
kbps of IP bandwidth using 20 ms voice samples. This is quite small
compared to 10 Mbps or 100 Mbps of Ethernet bandwidth.
In addition to the IP bandwidth, the Layer 2 protocols (for example,
Ethernet, PPP, frame relay or ATM) used to transport the IP packet must be
included when calculating the total bandwidth.
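The bandwidth figures above can be reproduced with simple arithmetic. The sketch below is illustrative Python of our own (not from this book): the 40-byte RTP/UDP/IP overhead is the standard header total, while the 38 Ethernet framing bytes used in the last call (header, FCS, preamble, and inter-frame gap) are one common accounting convention.

```python
def voip_bandwidth_bps(codec_bps, sample_ms, overhead_bytes=40):
    """IP (or link) bandwidth of one voice stream.

    overhead_bytes defaults to IP (20) + UDP (8) + RTP (12) headers;
    add Layer 2 framing bytes on top for total link bandwidth.
    """
    payload_bytes = codec_bps / 8 * sample_ms / 1000  # voice bytes per packet
    packets_per_sec = 1000 / sample_ms
    return (payload_bytes + overhead_bytes) * 8 * packets_per_sec

# G.711 (64 kbps raw) with 20 ms samples -> 80 kbps of IP bandwidth
print(voip_bandwidth_bps(64000, 20))          # 80000.0
# G.729 (8 kbps raw) with 20 ms samples -> 24 kbps of IP bandwidth
print(voip_bandwidth_bps(8000, 20))           # 24000.0
# G.711 over Ethernet, counting 38 framing bytes per frame
print(voip_bandwidth_bps(64000, 20, 40 + 38)) # 95200.0
```

The same function covers both examples in the text: the G.729 8-to-24 kbps case and the G.711 80 kbps case.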
Note that as the 200 ms delay budget is exceeded, most users tend to
perceive the delay, resulting in dissatisfaction with voice quality.
Every time a VoIP packet
passes through a device or network connection, delay is introduced. A
significant amount of delay can be introduced over low-bandwidth
connections.
Note that with PPP fragmentation, the packets are only fragmented over the
PPP connection. With IP fragmentation, packets are fragmented from
source to destination resulting in reduced application performance. For
PPP fragmentation, the fragment size of the data packet is selected based
on the maximum size of one VoIP packet. Depending upon the voice codec
used and the number of voice samples per voice payload, the PPP
fragmentation size will vary. Refer to Table 19-16.
minimize the delay and jitter that other traffic can introduce to the VoIP
packets.
Packet Reordering
In some cases there may be multiple paths for a VoIP packet to take when
traveling from its source to its destination. If all VoIP packets do not take
the same path, then packets could arrive out of order. This can cause voice
quality issues, even though packet reordering often has little or no adverse
effect on data application quality, since reordered or lost data packets can
be retransmitted via the TCP protocol.
If two locations connect via two frame relay PVCs, one must ensure that all
VoIP packets for a particular call traverse the same PVC. The routers can
be configured to direct the voice packets from the same source/destination
IP address to traverse the same PVC. Another approach is to configure the
router to send all voice traffic over one of the PVCs.
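A deterministic hash of the source/destination address pair is one way a router can pin all of a call's packets to a single PVC. The sketch below is a hypothetical illustration (the function and PVC names are ours, not a router configuration):

```python
import zlib

def select_pvc(src_ip, dst_ip, pvcs):
    """Pick a PVC for a flow; the same src/dst pair always maps to the
    same PVC, so every packet of a call takes the same path."""
    key = f"{src_ip}->{dst_ip}".encode()
    return pvcs[zlib.crc32(key) % len(pvcs)]

pvcs = ["pvc-1", "pvc-2"]
choice = select_pvc("10.0.0.1", "10.0.1.1", pvcs)
# Repeated lookups for the same flow return the same PVC.
```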
References
RFC 2597, J. Heinanen, et al., "Assured Forwarding PHB Group," IETF,
http://www.ietf.org/rfc/rfc2597.txt
ATM Forum AF-TM-0121.000 Version 4.1, "Traffic Management
Specification," ftp://ftp.atmforum.com/pub/approved-specs/af-tm-0121.000.pdf
RFC 2474, K. Nichols, et al., "Definition of the Differentiated Services
Field (DS Field) in the IPv4 and IPv6 Headers," IETF,
http://www.ietf.org/rfc/rfc2474.txt
RFC 2475, S. Blake, et al., "An Architecture for Differentiated Services,"
IETF, http://www.ietf.org/rfc/rfc2475.txt
RFC 3270, F. Le Faucheur, et al., "Multi-Protocol Label Switching
(MPLS) Support of Differentiated Services," IETF, May 2002,
http://www.ietf.org/rfc/rfc3270.txt
RFC 3246, B. Davie, et al., "An Expedited Forwarding PHB," IETF,
http://www.ietf.org/rfc/rfc3246.txt
IEEE 802.1Q, "Virtual Bridged Local Area Networks,"
http://standards.ieee.org/getieee802/download/802.1Q-2003.pdf
R. Santitoro, "Introduction to Quality of Service (QoS)," April 2003,
http://www.nortelnetworks.com/products/02/bstk/switches/bps/collateral/56058.25_022403.pdf
RFC 1990, K. Sklower, et al., "The PPP Multilink Protocol (MP)," IETF,
August 1996, http://www.ietf.org/rfc/rfc1990.txt
RFC 2686, C. Bormann, "The Multi-Class Extension to Multi-Link PPP,"
IETF, September 1999, http://www.ietf.org/rfc/rfc2686.txt
RFC 1973, W. Simpson, "PPP in Frame Relay," IETF, June 1996,
http://www.ietf.org/rfc/rfc1973.txt
FRF.12, "Frame Relay Fragmentation Implementation Agreement,"
December 1997, http://www.mplsforum.org/frame/Approved/FRF.12/frf12.pdf
Chapter 20
Achieving QoE: Engineering Network
Performance
François Blouin
[Figure content: view of real-time applications from a QoS perspective —
audio, video and voice codecs carried over RTP with RTCP; session and
gateway control via SIP, H.323, RTSP, and H.248 / MGCP / NCS; transported
over MPLS, ATM (AAL1/2, AAL5), Frame Relay, Ethernet, and cable (DOCSIS),
with SONET / TDM at the bottom of the stack and packet resiliency as a
cross-cutting concern.]
Concepts covered
Why network engineering is important
E-Model
Real-time applications engineering & planning process
Hypothetical reference connection
Echo control
Budgeting delay & jitter
Silence suppression considerations
Router schedulers and buffer sizing
Introduction
Historically, TDM voice networks were engineered to ITU-T quality
standards; that is, the end-to-end TDM impairment budget was well
understood, and each operator's network was allowed a distinct portion of
that budget. Element requirements were well-defined and measurable. Where
packet voice standards exist, they are rudimentary and incomplete, and
require further development to meet the needs of packet-based application
planning and engineering.
Packet networks are being deployed as replacements for TDM for
switching voice and multimedia traffic. Packet transmission changes the
impairment budget, and network design has to consider additional
impairments such as delay, distortion, and jitter. The additional packet
network impairments need to be characterized and modeled to accurately
predict the application performance and define workable operating
margins.
Some network impairments are unavoidable: propagation delay (physics),
packetization delay, and legacy equipment. However, many can be
engineered to achieve predictable, acceptable voice quality through careful
control of the remaining impairment margin. Network planning and
engineering provide guidance on the correct choices for each parameter
including optimal packet size, jitter, total end-to-end delay, loss plan, echo
control, choice of codec, link speed, buffer dimensioning and so on.
Engineering is complicated by evolutionary migration to packet networks,
which creates “islands” of packet transmission in the global multioperator
TDM network. The large number of operators offering voice services and
the lack of packet interface standards result in the use of TDM to patch
together different packet domains. Each conversion between TDM and
packet networks adds significant impairment to the connection. Network
planning should be done to avoid TDM hops between packet islands. In
this chapter, potential issues related to real-time voice over packet and data
applications will be described along with mitigations and best practice
engineering guidelines.
This section describes in more detail the method and process for
engineering a network to deliver acceptable end-user QoE, also referred as
QoE engineering. There is a distinction between QoE engineering and
traffic engineering. Traffic engineering is the prevailing technique for
mapping traffic flows to ensure optimally-utilized bandwidth and prevent
congestion build up. Traffic engineering is one of the necessary steps of
QoE engineering, but it is not sufficient. To meet specific service quality
targets and user QoE, traffic engineering needs to be supplemented by
additional considerations highlighted in the QoE engineering process. In
order to deliver acceptable service quality and even differentiated services,
QoE should be part of the engineering methodology process; hence, a top-
down approach is proposed as an effective technique to deliver enhanced
customer value while performing network engineering. A four-step QoE
engineering process is shown in Figure 20-2 and will be discussed in the
remaining sections of this chapter for both voice and data services.
Conversational Voice

Conversational voice (CBR and VBR) performance targets:
- R-factor: ∆R (PSTN − packet) < 3R [1]
- Delay: < 150 ms
- Distortion: Ie < 3R
- Path interruptions due to failure: frequent interruptions of 80 ms
affect speech intelligibility; infrequent interruptions of 3 sec are
perceived as a call drop.

1. R: Model transmission rating. R is the predicted output quality index of the E-model (ITU-T G.107). See
"Chapter 3 Voice Quality" for details.
[Figure content: four classes of voice QoE impairments mapped onto the
E-model "R" scale (R 90-100: users very satisfied; 80-90: satisfied;
70-80: some users dissatisfied; 60-70: many users dissatisfied; 50-60:
nearly all users dissatisfied; delay axis 0-300 ms):
1. Speech coding distortion — introduced by the codec and any
potential transcoding.
2. Talker echo — arising from the 2-wire/4-wire hybrid, loop loss, and
loss pads in the echo path; echo becomes audible when one-way delay
exceeds about 20 ms.
3. Late/lost packets — packet loss is caused by buffer overflow within
the network; in congestion situations, packets are lost and jitter is
uncontrolled, leading to late packets.
4. Delays — (a) queuing and jitter, where large buffer sizes add to
overall delay; (b) network delay, including propagation due to distance
and serialization delay on low-speed links; (c) packetization delay,
including intrinsic speech sample accumulation delay plus DSP processing
overhead.]
Figure 20-3: Summary of the voice QoE impairments and the impact on the
E-Model “R”
[Figure content: loss plan for a 2-wire/4-wire analog reference
connection (Side A SLR = +11 dB, Side B RLR = −3 dB, 6 dB loss pads, PSTN
echo path through the hybrid), together with E-model R versus one-way
delay curves for TELR values stepping from 69 dB down to 29 dB.
Decreasing TELR lowers R at any given delay: an analog connection that
employs only 6 dB loss pads for echo control degrades far faster with
delay than one employing echo cancellers (ECANs, ERL = 55 dB).]
HRXs
It has been common practice for carriers to use HRXs to determine budget
allocation standards for nodal roles within their networks, such as
transmission, switching and access delays, loudness ratings, echo signal
path performance and quantization distortion. The connection is
hypothetical in the sense that it takes a sensible stab at the distances and
number and type of equipment involved, usually lumping together common
parts, rather than being a model of a real network connection.
[Figure content: example hypothetical reference connections through media
gateways (MG) with echo cancellers (EC) and chains of routers over
various distances.]
[Figure content: nodal QoS datapath — incoming flows pass through a
classifier, marker, and meter/policer (with a drop or re-mark decision),
then through a dropper into per-class queues served by a scheduler and
shaper onto the outgoing flows.]
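The meter/policer stage in this datapath is commonly implemented as a token bucket. The following sketch is an illustrative Python model of our own (not a Nortel implementation): packets that exceed the configured committed information rate (CIR) and committed burst size (CBS) are declared out of profile.

```python
class TokenBucketPolicer:
    """Single-rate token bucket: conforms packets up to CIR with bursts
    up to CBS; anything beyond is out of profile (drop or re-mark)."""

    def __init__(self, cir_bps, cbs_bytes):
        self.rate = cir_bps / 8.0   # token refill rate, bytes/second
        self.cbs = cbs_bytes        # bucket depth, bytes
        self.tokens = cbs_bytes     # bucket starts full
        self.last = 0.0             # timestamp of last packet

    def conform(self, size_bytes, now):
        # Refill tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.cbs, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if size_bytes <= self.tokens:
            self.tokens -= size_bytes
            return True   # in profile: forward
        return False      # out of profile: drop or re-mark

# CIR = 80 kbps, CBS = 400 bytes (two 200-byte voice packets of burst)
p = TokenBucketPolicer(cir_bps=80000, cbs_bytes=400)
```

A burst of two 200-byte packets conforms, a third back-to-back packet does not, and after 20 ms enough tokens accumulate to pass the next packet.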
2. Hard QoE means the service quality is guaranteed at all times, under any conditions and any load and traffic
patterns; it is an absolute limit.

Traffic Engineering/QoS Mechanisms | Hard QoE | Soft QoE
Traffic classification | √ | √
Traffic buffering | √ | √
Traffic scheduling | √ | √
BW reservation/dynamic provisioning | √ | —
Traffic demands
One of the first steps in traffic engineering is to identify traffic demands
and traffic sources in terms of distribution, characteristics and aggregate
volume. Traffic demands can be classified in at least two categories:
Constant Bit Rate (CBR) and Variable Bit Rate (VBR). Once traffic
demands have been established, network resources (buffer, scheduler share,
link size) can be allocated. The following sections will describe voice,
video and data traffic source demands.
[Figure content: North American TDM multiplexing hierarchy — 24 DS0s per
DS1, 28 DS1s per DS3, 3 DS3s per OC-3. Maximum number of voiceband
sessions/channels on an OC-3 = 24 × 28 × 3 = 2016 DS0s;
2016 × 64 kbit/s = 129 Mbit/s.]
[Figure content: circuit-to-packet conversion — n DS-1s from the TDM
network carried over an OC-3c link into the packet network.]
[Figure content: average bandwidth per call (0 to 30 on the y-axis)
versus number of sources (1, 4, 24, 32, 48, 64, 256) for G.729 10 ms
calls with and without silence suppression (VAD), CU = 5 ms, compared
against the average link bandwidth.]
Figure 20-12: Comparison of CBR and VBR voice traffic demand per call for
G.729/AAL-2 codec with/without silence suppression
The columns in Figure 20-12 indicate the average traffic, while the thin
bars indicate the peak bandwidth generated per voice call for the silence
suppression-enabled VBR sources. This graph highlights the fact that
when only a few silence suppression-enabled sources are multiplexed, the
peak traffic exceeds the CBR sources; therefore, it is not an effective
solution. (Talker model assumes average talkspurt of 0.352 sec, silence gap
of 0.650 sec).
One of the most important elements affecting silence suppression-enabled
voice sources is the voice activity level. Bandwidth required to support
voice calls with silence suppression depends primarily on the voice activity
level; that is, the ratio of talkspurt/(talkspurt + silence) and the mix of voice
calls and voiceband data. The talkspurt/silence distribution types of the
several talker models reported in the literature do not exhibit
significant differences in the voice traffic profiles generated at the
aggregate level, and are thus less critical than the effective voice
activity level. The voice activity level varies depending on the talker model used
and the type of service. It generally varies from 35% to 55% for
conversational voice and up to 88% for scripted speech. No single talker
model fits all voice-based applications and services. The voice activity
level assumption is key to effective silence suppression engineering and
design, but it is not fully understood; further characterization is required.
At this time, the 55% activity level is a conservative value for engineering.
Figure 20-13 shows the maximum number of voiceband sessions [1] for
North American TDM and SONET-based capacity on an OC-3 link based
upon the Central Limit Theorem (CLT). This engineering graph can serve
for capacity planning. A reference line has been set to 2016, which
indicates the number of voiceband sessions carried over a TDM
infrastructure; therefore, all the points above it indicate the operating
conditions where silence suppression would offer equivalent or superior
capacity.
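The CLT-based sizing behind this kind of engineering graph can be sketched as follows. This is an illustrative Python sketch of our own: the per-call peak rate, the 55% activity level, and the z = 3 safety factor are assumptions for the example, not the parameters used to produce Figure 20-13.

```python
import math

def max_voiceband_sessions(link_bps, call_peak_bps, activity, z=3.0):
    """Largest N calls whose CLT-estimated aggregate demand fits the link.

    Each call is modeled as on (at call_peak_bps) with probability
    `activity`; the aggregate is approximated as Normal with
    mean = N*p*r and std = r*sqrt(N*p*(1-p)); z sets the overflow margin.
    """
    n = 0
    while True:
        mean = (n + 1) * activity * call_peak_bps
        std = call_peak_bps * math.sqrt((n + 1) * activity * (1 - activity))
        if mean + z * std > link_bps:
            return n
        n += 1

# 142 Mbit/s of usable OC-3 payload, 80 kbps peak per call, 55% activity
n = max_voiceband_sessions(142e6, 80000, 0.55)
```

Under these assumptions the silence-suppressed capacity comfortably exceeds the 2016-session TDM reference line, consistent with the shape of the figure.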
[Figure content: maximum voiceband sessions (y-axis, approximately 1400
to 2800) versus voice activity (VAD) level (0.45 to 0.60) with increasing
voiceband data contribution; the 100% voice curve corresponds to
142 Mbit/s.]
[1]: Voiceband sessions include voice calls and voiceband data.
142 Mbit/s is derived from OC-3 minus SONET overhead minus 5% bandwidth for call control/signaling.
129 Mbit/s is derived from 2016 channelized DS0s at 64 kbit/s each.
Figure 20-13: Maximum number of voiceband sessions [1] for North America
TDM and SONET-based OC-3 capacity as a function of the
voice activity level
[Table content, partially recoverable: Web-browsing and e-commerce short
flows — inline object size: mean 7.7 Kbytes, stdev 125 Kbytes, Lognormal
distribution; number of inline objects: min 1, max 10, mean 5, uniform
distribution; a further size parameter with stdev 25 Kbytes and a Weibull
distribution. Telnet — packet size: mean 40 bytes, deterministic
distribution; session parameter: mean 300 sec, Extreme distribution
(a=80, b=5.7).]
Table 20-5: Typical data application traffic demands based upon industry
published papers and internal Nortel research
Figure 20-14 shows the aggregate data user traffic demand requirement for
various QoE levels. As expected, the bandwidth requirement varies as a
function of the number of users as well as with the level of QoE provided.
This engineering table has been derived from network simulation based
upon a predetermined data user traffic profile (see Table 20-5) and, as one
of many typical traffic profiles, is provided as a guideline only. There is no
“one size fits all” traffic profile that would suit all Enterprise type
businesses and customers. So it is expected that some characterization
would be required to derive accurate traffic demand matrices for specific
user profiles. In general, it was found that the average individual data user
traffic would vary from 20 kb/s up to 130 kb/s, depending on the number of
sources, aggregation level and QoE targets. The level of aggregation is
highly beneficial due to the statistical multiplexing. For example, for a ten-
user network, it would require about 130 kb/s for each user to deliver
optimal QoE, while it would only need 32 kb/s for a 2000 user network.
Example 5. Determine the Enterprise WAN access link bandwidth
requirement for 100 users. A 100
user network would require 6.3 Mbits/s link bandwidth to deliver
optimal QoE, while it would require about half (2.8 Mbits/s) to
provide acceptable QoE.
[Figure content: aggregate bandwidth (Mbit/s, 0 to 50+ on the y-axis)
versus number of users (10 to 10000, log scale); a 100-user network
requires 6.3 Mbit/s for optimal application QoE (green curve), while the
other QoE levels (yellow and pink curves) can be satisfied with lower
bandwidth.]

Users | Bandwidth (Mbit/s): Optimal QoE | Acceptable QoE | Lower QoE
10 | 1.3 | 0.5 | 0.3
50 | 3.6 | 1.6 | 1.2
100 | 6.3 | 2.8 | 2.0
200 | 9.5 | 5.2 | 4.0
500 | 20.0 | 13.0 | 10.0
1000 | 35.0 | 25.0 | 20.0
2000 | 64.0 | 50.0 | 40.0
Figure 20-14: Data user traffic demand requirements for various QoE levels
Figure 20-14 is based upon Table 20-5 user traffic profiles, including TCP
short and long flows. Long flow traffic represents 80% of all traffic
volume. Heavy Peer-to-Peer (P2P) traffic will most likely skew these
engineering rules and QoE work is ongoing to study the impact of P2P on
provisioning and QoE.
Delay budgeting
The delay margin for voice can be derived by either using the E-model 3R
delta rule or by using a hard limit maximum one-way delay as a threshold
point that meets the desired QoE. The absolute delay threshold technique
would most likely be appropriate for UDP-based data services. It should be
pointed out that the delay budget should be done from an end-to-end
perspective; that is, that no one gets to use it all. From a pragmatic
perspective, that means that the impairment budget should be allocated
across all of the elements of a connection (see the next example).
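Allocating the impairment budget across elements is simple bookkeeping; the sketch below is illustrative Python of our own, and the element names and millisecond values are hypothetical, not a recommended allocation.

```python
def delay_budget_check(contributions_ms, budget_ms=150.0):
    """Sum per-element one-way delay contributions and report the
    remaining margin against the end-to-end budget."""
    total = sum(contributions_ms.values())
    return total, budget_ms - total

# Hypothetical allocation for a POTS-to-POTS call through a packet core
elements = {
    "pstn_access_and_switching": 10,  # EO/TO switching, each side
    "media_gateways": 40,             # encode/decode + packetization
    "propagation": 10,                # distance-dependent
    "core_queuing_and_jitter": 5,     # routers + jitter buffer wait
}
total, margin = delay_budget_check(elements)
```

If the sum exceeds the budget, one of the controllable contributions (codec, packetization interval, jitter buffer) has to give.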
Example 6. Determination of delay margin using the 3R rule for a
POTS-to-POTS call going through a packet core is shown on Figure
20-15. Delay impairment sources include PSTN switching offices
(End Office and Tandem Office) plus propagation delay, circuit-to-
packet media gateway (PVG) for voice encoding and decoding, and
IP core routers switching and queuing. The delay margin is
computed by finding the intersection point on the E-model where
the –3R line crosses the reference connection.
[Figure content: POTS-to-POTS reference connection — POTS access, End
Office (EO) and Tandem Office (TO) on each TDM side, circuit-to-packet
media gateways (PVG) at the edges of a 2000 km packet core; the E-model
chart plots R (50 to 70) against average one-way delay (0 to 300 ms),
showing propagation, IP, and MG contributions and a delay/jitter margin
of 95 ms for POTS-to-POTS performance.]
[Figure content: residential reference connection — home MG plus modem
and client, DSL access, LERs, and the regional/core network between
Provider A and Provider B. Note: home MG delays include DSL modem frame
interleaving correction delay plus voice encoding/decoding.]
Distortion budgeting
Conversational voice services distortion budgeting includes packet loss,
echo control and codec distortion. For a packet network replacing a
traditional TDM infrastructure, a 3R distortion margin is recommended to
produce unnoticeable degradation. Other margins can also be used,
depending on the business model and service quality expectations.
Although the 3R margin produces equivalent quality, the margin
allocation is very small and offers very limited options and flexibility
for the controllable parameters. The codec should solely be G.711, as all
other codecs have an equipment impairment (Ie) greater than 3R.
Transcoding should be eliminated when a call traverses multiple service
providers' networks and/or packet islands. Packet-to-packet handoff is
required to prevent additional voice decoding/encoding stages. If TDM
handoff is used between packet islands, the impairments budget is thus
fractioned (see Figure 20-18).
1:Response time targets are derived from ITU-G.1010 as well as from subjective
studies conducted at Nortel.
2:Recommended packet loss targets are based on modeling and simulation studies
conducted at Nortel.
Voice buffers should be engineered for a packet loss ratio target of
10^-6 or less. Another rule for voice buffer provisioning is to use 1/10
to 1/20 of a packet buffer per G.711 10 ms voice call (see example below).
Example 8. Calculate the buffer size requirements for a scheduler
share of 20% on an interface rate of OC-3 while maintaining a
maximum of 2 ms queueing delay.
Buffer size = max queueing delay × scheduler share × interface rate / 8
= 2 ms × 20% × 155 Mbit/s / 8 = 7750 bytes.
Alternatively, the buffer size can be estimated using the one-tenth
rule. If a G.711 20 ms voice call requires 100 kbit/s, about 300
voice calls can be supported out of a scheduler provisioned for
31 Mbits/s (20% of an OC-3).
Buffer size ~ 1/10 x 300 voice calls ~ 30 voice packet buffers.
Therefore, approximately 30 voice packet buffers will be required
to prevent packet loss on a strict priority scheduler.
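Example 8's arithmetic, and the one-tenth rule, can be expressed directly. The sketch below is illustrative Python of our own (the function names are not from this book).

```python
def queue_buffer_bytes(max_delay_s, scheduler_share, link_bps):
    """Buffer (bytes) that bounds queueing delay for a scheduler share."""
    return max_delay_s * scheduler_share * link_bps / 8

def tenth_rule_buffers(scheduler_bps, per_call_bps, fraction=0.1):
    """One-tenth rule: buffers ~ 1/10 of the supported call count."""
    return scheduler_bps / per_call_bps * fraction

# Example 8: 2 ms max delay, 20% share of an OC-3 (155 Mbit/s)
b = queue_buffer_bytes(0.002, 0.20, 155e6)   # 7750 bytes

# One-tenth rule: 31 Mbit/s share, ~100 kbit/s per G.711 20 ms call
n = tenth_rule_buffers(31e6, 100e3)          # ~30 voice packet buffers
```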
[Figure content: drop probability (1×10^-6 to 9×10^-4, M/M/1/K queueing
discipline) versus offered load (0 to 1.0) for 0.1 ms, 1 ms, and 2 ms
buffer sizes.]
Figure 20-19: Voice buffer provisioning versus offered load and drop
probability
In a well-engineered network, where load is balanced and does not exceed
the provisioned rate, buffers will be lightly utilized so the queuing delays
will be close to 0. Note that the recommended max transmit queue delay/
size is 2 ms.
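The drop probabilities of the M/M/1/K model used in Figure 20-19 have a standard closed form; a sketch, assuming K is the system capacity in packets:

```python
def mm1k_drop_probability(rho, k):
    """Blocking (drop) probability of an M/M/1/K queue.

    rho = offered load (arrival rate / service rate);
    k   = system capacity in packets (queue + server).
    """
    if rho == 1.0:
        return 1.0 / (k + 1)
    return (1.0 - rho) * rho**k / (1.0 - rho**(k + 1))

# At 50% load a 20-packet buffer drops fewer than one packet in a
# million; the same buffer at 90% load drops orders of magnitude more.
low = mm1k_drop_probability(0.5, 20)
high = mm1k_drop_probability(0.9, 20)
```

This reproduces the qualitative shape of the figure: drop probability stays negligible at moderate load and climbs steeply as offered load approaches 1.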
Unlike voice buffer sizing, data buffer sizing is not a function of
element delay, but instead a function of the number of active flows, link
speed and loss rate. For data services, the buffer size would typically
be engineered to keep TCP flow loss at 1% or less (0.1% preferred) in
order to minimize TCP timeouts and retransmissions. There is a trade-off
between queuing delay and packet loss. A small queue size (small buffer) implies
low average queuing delay; however, a small queue size does not always
lead to faster TCP application response time. TCP timeouts caused by
insufficient network buffering can actually increase response time.
Therefore, the transport protocol characteristics and their impact on QoE
need to be understood before determining the queue size.
Scheduler share
The queuing delay introduced by a scheduler is greatly influenced by the
offered load and the output link capacity. Queuing theory shows that as the
scheduler load increases, the queuing probability increases, and above a
given threshold increases in an exponential fashion and becomes infinite as
it reaches 100% link occupancy (see Figure 20-21).
[Figure content: average source jitter (0.01 ms to 1000 ms, log scale)
versus voice link utilization (0% to 100%) for link speeds from ISDN,
128k, 256k, 512k, 1M and T1 up to 10M, DS3, 100M, OC-3 and OC-12; voice
plus data traffic in two statistically multiplexed queues with strict
priority, 20 ms G.711 voice packets (200 bytes), 1500-byte data packets.]
Figure 20-21: Jitter versus loading for various link sizes; jitter =
f(buffer size, link speed and loading)
Figure 20-21 shows that in a well-engineered network where load is
controlled within a given range (typically less than 90%), queuing delay
and jitter are bounded and stable. Where loading is not controlled and link
loading approaches 100%, queuing delay increases exponentially (in
practice, this is limited by the buffer size). Note also that lower speed links
have lower maximum practical operating points; that is, points above
which jitter becomes greatly inflated compared to its value in the unloaded
network.
As the link size diminishes, the maximum operating loading point
diminishes before saturation.
Routers should not allocate more than 95% of interface rate on high-speed
interface (10 Mbits/s and above) to prevent queue buildup and control
packet loss/jitter. For lower speed links, the maximum loading threshold
should be reduced as shown in Figure 20-21.
Bandwidth provisioning
The bandwidth provisioning is part of the resource allocation process by
which a certain amount of link resources will be allocated to traffic
demands to ensure acceptable QoE is delivered to the end-user;
essentially, it maps traffic flows onto network links. Network traffic is
For native IP networks:
3. If step 3 identifies links utilized more than 95%:
• Increase bandwidth of those links and finish engineering, or
• Add bandwidth where feasible and/or economical, and repeat step 2.
For MPLS networks:
3. If routes for some LSPs cannot be found due to constraints on router
resources:
– Split LSPs,
– Add more bandwidth to links where constraints would be violated and
finish engineering, or
– Add bandwidth where feasible and/or economical, and reroute enough
LSPs to make room.
Table 20-7: Bandwidth provisioning steps for native IP and MPLS networks
Bandwidth provisioning can be done in either static or dynamic
provisioning. Static provisioning is achieved by allocating bandwidth for
the highest load over a time window and by implementing efficient QoS
mechanisms to ensure that sufficient bandwidth is available for the priority
traffic with predetermined constraints. The drawback of this approach is
that the capacity may be highly underutilized when the load is significantly
below the peak load within the time window. The other approach, dynamic
bandwidth provisioning, would solve this problem by using network
resources more efficiently, but it is far more complex and requires some
centralized coordination intelligence to maintain knowledge of
link/capacity availability. Dynamic bandwidth provisioning is currently
under development and is expected to replace traditional static provisioning
as a long-term solution.
Bandwidth provisioning should also include some extra bandwidth to cover
redundancy path restoration, call control, and future traffic growth.
Jitter buffer
The main purpose of a jitter buffer is to compensate for packet delay
variation, which would affect playout of deterministic packet sequence
such as real-time voice or video. The jitter buffer must be designed for an
expected traffic profile. That is, for dimensioning jitter buffer, the packet
interarrival delay must be known or be within predetermined bounds for
the jitter buffer depth to be adjusted within an expected range. The jitter
buffer wait time can be statically provisioned or adjusted dynamically as a
function of varying network operating conditions. To prevent packet loss
from jitter buffer overflow, the persistence of instantaneous arrival and the
average arrival rate must not exceed the available jitter buffer storage
space. Obviously, jitter buffer usage is a function of the stability of
the underlying network the traffic is traversing. To minimize jitter
buffer wait time, the network jitter should be minimized and/or eliminated
by ensuring that the average arrival rate does not exceed a certain
percentage (utilization/loading) of the outgoing link speed of each
multiplexing stage in the connection. For voice traffic with an assumed
uniform periodic profile at point of origin and a constant bandwidth
requirement, the percentage can be as high as 90%-95%, provided the
traffic is all voice. If shared with data, voice has absolute priority over data.
Under those circumstances, where voice packets originate, traverse, and
terminate over links in excess of 10 Mb/s, one can expect induced jitter
(resultant from the convolution of the behavior of the concatenated
multiplexing stages) to be a few milliseconds at most.
Consequently, a 10 ms jitter buffer should be sufficient.
Jitter buffers can and should be sized independently of packet size.
Where those circumstances do not prevail (that is, network loading is not
controlled or bounded), there is little that can be said about determining the
correct jitter buffer settings. Without any firm expectation of the
instantaneous and average traffic profile (that is, knowledge of the total
traffic admitted to the network and its load balance across the network),
then the probability of unbounded persistence and uncontrolled average
loading is increased, but also unbounded. No size of jitter buffer in any
router or receiving media gateway can be considered big enough. One
should always engineer a managed network for well-behaved normal
operation, with a sufficiency of controls and monitors, and a capacity
suited to demand. Jitter buffer wait time of a few milliseconds will have no
significant impact on voice quality, and fully absorbs the packet delay
variation of packet switching.
Measurements of operational IP networks reveal transient delay spikes
that exceed real-time voice delay targets (150-250 ms), as well as the
highly bursty nature of the packet loss distribution.
If an adaptive jitter buffer wait time is deployed, it must be able to adapt to
the wide range of jitter distributions that are typical in today’s IP networks.
Adaptation schemes may not perform well with all distributions of jitter.
Tuning of the adaptation algorithm may be necessary to match the delay
variation characteristic of a network. Tuning can be done by adjusting the
weighting used for the calculation of the moving averages and the
thresholds (sensitivity) to the occurrence of spikes in the delay variation. A
single setting for these parameters that works for all traces may not be
feasible. Note that the long-term average packet loss rate and jitter are in
many cases misleading, as they hide transient events that are only visible
on short time scales. Packet loss periods and delays span several orders of
magnitude—distributions of loss and delay bursts have heavy tails. Where
more than 40–60 ms of speech are lost, there is no longer sufficient
information to reconstruct the speech. This places a hard limitation on the
effectiveness of packet loss concealment techniques.
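The moving-average adaptation described above can be sketched as follows. This is a minimal illustration, not any particular product's algorithm: the weights (1/16) and the headroom multiplier (4) are common textbook choices for this style of estimator, and would be among the parameters tuned to a network's delay variation characteristic.

```python
class AdaptivePlayout:
    """Estimate a playout delay from observed one-way packet delays.

    Keeps exponentially weighted moving averages of the mean delay and of
    the delay variation, then sets the playout delay several deviations
    above the mean so that most packets arrive before their playout time.
    """

    def __init__(self, alpha=1 / 16, beta=1 / 16, k=4.0):
        self.alpha = alpha      # weight for the mean-delay average
        self.beta = beta        # weight for the delay-variation average
        self.k = k              # deviations of headroom above the mean
        self.mean = None
        self.var = 0.0

    def update(self, delay_ms):
        """Fold in one observed delay; return the new playout delay (ms)."""
        if self.mean is None:
            self.mean = delay_ms
        else:
            # Update the variation estimate against the old mean first.
            self.var = (1 - self.beta) * self.var + \
                       self.beta * abs(delay_ms - self.mean)
            self.mean = (1 - self.alpha) * self.mean + self.alpha * delay_ms
        return self.mean + self.k * self.var
```

With a steady stream of delays near 40 ms, the estimate hovers just above 40 ms; a single 120 ms spike inflates the variation term and pushes the playout delay up, which is exactly the sensitivity-to-spikes behavior the tuning discussion refers to.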
Section VI:
Examples
The material presented in previous chapters has been mostly theoretical or
hypothetical. In this section, we offer some concrete examples of network
architectures and configurations. Chapter 21 provides a look at a specific
large Enterprise network; namely, Nortel’s own corporate network, which
we use as a proving ground for our equipment. Chapters 22 and 23 provide
a contrast between the perspectives of the Public Carrier and Private
Enterprise view of real-time networking. Chapter 24 describes the
implementation of Real-Time Control applications in the video space.
While Enterprise networks and Carrier networks generally rely on the same
technologies and QoS protocols, their business environments are
completely different. A Carrier's network and the features it supports
constitute services to be sold. An Enterprise network is generally a
constrained resource and a business tool. These different perspectives
create different challenges and require different strategies to leverage and
implement network, communications, and application convergence.
When a Carrier implements a VoIP, multimedia, and/or converged network,
it works through a process of determining what level of service to provide,
the size of the target market, and which of the available technologies to
implement. Sophisticated simulation tools are used to determine network
requirements and to predict performance. The Carrier then implements a
carefully defined solution within a well-understood usage environment and
operating conditions, and monitors the loads the network carries to ensure
that they do not exceed what the network was designed for.
Typically, the Enterprise situation is the complete inverse. An Enterprise
usually starts with a network that was designed for data. The network is not
well-documented or well characterized. Enterprises consider voice and
real-time multimedia as simply additional data applications. And while
they assume that bandwidth solves all problems, they are also continually
looking for ways to reduce bandwidth and constrain their network growth.
These examples provide contrasting perspectives on how the principles set
out in the previous sections can be used to achieve various network
performance and user quality targets.
Disclosure Statement
The business case scenarios and examples used in the following chapters
are intended for illustrative purposes only. These case examples represent
potential results based on certain assumptions, which may not take into
account all factors potentially affecting results. If actual operating factors
differ from assumptions made, actual results may vary. Specific customer
operating factors such as deployment scenarios, actual growth rates, and
competition could cause actual results to vary compared to other
customers.
Chapter 21
VoIP and QoS in a Global Enterprise
Sandra Brown
Rob Miller
Shane Fernandes
Gwyneth Edwards
The following example reflects facts and figures obtained during a case
study performed by Nortel in the Spring of 2004. All information —
financial, people, technical—corresponds to Nortel’s environment at the
time of the case study. As applicable, facts will be denoted by “Case Study
Figures”.
This chapter describes a real-life implementation of Quality of Service
(QoS), in two parts:
“Voice over IP: Raising the need for Quality of Service”
“The Quality of Service (QoS) design”
The applications
Nortel runs one of the largest real-time Enterprise networks in the world,
equivalent in breadth and scope to a Tier 2 Service Provider. In a typical
month, more than 1,500 terabytes of routed data traffic runs across the
network, headed for one of the 2,700 computer servers. By comparison, the
books in the U.S. Library of Congress, the world’s largest library, contain
about 20 terabytes of text.
The IP network carries data from a variety of sources, grouped in the table
below by traffic category:
The network
The following network description is as described in the Nortel 2004 Case
Study. Nortel’s Enterprise network is based on a backbone architecture split
into four Border Gateway Protocol (BGP) regions—Europe, Americas,
Asia and India—each of which is assigned an Autonomous System (AS)
for Internet connectivity and transport. Routing is done hierarchically
through the core, distribution and access layers. Interregional routing is
based on OSPF routing principles and all regional traffic traverses the core.
Between major campuses, the Wide Area Network (WAN) runs over
Optical SONET technology, much of which uses Optical Ethernet. Some
small offices are also connected to the WAN through SONET although
many are connected through Asynchronous Transfer Mode (ATM), Frame
Relay (FR) and Virtual Private Network (VPN) links.
Over the past few years, Nortel collapsed hundreds of private virtual
lines, frame relay and ATM circuits, and public and private voice onto
the converged core, and moved much of the public voice onto the private
network. Upgrades to VoIP were done on the line side and are now
evolving to H.323 and SIP trunking.
Figure 21-1 illustrates an overview of the company’s real-time network
architecture.
Mobility at Nortel
Mobility in the work place has become a standard requirement; however,
Nortel has led the industry in the mobility of its employees. For more than
a decade, employees have been accessing the network remotely through
secure VPN solutions to perform their work. Therefore, leveraging current
investments and the installed base was a key driver in the deployment of
VoIP; the IS team wanted to enable the mobile worker to be more
productive and more connected than ever before.
1. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
Jitter in the IP layer can impair the voice channel. As noted in Chapter 3,
jitter is the variation in packet arrival times: moment-to-moment changes in
network traffic and loading affect the transit times of individual packets.
VoIP and other real-time applications cannot be queued without increasing
the end-to-end delay, which degrades the application performance.
Network engineering and network management need to keep jitter low to
maintain quality for delay-sensitive applications.
So jitter and latency values are very important, and in this example, they
were the key drivers for implementing QoS.
Note: Quality of Service (QoS), as defined in Chapter 2, refers to a
set of technologies—traffic management and QoS mechanisms—
that enable the network administrator to achieve the desired traffic
performance targets. We assume that, in this example, Quality of
Experience (QoE) for VoIP calls is equivalent to that of PSTN
service.
Traffic shaping
At the small office, two frame relay Data Link Connection Identifiers
(DLCI) are used to separate Voice over IP from all other data application
traffic across the Wide Area Network (WAN), with higher priority given to
the VoIP traffic. To direct the traffic to the appropriate DLCI, Forward Next
Hop (FNH) filters based on the DSCP/TOS bits within the IP header are
implemented on the Ethernet ports as the IP traffic ingresses the router
ports. BayRS Protocol Priority Queuing (PPQ) is implemented at the small
site’s router.
The VoIP DLCI is a “shaped” DLCI and the Data DLCI is “unshaped.” By
default, shaped DLCI traffic is prioritized over unshaped DLCI traffic. All
VoIP signaling and media path messages are assigned an Expedited
Forwarding (EF) DiffServ Code Point (DSCP) of 46.
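As a concrete illustration of the EF marking above: DSCP 46 occupies the upper six bits of the IP TOS/DS byte, so the byte value an application actually sets is 46 shifted left two bits, or 0xB8. The sketch below uses the standard POSIX `IP_TOS` socket option from Python; it is illustrative host-side marking only (the FNH filters described above are configured on the router, not on the host).

```python
import socket

EF_DSCP = 46              # Expedited Forwarding code point
TOS_BYTE = EF_DSCP << 2   # DSCP occupies bits 7..2 of the DS byte -> 0xB8

def make_voice_socket():
    """Create a UDP socket whose outgoing packets carry the EF marking."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_BYTE)
    return s
```

Any router filter keyed on the DSCP/TOS bits, such as the FNH filters described above, can then steer these packets to the voice DLCI.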
To prevent traffic congestion on one DLCI from causing packet drops on an
uncongested DLCI, clipping is enabled on the ATM interface card that sits
on the BN router. “Clipping enabled” implies that the BN ATM card will
drop all packets in excess of the frame relay Sustainable Cell Rates (SCR)
and Peak Cell Rate (PCR) values. The data circuit will typically drop
packets first because data is bursty in nature, thus ensuring that the VoIP
traffic is not dropped by the BN router.
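The "clipping" behavior described above is, in effect, rate policing: traffic in excess of the contracted rate is dropped rather than buffered. A simplified single-rate token-bucket policer conveys the idea (illustrative only; the BN ATM card enforces the frame relay SCR/PCR values in hardware, and real policers distinguish sustainable from peak rates):

```python
class TokenBucketPolicer:
    """Drop packets exceeding a committed rate (bytes/s) and burst size."""

    def __init__(self, rate_bps, burst_bytes):
        self.rate = rate_bps        # token refill rate, bytes per second
        self.burst = burst_bytes    # bucket depth, bytes
        self.tokens = burst_bytes   # start with a full bucket
        self.last = 0.0

    def admit(self, size, now):
        """Return True if a packet of `size` bytes conforms at time `now`."""
        # Refill tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= size:
            self.tokens -= size     # conforming packet: spend tokens
            return True
        return False                # excess packet: "clipped" (dropped)
```

Because bursty data traffic exhausts its tokens first, the data circuit takes the drops, which is why the voice DLCI is protected in the arrangement described above.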
The existing Nortel core router architecture supports an eight queue QOS
design, three of which are currently used, as follows:
Voice is prioritized into the highest queue
Video and multicast into the next queue
All other traffic into a lower queue.
The optical network is fronted by a Passport 7480, which is used to convert
the traffic into ATM cells. Please see Figure 21-4 for an illustration of
the core QoS strategy.
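The three-queue mapping above can be sketched as a simple DSCP classifier. The queue numbers and the use of AF41 for video here are illustrative assumptions; the actual routers map Nortel Networks Service Classes to hardware queues through their own configuration.

```python
EF = 46      # Expedited Forwarding: voice
AF41 = 34    # an assured-forwarding class commonly used for video

def select_queue(dscp):
    """Map a packet's DSCP to one of three queues (0 = highest priority)."""
    if dscp == EF:
        return 0     # voice: highest-priority queue
    if dscp == AF41:
        return 1     # video and multicast: next queue
    return 2         # all other traffic: lower queue
```

The remaining five of the eight hardware queues stay unused, leaving headroom for future traffic categories.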
Lessons learned
For companies wanting to move to a real-time network, the following
lessons learned can assist in the implementation of Quality of Service
(QoS):
The QoS strategy should be driven by the needs of the applications
that run over the corporate network.
The applications must be categorized by traffic category to
determine the QoS requirements and priorities (let Quality of
Experience drive the categorization).
Consider not only current interactive applications but also future
applications that will run across the real-time network.
Even if QoS is implemented on a site-by-site basis, a complete QoS
strategy should be defined prior to deployment to ensure that the
network becomes real-time end-to-end.
To minimize costs, leverage current infrastructure and managed
services, especially in the small sites.
To simplify QoS implementation, take advantage of the DiffServ
Code Points (DSCP) mappings to Nortel Networks Service Classes
(NNSC) and the service categories of the various transport
technologies such as IP, ATM and frame relay.
Chapter 22
Real-Time Carrier Examples
Edited by Kathy Joyner
Centrex IP
An April 2003 InfoTech* report entitled “Enterprise Convergence: The
Race for IP Telephony Supremacy” shows that by the end of 2004, over
seventy percent of U.S. Enterprises will have implemented IP Telephony in
at least one site. This escalation in demand for IP Telephony is driven by
the desire to reduce costs while enhancing employee mobility and
increasing worker productivity.
It is not surprising then that Anycarrier.com has experienced a five percent
decline in its Centrex customer base over the last eighteen months. In fact,
research by IDC (2003) shows that service providers as a whole are seeing
their Centrex installed base eroding by three to twelve percent per year
(depending on segment and service provider).
Understanding that the erosion of its Centrex customer base is only going
to accelerate, Anycarrier.com initiated a study to find a solution that would
allow it to retain its current Centrex customers and also add high-demand
VoIP services as part of a comprehensive product offering. Based on the
results of that study, Anycarrier.com determined that Centrex IP is the only
solution that provides it with an evolutionary approach to VoIP, allowing it
to retain its existing Centrex revenues while providing a platform for new,
IP-based business services.
Technical challenge
With Centrex IP, Anycarrier.com can offer the reliability and rich feature
set of hosted Centrex in conjunction with the next-generation services of
VoIP, allowing it to retain and grow its Centrex base. Because Centrex IP
builds on the industry-leading business voice benefits of Centrex,
businesses can take advantage of the benefits of IP Telephony in a flexible,
cost-effective and low risk way. Key market segments are as follows:
Existing Centrex Base. Companies who are already seeing the
benefits of the full feature set and reliability of Anycarrier.com’s
Centrex service are the prime target for the move to Centrex IP.
Medium/Large Enterprises. Medium to large companies across
many industries also provide a great opportunity to introduce
Centrex IP. In many cases, these companies have already
considered implementing VoIP in some fashion and may have a
budget set aside for that step. In addition, these companies are also
under competitive pressure to increase employee productivity by
providing better communications, while simultaneously lowering
operating and IT costs.
Within the Medium/Large Enterprise segment, those companies with the
following characteristics are the primary targets for Centrex IP services:
New Branches. Companies that are opening a new branch or site
for their company. The branch will need to be set up with a cost-
effective extension of services to connect and communicate with
the main corporate site.
Major Renovations. Companies that are overhauling/rebuilding
their office space or telecommunications systems. This renovation
may provide the opportunity to upgrade their telephone and LAN
infrastructure.
Small Businesses. Small businesses across many different
industries are another potential opportunity for Centrex IP services.
These companies typically make changes more quickly and easily.
Solution
Network diagram
From a high level perspective, Centrex services can continue to be offered
from either existing switch platforms or from newer call server platforms.
The diagrams below illustrate how the migration to an IP-based Centrex
service can be accommodated.
Architecture overview
With Nortel’s Centrex IP solution, Anycarrier.com can offer full-featured
Centrex services over an IP infrastructure using two primary components:
Centrex IP Client Manager with a DMS*-100/5000 or a
Communication Server 2000
IP phones
Key elements
Centrex IP Client Manager. The Centrex IP Client Manager
(CICM) is a high-availability, NEBS-compliant platform that is
hosted from a DMS-100/500. The CICM is responsible for hosting
Local
As wireline revenues continue to decrease, local service providers are
looking for ways to reduce costs and converge networks so that voice, data,
and wireless can leverage the same network. They also want to lay the
foundation for new services that provide additional revenue opportunities.
For these reasons, major local exchange carriers are taking on the challenge
of converting their Class 5 circuit switches to packet switches.
Drivers for considering migrating to a packet network vary by service
provider. However, common requirements are as follows:
Meeting market demands for data services
Delivering solutions cost-effectively
Providing single, integrated, carrier-grade packet network for voice,
high-speed data, and special services with efficient network
management capabilities
Reducing capital and operating costs
Finding sources of new revenue
In the end, many service providers determine that it is more cost-effective
to migrate to new packet technology than to grow and maintain their
existing circuit switches.
Technical challenge
Service providers are closely reviewing their technology choices in order to
decide whether to continue to grow and maintain the circuit-switched
network or migrate to a packet network.
Most service providers have a wide range of circuit switching and back-
office equipment, requiring experienced craft personnel to maintain and
manage circuit switches from different vendors. From a network support
perspective, there are many different products to understand and manage
on a daily basis. In addition to the various and dated circuit switching
equipment in networks today, some switches do not support Local Number
Portability (LNP), a regulatory requirement that must be provided to all
subscribers. In addition, capital expenditure decisions are looming for
many service providers.
From a business perspective, packet networks can deliver operational cost
savings. For example, migration can reduce the number and different types
of back-office systems, simplifying network management. In addition, the
number of nodes in the network is decreased; in one real-world case, by
almost 75 percent. Several Class 5 switches can be collapsed into one
centrally located communication server that serves a much wider
geographic area. Craft personnel no longer have to know and manage
multiple types of back-office systems. In addition, they no longer have to
manage as many elements because separate layers for Tandems and
Remotes, as well as multiple networks, are eliminated.
Solution
Network diagram
From a high level perspective, convergence offers an opportunity to reduce
complexity and to reduce costs over traditional TDM Class 4 and Class 5
networks. As the diagram below illustrates, the transition from a TDM-
based network to a packet-based (IP or ATM) call server network
significantly reduces the number of trunks that have to be maintained, as
well as reducing the overall load on the network.
[Figure: TDM network, with End Offices (EO) and Tandems homing to multiple Interexchange Carriers (IXC), collapsing onto a single packet network.]
Solution Details
Nortel provides a Carrier Voice over IP Local Solution. Whether a service
provider is interested in ATM or IP, Nortel can provide the solution. With
decades of Class 5 experience, Nortel is uniquely qualified to provide
service providers a packet solution that can evolve their circuit networks.
This solution delivers full feature transparency, providing a complete set of
Class 4 and Class 5 capabilities with over 3,000 features in every software load.
The major components of the Nortel VoIP Local Solution include:
Communication Server 2000 superclass softswitches providing
comprehensive services, carrier-grade attributes, and regulatory
features
Media Gateway 9000, a line gateway supporting both broad- and
narrowband services
Media Gateway 4000, a trunking gateway used in ATM networks
(North America only)
Packet Voice Gateway 7000 or 15000, a trunking gateway used in
ATM or IP networks
Service providers may also use Nortel Multiservice Switches to provide
high-capacity, carrier-grade switching. While this is not a requirement
Long distance
Long distance providers and new carriers alike are faced with a paradox:
the total minutes of use for long distance services is expanding, but the
revenues per minute are decreasing. However, there is still tremendous
potential in this market, creating opportunities for some and challenges for
others. Ascendant carriers are staying ahead of the competition by reducing
operating costs, expanding capacity, and delivering reliable services.
Nortel Carrier VoIP Long Distance Solution is an ideal step for long
distance service providers. This low-risk solution helps lower transport/
transit and capital costs with the efficiencies of multivendor packet
telephony. Packet trunking is the economical engine that can help pay for
network transformation today as the service provider explores new revenue
opportunities enabled by new voice and multimedia services and easier
access into other markets.
Technical challenge
This solution delivers full-featured, carrier-grade telephony, data, and
multimedia services over multiservice packet networks. It uses open
standards packet technology for the packet backbone. Carriers can choose
either AAL2 protocol or IP transport to provide a full-featured packet
transit application.
Packet networks offer cost efficiency, open standards, and fast time-to-
market for new packet services, without compromising the values of
traditional telephony, including service richness, voice quality, reliability,
scalability, and manageability.
This solution is based on Nortel’s Packet Trunking application and allows
service providers to deploy their own differentiating telephony, data, and
multimedia services.
This solution also lays the foundation for delivery of local and transit
services for business and residential customers with the future addition of
line-side multiservice gateways. The service provider can also add cable or
wireless gateways to explore other market opportunities, or take advantage
of Enterprise network connectivity and SIP capabilities to deliver new
services.
Solution
Network diagram
From a high level perspective, the transition to an IP or ATM-based packet
network can result in large savings in long distance costs. The diagram
below shows a comparison of long distance trunking requirements for a
TDM-based trunk network and a packet-based network. The larger the
number of flows over a link or Trunk Group, the less difference between
statistical worst case and average. Because, in the IP case, the link is sized
for all of the traffic it carries, rather than being segmented into smaller
point-to-point flows as in the TDM case (that is, individual Trunk Groups,
each sized according to its own statistical worst case), the packet network
can be more efficient.
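The pooling gain described above can be quantified with the Erlang-B formula. The sketch below is standard teletraffic arithmetic, not a Nortel tool: it sizes one aggregated packet link against ten separately engineered TDM trunk groups carrying the same total load, both at a 1% blocking target.

```python
def erlang_b(traffic_erlangs, trunks):
    """Blocking probability for offered traffic on a group of trunks.

    Uses the numerically stable recursion
    B(A, m) = A*B(A, m-1) / (m + A*B(A, m-1)), with B(A, 0) = 1.
    """
    b = 1.0
    for m in range(1, trunks + 1):
        b = traffic_erlangs * b / (m + traffic_erlangs * b)
    return b

def trunks_needed(traffic_erlangs, target_blocking=0.01):
    """Smallest trunk count meeting the blocking target."""
    n = 1
    while erlang_b(traffic_erlangs, n) > target_blocking:
        n += 1
    return n

# 100 erlangs pooled on one link vs. ten 10-erlang trunk groups:
pooled = trunks_needed(100.0)          # one statistically shared link
segmented = 10 * trunks_needed(10.0)   # ten individually engineered groups
```

The pooled link needs substantially fewer circuits than the ten separate groups combined, which is precisely the efficiency the text attributes to sizing the whole link at once rather than each point-to-point flow for its own worst case.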
Solution Details
Instead of managing multiple overlay networks, the service provider can
deliver all types of services over a single infrastructure. This design allows
more choices in service deployment and vendor selection to help decrease
long-term capital costs. H.248-compliant multiservice gateways connect
existing trunks to the backbone, with no need to modify existing facilities
or their originating multivendor offices. The efficiencies of a packetized
backbone can reduce ongoing operating costs by twenty to forty percent as
proven by a Nortel business case¹. In contrast to today's individually
engineered fixed-bandwidth trunks, a packet network efficiently routes all
types of traffic by allocating and sharing network resources on demand.
The packet network also helps reduce cross-connects, multiplexers, IMT
facilities, and associated peripherals, reducing capital expenses by twenty
to thirty percent, again as shown in a Nortel business case¹.
1. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
This solution also offers reliability, security, and quality of service. The
service provider can transition a node-centric, hierarchical topology to a
simplified architecture where a converged network performs like a single,
unified switch. The streamlined network design offers greater service
capacity, variety, and speed-to-market, all with fewer nodes. And because
fault-tolerant elements are distributed across the network, single points of
failure are removed and superior survivability is realized, with no sacrifice
in voice quality, latency, or capacity.
The Carrier VoIP Long Distance solution offers a standards-based
switching and routing infrastructure that transports today's revenue
generating services while supporting competitive, next generation services
– all over a high-capacity ATM or IP backbone. The service provider can
deliver leading-edge long distance applications while reducing transport
costs and deferring or eliminating future capital expenses.
The following listing summarizes key network elements in this solution.
All are built to meet or exceed carrier-grade standards and protocols set by
ITU, Telcordia, ANSI, ETSI, IETF, ATMF, and other standards bodies.
Gateways. With an ATM AAL2 backbone or IP—the robust Packet
Voice Gateway 15000 connects standard TDM trunks to the service
provider's packet backbone. This trunk gateway appears as a
tandem/transit office termination to any vendor's circuit-switching
office.
Superclass Softswitch. The first vendor to deliver a superclass
softswitch, Nortel offers a choice of two platforms: the
Communication Server 2000 and Communication Server 2200
(previously the Communication Server 2000 Compact). A
superclass softswitch offers the critical attributes associated with
successful softswitch deployment, such as consolidated local, long
distance, wireless, and cable applications; comprehensive service
(3000+ features); regulatory capabilities; and carrier-grade
attributes. Both of these platforms are designed to control
multivendor gateways and provide the call control and other
network intelligence required to deliver revenue-enhancing
services.
Multiservice Switches (MSS). As a high-capacity ATM switch, the
Nortel Multiservice Switch 15000 supports ATM, IP, frame relay,
circuit emulation, and voice services. This system scales from 40
Gbps of redundant bearer capacity to terabits. High-capacity, fault-
tolerant Layer 2 switching/routing is provided by the Ethernet
Switch 8600, which aggregates local IP traffic providing an
interface to a high-speed optical backbone for IP voice and
signaling traffic.
Multimedia
In the past, communications were based on a single medium: voice. In the
21st century, communications require the integration of multiple media. To
ensure effective communication for both Enterprises and end users, next-
generation SIP services allow consumers to integrate voice, video, and data
into a conversation as simply and easily as they pick up the telephone and
make a simple voice call today.
Technical challenge
For the purpose of this example, we have chosen to focus on a service
provider who is solely deploying multimedia services for the residential
and small office/home office markets. Similar multimedia service offerings
are available for medium to large Enterprises through a carrier-hosted
model.
To take advantage of this market opportunity, 123com has initiated a
program to design and deploy unique consumer multimedia service
offerings that create opportunities for new revenue streams, increase
Solution
Network diagram
The network diagram below illustrates how five “Go to Market” services
can be offered. These services are as follows:
Voice and multimedia over a broadband connection to the Home
Office
Voice and multimedia from a soft client over a broadband
connection to the residence
Voice and bundled long distance over a broadband connected
Integrated Access Device (IAD) in the residence
Personal Agent web portal services to enhance existing 123com
residential voice subscribers
Voice and multimedia Remote Access over a broadband connection
to the Internet
Each of the service offers makes use of 123com’s or other carrier’s high-
speed data services, the public Internet, 123com’s IP Backbone, and
123com’s PSTN network.
Solution Details
For the residential consumer market, 123com considered four service
offerings, described below.
123com Multimedia Communications Center. Intended for the
installed base of broadband customers and telephone customers,
this service offers advanced multimedia features such as video
calling, picture ID, file transfers, and Web pushes using the
customer's personal computer. Unlimited on-net calls are provided
as part of the service, along with an optional outbound long
distance calling plan. Inbound calls from the traditional telephone
network will be allowed in the markets where 123com has primary
line service or selected states where this type of service is allowed
from a regulatory perspective.
123com Broadband Telephone Service. Intended for the installed
base of broadband customers who don’t use or have a personal
Cable
While data and video on the Internet aren't new, having voice included in
the mix is. Using Internet Protocol (IP) telephony or Voice over IP (VoIP)
technologies, phone conversations are converted into packages of data to be
sent over the Internet in ways similar to e-mail and web sites. Phone calls
anywhere in the world can be significantly less expensive with IP
telephony. And because voice, data and video are all using one network,
packaged as similar data packets, services can be bundled together from
one service provider with the flexibility for the user to receive the
information on any communications device, regardless of location—office,
home, or on the road.
According to market research on communications preferences completed
by Pollara Inc. for Nortel, consumers are growing impatient with various
communications devices that don't work together. The research found that
today's consumers expect instant communications but instead are plagued
by having to navigate cumbersome menus on each device they own when
they want to reach someone.
Traditional cable providers are aggressively moving into service areas
formerly dominated by traditional voice providers. The incumbent service
provider is faced with a two-fold challenge: find a way to generate new
revenues, and curb subscriber flight to satellite service.
The current opportunity for cable service providers implementing VoIP
technologies is to simplify communications while, at the same time,
creating a user friendly communications environment that seamlessly
adapts to the lifestyle and needs of each individual. The technology that is
being used with VoIP to give businesses and consumers a wider range of
services and the ability to fine tune the management of their
communications is called Session Initiation Protocol (SIP).
Technical challenge
When making the decision to enter the VoIP business, the service provider
should be aware that VoIP creates more than one business opportunity. A
good part of the reason lies in the underlying technology. The concept is to
convert an analog voice signal into a series of ones and zeros that can be
reconstructed into the original analog format without perceptible loss of
quality. Once any information is converted into this digital form, all the
services developed for data switching, routing, and storage become
available as tools to tailor voice product for the service provider's market.
One of these tools is IP, which is the underlying technology used to move
information on most data networks, including the public Internet.
VoIP opportunities for cable include primary line telephone service, long
distance, SIP-based broadband voice, and business services. This section
will concentrate on primary line service.
Primary line service is a one-for-one replacement of the incumbent
telephone company's service, because it is carrier-grade telephony. This
means it must be highly reliable, scalable, feature-rich, maintainable
without service outage, and include the ability to track and measure key
performance metrics. Most importantly, carrier-grade means a quality of
service (QoS) for end-to-end transmission that keeps voice quality within
the levels expected by consumers for commercial telephone service.
Solution
Network diagram
From a high level perspective, the same packet network can be used to
deliver multiple media to wireline customers including voice, video and
data. The diagram below illustrates the connection of a call server to a
cable-based customer for providing voice, data, and video services.
[Figure: Cable VoIP access — an Embedded MTA in the home connects over the HFC plant to a CMTS at the headend; an IP router carries traffic into the packet network, a Media Gateway links to the PSTN, and a Call Management Server controls the calls.]
Solution Details
There are several options for using VoIP as the underlying technology for
primary line technology. The diagram at the end of this section shows the
architecture detailed by the CableLabs PacketCable™ specifications for
end-to-end VoIP, which is being deployed today. In this scenario, a standard
subscriber telephone connects through existing phone wiring to a new
Embedded Multimedia Terminal Adapter (E-MTA) that may be located on
the side of a home or within the home, depending on the packaging. The E-
MTA does the analog-to-digital conversion and packetizing functions, and
its embedded cable modem communicates with a Cable Modem
Termination System (CMTS) at the headend.
The network side of the CMTS typically includes the ability to route
signaling and voice packets. Signaling packets are exchanged with
softswitches in the service provider's network to set up and supervise the
call. Voice packets are routed to the called party through a packet network
to a remote CMTS or the PSTN via a media gateway. In addition to
handling call setup and supervision, the Call Management Server (CMS),
which at Nortel we consider a communication server, is the source of
revenue-generating subscriber features.
Key takeaways
Nortel can help the service provider make the transition. Nortel has a rich
history in telephony and a proven record in VoIP for cable, as well as an
installed base of more than 140 DMS switches carrying 4.9 million cable
telephony lines globally.
On the VoIP front, Nortel has been active in PacketCable™
interoperability work at CableLabs. Its softswitch is PacketCable™
qualified, and it has had a visiting engineer on-site at CableLabs for years.
Several cable operators have deployed Nortel VoIP solutions, giving it real-
world experience.
In addition, Nortel recognizes that VoIP networks come in all sizes and
should support all service types, and offers two versions of its carrier-class
Communication Server (CS). The CS 2200 (formerly known as the
Communication Server 2000 Compact) softswitch occupies a smaller
footprint, consumes less power, and is built on a commercially available
Compact PCI platform with open base software architecture. The CS 2000
is built on the DMS XA-Core multiprocessor platform. Both of these
platforms provide cable operators with the same powerful ability to deliver
local, long distance, and tandem VoIP services on a single platform. Both
platforms deliver the same applications, protocols, and functionality by
fulfilling the softswitch promise: software functionality independent of
hardware platform.
Broadband
The broadband market in North America continues to be a dynamic sector
as the competitive landscape and consumer demand for new
communication services continue to evolve. Driven by the need to find new
sources of revenue, service providers are looking for ways to unleash the
potential of broadband networks.
Nortel understands that wireline service providers need to deliver value-
rich service bundles—services such as VoIP, Multimedia Communication
Services (integrated voice, video, and data), broadcast and IP-video
(television), and data services. Our next-generation broadband solutions
are ultra-broadband ready, meaning they have the high bandwidth and
Quality of Experience attributes needed to deliver the new “triple play”
service set (voice, data, and video).
Wireline carriers are losing customer ownership as their strategic position
slips. Cable competitors are targeting their customers with value-priced and
value-added alternatives to basic phone services. If the wireline carriers are
to survive and thrive, real service differentiation is needed.
Technical challenge
According to The Yankee Group*, significant capital spending will occur
in the broadband access market in the next four to five years—
approximately US$5 billion annually. This longer term spending trend is
being driven by a need for service providers to replace much of the existing
broadband equipment with a newer generation of infrastructure that is
capable of supporting a “triple play” business model.
Solution
Network diagram
The broadband market has a wide range of technologies, all of which can
leverage a packet-based network to provide voice or other services. The
following diagram illustrates some of these technologies: voice service
through a traditional copper loop from a central office, voice service
through a Digital Loop Carrier (DLC), voice and other services through
Digital Subscriber Loop (for example, ADSL), voice and other services
through Fiber to the Curb (FTTC), and voice and other services through
Fiber to the Home (FTTH).
Solution Details
Nortel has significantly expanded its Broadband Networks portfolio to
enable traditional wireline service providers to deliver a new set of value-
rich, revenue-generating services to consumers and small-to-medium
business customers over a high-bandwidth, ultra-broadband infrastructure.
Nortel Broadband Access Solutions couple best-in-class access products
from strategic alliances with a world-class portfolio of voice, data, and
transport products.
Strategic alliances provide a complete range of access products including
the following:
DSLAM, PON, and Mini-RAM products from ECI* Telecom
Multiservice Broadband Loop Carrier products for the North
American market from Calix*
Multiservice access products for the European or ETSI market from
KEYMILE*.
This powerful combination of new and existing products enables service
providers to deliver high-value, revenue generating services through a
reliable and scalable broadband infrastructure.
Nortel Broadband Fiber Solutions are based on leading Optical Access
technologies such as Fiber-to-the-Premise, Curb, or Business (FTTx)
utilizing PON technology. These powerful access products offer the
convergence of voice, video and data services over a single fiber
infrastructure, thereby delivering ubiquitous and seamless solutions and
eliminating the network bottleneck. Features include a future-proof, full
service set offering and reduced Capex for a full service set network.
Conclusion
Network convergence
The varied networks that exist today have evolved in parallel, and each
offers important attributes of its own. TDM networks are reliable, secure,
easy to use, and optimized for voice traffic. Data networks are efficient,
scalable, and optimized for packet traffic. Wireless networks are
ubiquitous, convenient, and optimized for mobility. The solution is to
transform these networks to maximize profit and market share without
sacrificing the valuable attributes of each. These dual goals can be achieved
by migration: transforming traditional networks into packet-based
networks that can offer all of the features and attributes of traditional
service in a simplified, cost-effective, service-enabling manner.
Nortel offers a broad and deep portfolio of services that fully leverage the
packet-based networks that service providers need to build. Service
categories are as follows:
Data networking services such as Virtual Private Networks
Mobility services such as voice over Wireless LAN
Integrated voice, video and data
Personalization of services to include content delivery and security
Next-generation residential and business services
Optical broadband services such as Storage Area Networks
(SANs), and optical Ethernet
Carrier-grade reliability
Nortel has a strong track record for service delivery that spans decades of
innovation while providing customer support with the implementation and
maintenance of these service-bearing networks. Carrier-grade service is
dependable, secure, and evolvable. We understand the importance of these
attributes, and support our customers in maintaining them.
Service intelligence
Nortel understands the capabilities and constraints in the transformed
network and how they need to be used, and can add new capabilities into
the network quickly, to give service providers the edge in offering new
services. This capability is called Service Intelligence, and it guarantees
network performance based on the end user's requests, network resources,
and service application needs. Service Intelligence allows IP networks to
move beyond “best effort” and dynamically adapt to customer
requirements.
Given convergence and higher performance levels, the next challenge to
value-rich service is resource allocation. The transformed network must
make intelligent use of resources through policy, authentication, billing and
QoS. When this is in place, the service provider can deliver with
confidence the services that users demand.
References
IDC, U.S. Hosted IP Voice: Market Analysis and Forecast 2002-2007.
Author: Thomas S. Valovic, IDC Study No. 28803, released January, 2003.
InfoTech, Enterprise Convergence: The Race for IP Telephony Supremacy,
report released April, 2003.
Chapter 23
Private Network Examples
Stéphane Duval
Tim Mendonca
Introduction
The purpose of this chapter is to demonstrate the ability to satisfy customer
requirements for real-time, converged networks with the technologies
discussed in this book using Nortel products as the example.
The focus is voice over a data infrastructure. It is not the intent of this
section to describe and explain routed and routing protocol standards.
Also, the example focuses on the headquarters, which incorporates all
aspects of deployment. By adjusting the scale of the deployment, the
solution can be adapted to organizations of all sizes.
This example is by no means the only approach that can achieve the needed
QoE results. Used as a model, it can be adapted to develop a custom
solution based on unique organizational needs. A large variety of
interchangeable products from Nortel create limitless deployment options
for converged infrastructures.
Starting with the definitions of the four types of convergence, Quality of
Experience (QoE), and Quality of Service (QoS), and with the
identification, definition, categorization, and characterization of a set of
Solution Design Attributes (SDA) addressing different aspects of a
solution's architecture, a common convergence vocabulary will be
established. The goal is to systematically gain a clear understanding of the
issues that arise in Enterprise data networks when deploying real-time
applications, and to learn analysis and design techniques that ensure a high
level of customer satisfaction and network performance.
Getting Started
Business success in moving to a converged network will rely on knowing
the underlying criteria affecting the current state of your infrastructure.
This knowledge assists in the development of processes to evolve your infrastructure.
Business continuity
During the transition to a converged network, you will need to consider
what measures you can take to ensure the continuity of external services for
customers and the availability of communication applications needed by
employees, suppliers, and partners. Plan disaster recovery and redundancy
for mission-critical operations that are essential for conducting business.
Organizational dynamics
Moving to a single network that carries voice, data, and video traffic will
necessitate a new IT paradigm for managing the unified infrastructure.
Consolidated management policies will be needed, as will a redefinition of
roles and responsibilities of network management personnel previously
aligned with either the voice or data side of the Enterprise. During the
transition period, you need to consider that there will be real costs
associated with personnel realignment and retraining activities. Assess the
impact that moving to converged applications will have on how employees
carry out their day-to-day tasks.
Figure 23-1: Example network — a corporate office, regional offices, and district offices connected by virtual and physical circuits through a service provider, with dual Internet connections (ISP1 and ISP2). The solution design attributes called out are scalability and efficiency, proven reliability, Internet performance and management, and solution redundancy and resiliency.
For the purpose of this design example1, we chose ATM and frame relay
circuits supplied by a local service provider. ATM is chosen for the core
network because of its proven capability to deliver QoS in a packet
environment. Frame relay has been chosen for the branch/district offices
based on its ability to deliver high bandwidth at low cost. While frame
relay can prove to be challenging when it comes to real-time, converged
networks, we have kept it in this example due to its widespread adoption
and low cost.
1. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
[Figure: headquarters WAN/Internet edge with redundant Contivity, Alteon Switched Firewall, and Passport devices connecting to the WAN service provider, the Internet, and a DMZ; annotated with the solution design attributes scalability and efficiency, proven reliability, effectiveness, ability and security, performance and management, and solution resiliency and redundancy]
2. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
[Figure: headquarters campus with redundant Passport 8600 switches, Alteon Switched Firewalls, Contivity devices, and Application Switch WSMs serving dual-homed servers (FTP and web servers, LDAP servers, Symposium Call Center, OTM, CallPilot, and MCS 5100), IP telephones, software telephones, and desktop PCs, managed by Optivity NMS, with connections to the WAN and Internet service providers, a DMZ, and the PSTN]
session and voice calls will not be terminated and user performance impact
will be kept to a minimum.
appropriate access control rules, and help identify and discover violations.
Risk and vulnerability assessment must be performed at all levels of the
network.
Without appropriate security features, VoIP networks are much more
vulnerable to eavesdropping, theft and denial of service than traditional
telephony networks. Logical security, where the system is contained to an
isolated intranet protected by firewalls, is not sufficient in today’s
environment. A system can be subjected to internal attacks from a
malicious user or from a pervasive worm that is transferred to a hard drive.
IP telephony systems are now connected to the corporate intranet or the
Internet. As such, security needs to be enhanced to address the new world
threats.
Security in an IP environment is based on the following components:
Physical and logical security of the infrastructure (such as end
points, switches, and routers)
Network Element (NE) security of all the system components
Equipment security of the servers and other hardware
Software security of the applications
Client security regarding access control and privileges to the
systems
Security of soft clients on multiuse personal computers
Demilitarized Zone (DMZ). The term DMZ was first used in complex
multiple-machine firewall setups, where a computer is placed outside the
firewall but is still available for use by the internal (protected) network.
The advantage of a DMZ computer is that it can send to and receive from
the entire Internet. The disadvantage is that it may be vulnerable to attack
from unknown parties.
Secure Voice Zones (SVZ). Securing telephony is an important step in a
comprehensive security strategy. A secure telephony solution framework,
as part of a unified security architecture, leverages both the resilience of
traditional switched telephony networks and a sustainable migration path
to a converged IP network. All levels of security call for a secure voice or
IP telephony zone, since all IP telephony servers are vulnerable to attack,
malicious or otherwise, from within the Enterprise as well as from outside.
A stateful firewall with SIP and H.323 protocol support is needed to
provide an SVZ with four levels of security (minimum, basic, enhanced,
and advanced) based on a unified security architecture, ensuring that these
critical call servers and application servers are highly available.
Security. SVZs for IP telephony devices must ensure accessibility without
compromising the confidentiality and integrity of other Enterprise network
resources.
[Figure: the headquarters design extended with SSL acceleration, Alteon Content Director, and Content Cache alongside the redundant Passport 8600, Alteon Switched Firewall, Contivity, and Application Switch WSM infrastructure connecting dual-homed servers to the WAN and Internet service providers and a DMZ]
Solutions Management
Several management services are added in order to control the HDC: a
Security Manager to manage the firewalls and Contivity devices, and
Optivity* NMS, OSM, and QoS Policy Manager to monitor network
devices and establish QoS policies and packet prioritization.
[Figure: the completed headquarters design with management additions — Wireless LAN Security Manager, WLAN 2250 security management, Optivity NMS, Network Manager, Security Manager, and Content Manager — overlaid on the redundant Passport 8600, Alteon Switched Firewall, Contivity, SSL, Alteon Content Director, and Content Cache infrastructure]
Wireless LAN access can also be added with a WLAN 2220 wireless
access point, with security managed by the WLAN Security Switch (WSS)
2250. An adaptive Wireless LAN solution is also available, with Access
Ports (WLAN 2XXX) and the WSS 2270 providing security and end-user
roaming capabilities.
Several underlying protocols and services are used to maximize the
manageability and performance of this solution. DHCP is used to provide
an IP address, network mask, and default gateway to each device.
Users are used to having phone service maintained during power outages
for emergency calls, including 911. This requires the consideration of
Power over Ethernet (POE) (802.3af), which is a new standard to provide
power to hard VoIP clients. There are a number of issues that have to be
addressed. Before picking a strategy, certain business and regulatory
aspects need to be considered including: 911 services, redundancy, heat
dissipation and power requirements.
Before POE, most VoIP phones got power from a power brick that was
plugged into a standard power outlet. In the case of a power failure, the
phone would be inoperable. If POE is implemented, it is important to make
sure that the power source for POE will continue to operate in the case of a
power outage.
When sizing POE requirements, a number of additional issues need to be
taken into consideration, including redundancy, survivability, power draw,
heat dissipation, and air conditioning. POE is usually implemented in the
wiring closet, where the last three items are often overlooked, which
creates other problems and additional cost.
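The sizing arithmetic above can be sketched as a simple power-budget check. This is an illustrative calculation, not a vendor tool: the 15.4 W figure is the 802.3af per-port maximum, while the per-phone draw, port count, and PSU capacity used in any example call are hypothetical inputs.

```python
# Hypothetical PoE power-budget sketch (names and figures are illustrative).
# 802.3af allows up to 15.4 W at the PSE port; a switch without per-port
# power measurement typically reserves the full class budget per port,
# even if the phone actually draws much less.

IEEE_802_3AF_PORT_W = 15.4  # max PSE output per port under 802.3af

def poe_budget(phones: int, draw_per_phone_w: float, psu_capacity_w: float,
               reserve_full_class: bool = True) -> dict:
    """Return required power and whether the PSU covers it.

    reserve_full_class=True models switches that allocate the full
    802.3af per-port budget rather than the measured draw.
    """
    per_port = IEEE_802_3AF_PORT_W if reserve_full_class else draw_per_phone_w
    required = phones * per_port
    return {
        "required_w": required,
        "fits": required <= psu_capacity_w,
        "headroom_w": psu_capacity_w - required,
    }
```

For example, 48 phones at full class budget require 48 × 15.4 = 739.2 W, so an 800 W supply fits with little headroom; remember that this supply must itself be on protected power to preserve dial tone during an outage.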
IP clients can be assigned IP addresses in basically three ways: statically,
partial DHCP, and full DHCP. A static IP strategy is the most secure, but
also the most costly and cumbersome to implement. A full DHCP strategy
is the least costly and easiest to use from the user perspective, but it does
introduce some security risk. VoIP and multimedia clients can use an
existing data DHCP server, or a separate DHCP server can be provisioned
for VoIP and multimedia applications; a separate server is preferable for
security and performance reasons.
Furthermore, the VoIP and multimedia DHCP server should be placed in a
Secure Voice zone with other VoIP and multimedia components such as
Call Servers and application servers to limit access.
3. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
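The recommendation to keep the voice scope separate from the data scope can be checked mechanically when planning addressing. A minimal sketch using the standard library (the subnets in the example are invented, not values from this design):

```python
# Check that a dedicated voice DHCP scope does not overlap the data
# scope, in line with placing VoIP components in a Secure Voice zone.
import ipaddress

def scopes_disjoint(data_scope: str, voice_scope: str) -> bool:
    """True if the voice and data DHCP scopes share no addresses."""
    data = ipaddress.ip_network(data_scope)
    voice = ipaddress.ip_network(voice_scope)
    return not data.overlaps(voice)

# Illustrative plan: data on 10.1.0.0/16, voice on 10.2.0.0/16.
print(scopes_disjoint("10.1.0.0/16", "10.2.0.0/16"))
```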
Support two existing major sites (New York and San Francisco) that have
a traditional PBX with 12,000 existing TDM users, of which 2,000 need to
be mobile.
Support a new campus in Los Angeles for 10,000 new users with
geographical redundancy.
The new campus in Los Angeles is the new corporate
headquarters. The customer wants to maintain the rich set of
telephony features they currently have in their voice network but
move to IP Telephony. They are looking for a fully distributed
solution with geographical redundancy, defined as the ability to
distribute redundant call servers in different locations to provide
site redundancy.
This site will be required to support 10,000 total users. The
customer has determined that 2,000 of the users will receive
sufficient service from digital sets that were freed up from the
existing PBX site.
Support a new regional office in Chicago with 800 users.
The new regional office in Chicago has a requirement for 800
users, and the customer wants the rich set of telephony features
they are already accustomed to, implemented on a pure IP
Telephony platform. They want network transparency and a
common dialing plan for their VoIP network.
Support a growing branch network of twenty to thirty users per
branch.
The customer's new branch office requirement is for much
smaller sites in a different division of the company. They are
looking for a centralized call-processing approach with the same
feature set as the rest of the division due to the mobility of the
users.
The customer's existing branch offices are supported by
traditional key systems. The customer wants to upgrade these
sites to VoIP-capable systems integrated into the overall
network, and to consolidate all voice and data requirements in
the branch into a single platform. This includes data and voice
services such as routing, VPN, VoIP, TDM set support, Call
Center, Unified Messaging, and web support.
In this case, we will assume the customer recently acquired a
company with an installed Norstar* key system that can easily
be upgraded to BCM, maintaining the majority of the
investment in technology, training, and support.
Call Server
The call server provides the basic telephony services traditionally found in
a PBX, along with the new services required to deal with an IP infrastructure.
[Figure: Communication Server 1000E campus deployment — firewall, network, gateway, call server, and signaling server supporting up to 15,000 predominantly IP users]
Regional office
The regional office similar to the new campus has the advantage of building
a new network that can accommodate the requirements and bandwidth of a
real-time network. However, special consideration has to be taken as it will
be considered a smaller site than the campus and, therefore, the tendency is
to take shortcuts and cost cutting measures when building a network for a
location like this.
While this may be acceptable for a pure data network, a real-time voice
network always has to be built to the highest specifications if quality and
connectivity are the goal. This is a design goal that is built into voice
networks and never questioned: in general, a voice network is built to
meet specific bandwidth requirements to carry a specific load of traffic.
Data networks, by contrast, were traditionally built to a constrained
budget, on the assumption that, given the difference between LAN and
WAN technologies, users would never have enough WAN bandwidth.
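The idea that a voice network is engineered to carry a specific load is classically expressed with the Erlang B formula, which gives the probability that a call is blocked when a given traffic load (in erlangs) is offered to a given number of trunks. A brief sketch of the standard recurrence (any traffic figures used with it are hypothetical examples, not values from this design):

```python
# Erlang B blocking probability via the numerically stable recurrence
#   B(0) = 1,  B(n) = A*B(n-1) / (n + A*B(n-1))
# where A is the offered load in erlangs and n the number of trunks.

def erlang_b(offered_erlangs: float, trunks: int) -> float:
    """Probability a call is blocked on `trunks` circuits at load A."""
    b = 1.0
    for n in range(1, trunks + 1):
        b = offered_erlangs * b / (n + offered_erlangs * b)
    return b
```

For instance, one erlang offered to a single trunk is blocked half the time (B = 0.5), while adding trunks drives blocking down rapidly; trunk groups are sized so that blocking stays below a target grade of service.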
[Figure: Communication Server 1000S regional office deployment supporting 100 to 1000 predominantly IP users]
IP Call Center
The CS 1000B is recommended for the distributed Symposium IP Call
Center sites. The CS 1000B is a uniquely configured branch solution that
allows the Symposium call center to be distributed to remote sites. The IP
sets on the CS 1000B are redirected to appear on the central site system, in
this network the CS 1000E.
While frame relay can carry VoIP, doing so is a challenge: frame relay
offers no real QoS, only congestion notification. Therefore, special
engineering and care are needed if you plan to implement VoIP over a
frame relay network.
The new branch network is shown in Figure 23-14. As can be seen, there
are links of various speeds depending on the site location. When dealing
with frame relay, you need to be concerned with a number of issues,
including the speed of the service, access rate, segmentation, shaping,
policing, pacing, and PVC allocation.
Branch Offices
In this case, the Nortel solution would be the Survivable Remote Gateway
(SRG). It is recommended that the frame relay network be built as a full
mesh with separate PVCs for VoIP to assure the best possible QoE. It is
further recommended that all remote sites' frame relay channels be put on
a full T-1 rather than a fractional T-1. Adhere to proper shaping and pacing
to keep the ingress side of the frame relay network from applying policing
actions that may mark offending packets Discard Eligible (DE) or discard
them outright.
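The shaping and policing terms above follow simple frame relay arithmetic: given a committed information rate (CIR) and a committed burst Bc, the carrier measures traffic over intervals of Tc = Bc / CIR, marks traffic between Bc and Bc + Be Discard Eligible, and drops anything beyond. A sketch with illustrative numbers:

```python
# Frame relay shaping/policing arithmetic (all figures illustrative).
# CIR is in bits per second; Bc (committed burst) and Be (excess burst)
# are in bits per measurement interval Tc.

def shaping_interval_s(bc_bits: int, cir_bps: int) -> float:
    """Tc, the committed-rate measurement interval, in seconds."""
    return bc_bits / cir_bps

def classify_burst(bits_in_tc: int, bc_bits: int, be_bits: int) -> str:
    """Carrier policing outcome for traffic offered within one Tc."""
    if bits_in_tc <= bc_bits:
        return "conforming"
    if bits_in_tc <= bc_bits + be_bits:
        return "discard-eligible"   # marked DE, dropped first on congestion
    return "dropped"
```

With CIR = 64 kbps and Bc = 8,000 bits, Tc is 125 ms; shaping the VoIP PVC so that it never exceeds Bc per Tc keeps voice packets out of the DE category, which is exactly what the pacing recommendation above is protecting.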
Gateways
Gateways basically provide some type of protocol conversion function.
Gateways in IP Telephony provide the same functionality to legacy
environments and to different connection protocols. There are proprietary
and third party gateways. Some of the basic gateway functions that may be
required to be performed are as follows:
VoIP to TDM phone
VoIP to PSTN Facilities
VoIP to Legacy Applications
VoIP to Legacy (TDM) terminals
SIP to H.323
H.323 to SIP
Currently, there are two major and competing connection protocols:
Session Initiation Protocol (SIP), developed by the Internet Engineering
Task Force (IETF), and H.323, developed by the International
Telecommunication Union (ITU). Calls can be connected from an H.323-
based system to a SIP-based system through an H.323-to-SIP gateway;
however, only a small set of basic call features is supported.
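The limited feature set of cross-protocol calls follows from how little of each protocol maps cleanly onto the other. The sketch below is a simplified, illustrative mapping of basic call-setup messages only; a real gateway also handles capability negotiation (SDP and H.245), mid-call signaling, and failure cases, and the function name here is invented for illustration.

```python
# Simplified, illustrative mapping of basic SIP call-setup messages to
# their rough H.225 equivalents. Anything outside basic call setup and
# teardown has no direct counterpart, which is why only a small feature
# set survives an H.323-to-SIP gateway.
SIP_TO_H225 = {
    "INVITE": "Setup",
    "180 Ringing": "Alerting",
    "200 OK": "Connect",
    "BYE": "Release Complete",
}

def translate_sip(message: str) -> str:
    """Return the roughly equivalent H.225 message, or raise for
    SIP features with no basic-call equivalent."""
    try:
        return SIP_TO_H225[message]
    except KeyError:
        raise ValueError(f"no basic-call mapping for {message!r}")
```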
The following table lists the Nortel gateway options.
Clients
The customer has a requirement for a number of different clients and
services for all their sites. They prefer a ubiquitous service offering; that
is, one that is seamless across the network. This implies support for many
types of clients and facilities, including wired IP, soft IP, TDM, wireless
IP, and PDAs. It also includes TDM trunking and SIP and H.323 IP
trunking.
Figure 23-15: Support clients and facilities — wired IP, soft IP, wireless IP, PDA, and TDM clients, plus TDM trunking and SIP & H.323 IP trunking: any place, any time, any device
The decision between standards-based SIP or H.323 phones and
proprietary solutions that interoperate with SIP and H.323 is hotly
debated. Some believe that standards-based clients are the only solution,
but these solutions, while reducing cost, generally lack the rich feature set
that proprietary clients provide.
IP clients comprise a number of devices, including hard clients (IP
phones), soft clients (PC clients), wireless clients (PDAs), and multimedia
clients that may support VoIP, video, and Instant Messaging.
Deployment issues for these devices depend on a number of factors, but
generally should be governed by a well-thought-out QoS and security
policy. The QoS strategy should include tagging at either Layer 2 (802.1p)
or Layer 3 (DiffServ). This not only provides QoS on the LAN, but also
allows you to map these priorities to core technologies in the backbone.
Additionally, the voice or multimedia flows can be separated onto
separate physical subnets or separate VLANs.
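At Layer 3, an endpoint can mark its own packets by setting the IP TOS byte so that voice traffic carries the DSCP value for Expedited Forwarding (46). The socket option below is a standard OS facility, but the marking only helps if switches and routers are configured to honor it; the function name is illustrative.

```python
# Sketch of Layer 3 DiffServ marking from an endpoint. The DSCP sits
# in the top six bits of the legacy TOS byte, so EF (46) becomes
# 46 << 2 = 0xB8 on the wire.
import socket

DSCP_EF = 46                 # Expedited Forwarding, commonly used for voice
TOS_BYTE = DSCP_EF << 2      # DSCP occupies the upper 6 bits of the TOS byte

def make_voice_socket() -> socket.socket:
    """UDP socket whose outgoing packets are marked DSCP EF."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_BYTE)
    return s
```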
Clients
There are no best options here, as Nortel supports a plethora of clients,
including analog, digital, IP, wireless, PDA, and third-party devices on all
call server systems. The hard clients get their software load from the call
server they log into and can therefore be used on different platforms with
no changes.
Applications
Applications cover both traditional applications found in telephony, such as
conferencing, unified messaging and call center, along with new
applications found in the multimedia revolution, which include instant
messaging, desktop video, collaboration, follow-me services and
customized personal control (Personal Agent).
Applications can be implemented in both centralized and distributed
architectures. The choice between the two is a complex decision based on
a number of cost, performance, and scaling issues. For instance, in the
case of unified messaging, you need to evaluate the overall requirement
for individual sites and for all sites together to first determine whether a
single system could be used. If a single application platform can support
the overall requirement and scale to the projected growth of the network,
the next step is to determine the network bandwidth required to support a
distributed environment, along with the cost and performance.
If a distributed application approach is required, there are a number of
additional issues. The first is to measure the cost of duplicated services as
opposed to a centralized approach. The second issue is to determine the
complexity of networking multiple application servers over the network to
provide the same level of service and performance as a single server.
Network bandwidth will have to be evaluated even though it should not be
near the load of a centralized solution.
Sometimes the requirements, properly evaluated, make the decision
easier. For instance, where two major sites require a total of two servers to
service the load and each site can be serviced by a single server, the
technically obvious choice is a distributed environment. Note, however,
that this does not take into account the cost of managing and maintaining
the system, which should also be considered.
The basic nature of application servers in a VoIP and multimedia
environment naturally challenges the ability to deliver a high level of
QoE. Most application servers, due to their requirements or architecture,
break down the end-to-end nature the Internet was built on. A Unified
Messaging system, for example, has to store and forward messages; in the
case of VoIP, this can cause double transcoding, which in turn increases
the demand for the network to be loss-free and error-free with minimal
delay and jitter.
Both centralized and distributed approaches have their benefits, depending
on the requirements. However, a long standing computer paradigm has
been to use bandwidth instead of computing cycles whenever possible to
minimize the complexity of the system.
Unified Messaging
When designing a network for a centralized unified messaging platform,
much care should be taken into account on the architecture of the network
and of the unified messaging platform. Most VoIP networks today are
financially based on implementing the G.729 codec (CELP) for bandwidth
savings. This codec is known to be near toll quality performance, but that is
based on a single transcoding.
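The bandwidth savings can be seen with back-of-envelope arithmetic. Assuming 20 ms packetization and 40 bytes of IP/UDP/RTP headers (and ignoring Layer 2 overhead and silence suppression), G.711 at 64 kbps costs about 80 kbps per call at the IP layer, while G.729 at 8 kbps costs about 24 kbps:

```python
# Per-call IP-layer bandwidth for a voice codec, assuming 20 ms
# packetization and 40 bytes of IP+UDP+RTP headers per packet.
IP_UDP_RTP_HEADER_BYTES = 40
PACKET_INTERVAL_S = 0.020

def call_bandwidth_kbps(codec_rate_kbps: float) -> float:
    """IP-layer bandwidth of one voice stream for a codec payload rate."""
    payload_bytes = codec_rate_kbps * 1000 / 8 * PACKET_INTERVAL_S
    packet_bytes = payload_bytes + IP_UDP_RTP_HEADER_BYTES
    return packet_bytes * 8 / PACKET_INTERVAL_S / 1000
```

The header overhead dominates the compressed codec: G.729 carries only 20 payload bytes per 60-byte packet, which is why the threefold-plus saving over G.711 is attractive enough to tolerate the transcoding issues discussed here.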
Many store-and-forward application servers, such as a unified messaging
platform, may introduce an anomaly called multiple transcoding. This is
where the message is transmitted across the network in G.729, transcoded
back to G.711 when it hits the application platform, and then stored in
another compression mode. When the message is picked up, it may be
compressed back to G.729 for transmission on playback. Depending on
the quality of the network, voice quality may degrade substantially.
In this case, you will want to design the network to be able to either record
or play back at G.711, thereby eliminating a transcoding stage. Another
solution is to store the message in the original algorithm or as an RTP
stream; however, this is in the domain of product enhancements, and such
solutions still have a number of limitations and are generally not well
implemented at the time of this writing. The purpose of this discussion is
to shed light on some implementation issues and potential solutions for
unified messaging architectures, not to discuss the relative advantages and
disadvantages of those architectures.
Unified Messaging
CallPilot is a unified messaging tool that utilizes speech recognition and
TCP/IP digital networking to give complete access and total control of fax,
e-mail and voice messages. Using simple voice commands, like “play” or
“print,” a user can remotely manage their multimedia communications over
the telephone. The user can print faxes, store or delete voice messages and
more just by speaking.
Call Center
As discussed before, call centers are generally customer facing and key to a
company’s success. Therefore, IP Call Centers need to be implemented
with the highest priority for these VoIP flows. Additionally, they should be
made as robust and resilient as possible assuring that there is sufficient
bandwidth along with load sharing appliances to assure maximum
performance. A Call Center is a high candidate for a separate IP network
for the application to assure maximum quality.
It should be realized that once an IP network becomes congested, routers
begin discarding packets and TCP congestion-control algorithms back off
to preserve the network. This is very disruptive to real-time protocols,
which cannot retransmit lost media. One way to protect against this is to
put critical VoIP network requirements on their own network.
Call Center
Symposium Call Center is recommended as the call center solution.
Symposium is an industry leading solution that traditionally is considered a
centralized solution associated with a single Call Server. The unique configuration of the CS 1000B allows Symposium to be implemented as a distributed IP Call Center application.
Multimedia
It should be noted that while multimedia applications are considered real-time applications similar to VoIP, there are differences. These were originally identified in the ATM AAL service-class analysis of applications, which defines the distinguishing attributes as the timing relationship required between source and destination, whether the bit rate is constant or variable, and whether the connection mode is connection-oriented or connectionless.
Multimedia requirements are diverse in their bandwidth and timing
relationships; however, for the most part VoIP is the most demanding when
it comes to the timing relationship. Therefore, if voice is expected to be
near toll quality, it will require being given the highest priority in the
network over all other applications.
For instance, in a true multimedia call between users, where VoIP, video, instant messaging, application collaboration, and FTP may all be happening simultaneously, VoIP is the only service whose degradation puts the entire session in jeopardy. Minor degradation in the quality of the other services is generally a mild distraction, and the video can often be turned off to free bandwidth for other applications and turned on only when applicable.
The MCS5100 Multimedia Call Server
Our VoIP offering is enhanced by multimedia services, which remove the
barriers of distance and location with applications including video
conferencing, instant messaging, collaborative whiteboarding, and
dynamic call handling. These applications are delivered by the MCS5100.
The MCS5100 is designed as a Multimedia Call Server that can provide multimedia services as an overlay to any customer's existing voice network, or a total voice and multimedia solution for new customer requirements. The MCS5100 can act as a standalone system providing a basic set of telephony features available with the SIP protocol and can, additionally, overlay existing PBX solutions with an option called converged desktop.
[Figure: the converged desktop option, with an existing PBX and its current phones linked through a SIP/PRI gateway to the multimedia PC client]
send them an instant message that may appear on the user’s phone, PC,
pager, or whatever the user’s preference is currently set to.
The end user should have the capability of connecting to the conference bridge from a PC soft client, mobile phone, or office phone. The last stage of
a call usually involves discussing or sharing a document. Instead of
collecting FAX numbers and e-mail addresses, the conference chairperson
can send files, push web pages, and share a whiteboard application with
other conference participants.
A highly mobile worker deals with coworkers in several different regions
globally. A worker should be able to communicate with others when they
are in the office using their phone and PC, or when they are at home using
their PC on a cable modem or DSL line, even from a hotel with Internet
access or a web terminal. Communications over the public network should
be protected utilizing secured, encrypted VPN technology.
If this were a Nortel solution, the MCS 5100 would provide Instant
Messaging, conference bridging, whiteboarding and follow-me voice
services.
Meet Me Conferencing
In addition to the base service's ad hoc conferencing capability, an optional "Meet Me" Media Conferencing application is available. Meet Me Media
Conferencing should require no reservations and utilize soft DSP
technology, which reduces the cost and footprint when compared to TDM
based in-house conferencing. Users access the service with a dial-in
number and passcode just like many used today with TDM-based
outsourced conferencing.
The service should support both the G.711 and the G.729 codec to
accommodate lower throughput networks or DSL access. For Enterprises
currently outsourcing their conferencing, Meet-Me Media conferencing
can produce immediate, significant savings.
If this were a Nortel solution, the MCS 5100 provides a highly scalable audio conferencing solution with visual notifications to the chairperson of conference activity and, from an ROI perspective, offers significant savings to an Enterprise that may today be obtaining services through a third-party provider. Unlike most conferencing services, the chairperson is notified of participants entering and leaving the call, so there is no more asking who joined in the middle of the conversation. The solution will support point-to-multipoint video conferencing by late 2004.
Personalized management
A personal agent is a web-based portal for accessing all of the advanced
features listed here. Customizable settings include all of your contact
Figure 23-19: Personal agent (showing the Personal Agent portal and its flexible access options)
Presence is a key feature of doing mobility well. Presence is the concept of
a system treating you as one user with multiple devices. Instead of multiple
phone numbers, addresses, and separate services, the user has a single set
of services, coordinated across their multiple devices.
The Automatic Presence feature can be enabled so that if you do not touch the keyboard or mouse for a selectable period of time, your presence status changes to offline; everyone who has you as a "friend" will then see that you are offline.
For example, if a manager has a little phone icon next to their status, you know they are on the phone and your call will go to voicemail. If you need a quick answer to an easy question, you can instead send an IM and get the answer right back. This allows questions to be resolved while you are still on the phone with the person who asked, reducing multiline phone interruptions and the number of voicemail messages that need to be returned.
MCS 5100 Call Manager allows preferences to be set on how to handle calls. Though an administrator can set defaults and step in when needed, this is designed to be set up by the end user, and it is very intuitive and easy to use. Different profiles can be set up based on whether you are in or out of the office, allowing calls to be routed differently. For instance, calls from your family can be treated differently so that they reach you wherever you are, or perhaps go automatically straight to voice mail.
Utilizing the presence concept, the MCS 5100 provides you a visual
indication of the status of your close contacts (that is, “Friends”). Time can
be saved by looking at their status and availability, helping to decide the
best way to communicate with them at any given time.
Custom applications
Custom applications are the ability to take a multimedia system and leverage the advantage of a standards-based solution, including "killer apps" (applications) that may be developed by a third party. Custom applications are about taking a feature set like that of the MCS 5100 and customizing it for specific purposes.
Conclusion
A major advantage of the Nortel solution is that an existing Nortel
customer does not have to implement a “forklift” strategy to take advantage
of new offerings and capabilities offered in the new IP space. Additionally,
new non–Nortel customers are not necessarily forced into a full VoIP
solution if it is not required. As compelling a story as VoIP and multimedia are, it is becoming clear that not all customer voice requirements call for the flexibility and mobility that VoIP delivers. While
VoIP is probably the ultimate solution, it requires a substantial investment
in the data network to provide and guarantee the same quality and
bandwidth that a traditional TDM system provides. Until data networks can
be built without constrained bandwidth and all devices are QoS aware,
many customers may find it easier to implement traditional voice systems
for many of their voice applications.
Chapter 24
IP Television Example
Ed Koehler
Chris Busch
[Figure: layered view of real-time media transport, from the application perspective to the network: audio, video, and voice codecs over RTP with RTCP; session and media-related control via SIP, H.323, RTSP, and H.248/MGCP/NCS at the session gateway; QoS and resiliency provided by MPLS, ATM (AAL1/2, AAL5), Frame Relay, Ethernet, and Cable/DOCSIS, over SONET/TDM]
Introduction
In today’s carrier, service-provider, and Enterprise networks, the use of
advanced IP-based networks has become more commonly accepted as an
alternative to the more traditional modes of media transport.
The use of this technology for IP-based television has been gradually
increasing without fanfare in the industry at large. For some time, it has
enjoyed acceptance outside of North America, with early beachhead
deployments in both Europe and the Asia-Pacific Rim. Recently, however,
it has begun to generate increased interest from Local Exchange Carriers,
Regional Bell Operating Companies, and Internet Service Providers (ISPs),
as well as Metro Area Transport providers that are beginning to implement
Ethernet-to-the-User (ETTU) networks in residential high-rise
applications.
Enterprises are also finding that properly leveraging high-speed advanced
IP networks allows them to offset much of the cost of implementing a
traditional television headend and fiber/coax distribution system.
IP-based television places very stringent demands on the IP network. Not only is there the requirement for IP multicast capabilities in the network, but also the need for a robust and stable deployment, which has only recently arrived in the market.
In order to understand the improvements that had to be made, it is
important to review IP multicast from a generic perspective. This document
reviews the basics of multicast and then compares that generic working
architecture against the requirements of an IP-based television head end.
By comparing a generic IP multicast deployment against the application
requirements of the reference model, areas where optimization had to be
made can be highlighted.
was not. However, achieving it in a scalable and stable manner, with the
necessary flexibility, proved to be more difficult.
Because routers do not forward multicast or broadcast traffic, a method was needed to provide Layer 3 links across the routed boundary. All multicast routing protocols are methods of addressing this requirement.
DVMRP is a good all-around multicast technology. While it scores comparatively high in network overhead (a result of the routing-table update requirement inherent in vector-based routing protocols), it also scales relatively well and is well adapted to dense-mode networks. This is particularly true when DVMRP is implemented with the right routing policies and features. The newer PIM-SSM technology, however, raises expectations by showing promise of scale beyond that of any other protocol.
There is another consideration that is equally important: IP-based
television is a single-source multicast model. The premise of the
application is to make a single stream (the television channel, which is the
equivalent of an IP multicast group) available to multiple viewers. Both
DVMRP and PIM-SSM are source-driven or reverse-path implementations
of multicast. For this reason, the source of the multicast activity is always
the root of the network tree, unlike a shared-trees approach such as PIM-
SM, which starts the build of the multicast tree from an independent root
known as a Rendezvous Point (RP). For these reasons, both DVMRP and
PIM-SSM are logical choices as Layer 3 routing protocols for IP television
headend implementations.
roughly one half second. The LMQI is the amount of time between the
group-specific queries, and the robustness is a factor of expected data loss
on the network (high values mean high data loss, and, therefore, more
queries will be sent).
By this coordination of the two protocol environments, IGMP and
multicast routing (DVMRP and PIM-SSM), the state of the multicast event
is maintained. Figure 24-2 illustrates several stations on a common Layer 2
segment. One station is leaving multicast group 228.1.1.1, but there is
another client on the segment who is sourcing a membership report. This
signals to the edge router that there is still solicited interest in the channel,
and the edge router does nothing but continues to serve the stream for that
group. There is another client that is leaving 224.1.90.5. Because the router
does not see any other IGMP reports for that group after the standard
interval, it is pruned off of the segment. A new station request, such as the one for 224.1.1.1, will require a reverse-path Shortest Path Tree (SPT) to be set up as a join to the multicast group. Building the tree join incurs some latency: the farther the client is from the sending source, the longer the setup latency will be.
edge, where the IGMP control is accomplished along with the routing
functionality at the same interface.
1. Although some implementations (specifically those of ADSL providers) are looking to reduce this
requirement, others (like some cable applications) use more bandwidth for better image quality.
2. It should be noted that speeds as high as 6 or 8 Mbps are also used in Standard Definition Video.
3. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
4. The requesting client, and build the extension of the tree out to the new viewer.
5. To ease the configuration task of such tables, a centralized management system usually allows bulk
configuration for hundreds of switches with such entries in the Nortel Layer 2/3 core switch
implementations.
6. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
viewing the same channel and one of them changes to another channel,
both STBs will lose the channel. The second STB will source an IGMP
report for the channel and the edge router will join the port back onto the
event so the loss of video will be intermittent, but it could be for up to one-
half second. This is enough to cause issues. As a result, a feature known as
adjustable LMQI was developed, which allows for multiple STBs on a
single VLAN model. This allows for the fine-tuning of the LMQI value,
and its association with the IGMPv2 leave improves the handling of this
process.
In this scenario, when the first STB sends a leave, the Layer 2 switch
removes the STB port from the group. But the edge router still keeps the
stream active. At this point, the Last Member Query Interval (LMQI) timer
is started at the edge router. During this time, the edge router listens for any
report activity for the multicast group, following the IGMPv2 leave
process. In the case of the above example, STB2 will answer the group-specific query by sourcing an IGMP report within the interval. When the switch sees the
report, it continues to serve the stream and STB 2 does not experience a
service interrupt. If the LMQI were to expire and no reports were received,
the stream would be deactivated. Figure 24-6 illustrates these features.
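The timer arithmetic behind this behavior can be sketched as follows. The defaults shown (a 1000 ms LMQI and a robustness value of 2) follow the usual IGMPv2 defaults, and treating leave latency as a simple product is a worst-case simplification.

```python
def leave_latency_ms(lmqi_ms=1000, robustness=2):
    """Worst-case time the router keeps serving a group after a leave:
    it sends `robustness` group-specific queries spaced LMQI apart and
    prunes the stream only when none of them is answered."""
    return lmqi_ms * robustness

print(leave_latency_ms())             # IGMPv2 defaults: 2000 ms
print(leave_latency_ms(lmqi_ms=250))  # a tuned LMQI cuts this to 500 ms
```

Lowering the LMQI shortens the interval during which stale streams are served, at the cost of more query traffic on the segment.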
7. Please refer to the Disclosure Notice in the Section VI Introduction (page 498).
case. There are methods to use the RTSP port connection as noted below.)
The figure below illustrates these dialogs.
the case even though the service provider has already allocated 10.5 Gb/s to provision for 50% of the subscriber base. Clearly, there needs to be another option for VoD deployment.
All of this is symptomatic of centralized serving of the video-on-demand service: centralized server approaches are inherently prone to these issues. Distributing the server process out to the edge provides a much more scalable method for video-on-demand services, for two reasons. First, by distributing servers closer to the viewing base, the bandwidth aggregation factor is reduced in proportion to the reduction in the size of the subscriber base each server serves.
As an example, if the 7,000-subscriber network were served by ten servers instead of one, the subscriber base per server would be 700, yielding a bandwidth aggregation factor of 2.1 Gb/s at 100% provisioning. By provisioning for 50%, the bandwidth factor becomes a manageable 1.05 Gb/s, and only 350 subscribers are thrown to the wolves with the possibility of service refusal during the average program run. Now multiply the number of servers by four, so that the 7,000-subscriber network is served by forty VoD servers. Each server will then provide streams to 175 subscribers, for a bandwidth aggregation factor of a quite manageable 525 Mb/s. In this scenario, the service provider can easily provision for 100% of the viewing base, resulting in a true real-time video-on-demand service offering.
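The arithmetic of this example can be reproduced with a short sketch. The 3 Mb/s per-stream rate is an assumption chosen to match the chapter's figures; it also reproduces the 10.5 Gb/s quoted earlier for 50% of a 7,000-subscriber base on a single server.

```python
STREAM_MBPS = 3.0  # assumed per-subscriber VoD stream rate

def aggregate_gbps(subscribers, servers, take_rate=1.0):
    """Peak bandwidth each VoD server must source, in Gb/s, when
    `take_rate` of its share of subscribers watches simultaneously."""
    per_server = subscribers / servers
    return per_server * take_rate * STREAM_MBPS / 1000

print(aggregate_gbps(7000, 1, 0.5))   # centralized: 10.5 Gb/s
print(aggregate_gbps(7000, 10))       # ten servers: 2.1 Gb/s each
print(aggregate_gbps(7000, 10, 0.5))  # ten servers at 50%: 1.05 Gb/s
print(aggregate_gbps(7000, 40))       # forty servers: 0.525 Gb/s each
```

The per-server burden falls linearly with the number of servers, which is exactly the distribution effect the text describes.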
The second aspect to consider is that by distributing VoD servers out towards the edge, the RTSP control-flow loop is likewise shortened. This improves the user experience and reduces the burden on each server because of the smaller subscriber base it supports. The figure below shows a network topology in which the VoD servers are placed at the aggregation point closest to the subscriber edge. As shown, the viewing population served by each server is reduced, and the video-on-demand traffic burden is largely lifted off the core of the network, freeing it up for multicast video as well as other real-time services such as VoIP.
Any other requests for that piece of content would be served from the edge.
After a given period of time, provided no one else requests the content, the
suffix would be purged from the edge server, whereas the prefix would
remain cached at the edge for the next request.
The distribution of content is not in itself sufficient to address the complete demands of a client's request for media. There needs to be a way to route the client's request to the appropriate video server. When a video asset is published into the video server system, metadata is typically created. This metadata may provide the content description and duration, as well as the stream speed and the appropriate player call. Another item created is a URL for the content request. By providing an RTSP redirection to the local edge server for each VoD request, the content distribution model described previously can be leveraged.
Appendix F contains further details on the Web Streaming methods and
practices.
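A minimal sketch of the RTSP redirection described above might look like the following. The host and asset names are invented for illustration, and a real request router would choose the edge server from its topology and content-placement data.

```python
# Hypothetical request router: answer a client's RTSP request with a
# 302 redirect pointing at the nearest edge VoD server. The host name
# and content path below are invented for illustration.
def redirect_response(cseq, edge_host, content_path):
    return (
        "RTSP/1.0 302 Moved Temporarily\r\n"
        f"CSeq: {cseq}\r\n"
        f"Location: rtsp://{edge_host}/{content_path}\r\n"
        "\r\n"
    )

print(redirect_response(2, "edge-vod-01.example.net", "assets/movie42.mpg"))
```

The client then re-issues its DESCRIBE/SETUP sequence against the edge server named in the Location header, keeping the unicast stream off the network core.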
VoD bandwidth
To understand bandwidth in a Cable MSO network, we will describe how bandwidth is determined in the last mile and, therefore, how services are constrained for VoD.
A 6 MHz North American cable channel modulated using 256 QAM yields 38 Mb/s of derived bandwidth. Therefore, at MPEG-2 encoding qualities, it is reasonable to assume each channel carries ten VoD services, each equal to 3.8 Mb/s.
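That channel arithmetic can be checked with a one-line calculation, using the rates assumed above:

```python
def vod_services_per_channel(channel_mbps=38.0, service_mbps=3.8):
    """Whole VoD streams fitting in one 6 MHz 256-QAM channel."""
    return int(channel_mbps // service_mbps)

print(vod_services_per_channel())  # 10
```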
VoD narrowcasting
A Cable network, as with any network, requires segmentation in order to
scale to the subscriber base and offer additional bandwidth efficiency.
Cable networks accomplish segmentation via RF combiners in the Hubs
serving subscribers. If 5-550 MHz of frequency is always present on one
coax feed, you can “insert” higher frequency ranges at an individual hub
for “narrowcasted services.”
Example:
Broadcast and Inband set-top services are always present and carried to all
hubs as the “flat” 5-550 MHz network. In the Hubs, we choose to add VoD
into the 550-750 MHz range.
We do so by contributing the services to the proper RF channels, then combining them with the "flat" 5-550 MHz network, thereby delivering 5-750 MHz with the 550-750 MHz range unique to this hub alone. This method of micro-segmentation of the frequency plant is known as spatial reuse; the term is appropriate because we reuse sections of the 550-750 MHz space for every narrowcast segment created.
Summary
This chapter has covered the major aspects of unidirectional video transport utilizing IP networking technology. First, the multicast delivery of video
content was covered with a review of industry prevalent methods and
directions for multicast technology. Many optimizations need to be made to
the standard Internet Service Model (ISM) for multicast. Among these are
DVMRP and PIM multicast static routes, IGMP snooping and timing
optimization, as well as access control and management. The reader should
be able to discuss these enhancements and how they provide drastic
increases in the performance profile of the multicast service model for the
support of IP-based television services in a noncable network topology
environment. The reader should also be able to discuss dense versus sparse
mode multicast models, as well as single source modifications to sparse
mode multicast.
Aspects of unicast video were also discussed. First, standards-based Video
on Demand services that utilize RTSP/RTP transport were covered. The
mechanics of each protocol were discussed and how they relate to the
actual service from the user, client, or set-top box perspective. Traffic engineering concerns were also discussed, along with the different bandwidth demands that Video on Demand services place on the IP network compared with multicast video delivery. Centralized versus distributed
Video on Demand architectures were discussed with a comparison of
bandwidth demands for each.
Finally, VoD service offerings within the cable provider environment were
covered. Cable networking communication paths were discussed with
particular emphasis on Video on Demand. The reader should be able to
explain how IP-based VoD services are ‘overlaid’ onto the QAM transport
that is used in the CATV network. The reader should also be able to
describe the initialization of the set-top box and how it is brought up on line
to the cable service offering.
Appendix A
Additional Details about TDM
Networking
SONET/SDH hierarchy
Knowing how many voice channels fit into each level of the hierarchy allows the total payload of each system to be calculated. Moving up the chart in Figure A-1, for example, we see:
DS1: 24 channels
DS3: 24 channels X 28 DS1/DS3 = 672 channels
OC-3: 24 channels X 28 DS1/DS3 X 3 DS3/OC-3 = 2,016 channels
OC-12: 24 channels X 28 DS1/DS3 X 12 DS3/OC-12 = 8,064 channels
OC-192: 24 channels X 28 DS1/DS3 X 192 DS3/OC-192 = 129,024 channels
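The multipliers above can be captured in a short helper; it assumes the SONET convention that an OC-n carries n DS3 payload equivalents.

```python
# Voice-channel capacity of the digital hierarchy, per Figure A-1.
DS0_PER_DS1 = 24  # voice channels per DS1
DS1_PER_DS3 = 28  # DS1s per DS3

def channels(oc_n):
    """DS0 voice channels carried by an OC-n (n DS3 equivalents)."""
    return DS0_PER_DS1 * DS1_PER_DS3 * oc_n

for n in (3, 12, 192):
    print(f"OC-{n}: {channels(n):,} channels")
# OC-3: 2,016  OC-12: 8,064  OC-192: 129,024
```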
Appendix B
RTP Protocol Structure
The structure of the RTP packet is shown in Figure B-1. The following
paragraphs give a detailed description of each field.
[Figure B-1: RTP packet structure, carried within a UDP packet. First 32-bit word: V(2), P, X, CSRC count, M, payload type, and sequence number; second word: timestamp; followed by the source identifiers (CSRC list), with any padding counted in the final octet]
length of the variable elements depends on the packet type, but it must end
on a 32-bit boundary. The alignment requirement and a length field in the
fixed part of each packet are included to make RTCP packets "stackable". This means that multiple RTCP packets can be concatenated to form a
compound packet to be sent as a single packet of the lower layer protocol,
such as UDP. No separators are needed. An example of a compound RTCP
packet as produced by a mixer is shown in Figure B-2.
[Figure: a compound RTCP packet produced by a mixer: an SR with sender and receiver report blocks, an SDES packet carrying CNAME, PHONE, and LOC items for the SSRCs of site 1 and site 2, and a BYE packet with a reason, all stacked within a single UDP packet]
Figure B-2: Example of an RTCP compound packet
Appendix C
Additional Information on Voice
Performance Engineering
This appendix provides additional details and discussion of jitter and the
jitter buffer.
A dedicated voice packet network in a steady state (that is, carrying continuous voice calls with no silence suppression, no data load, and no congestion) will have a quasi-static flow pattern. Thus, there will be no packet jitter. There will be a distribution of delay across the various calls, but the delay will be invariant for any particular call. In a changing flow pattern, where voice calls are being set up or cleared down, or where silence suppression is in use, the changing instantaneous load versus the output link speed gives rise to changing contention for the output link, resulting in homogeneous jitter. When data is added (with associated forwarding classes to prioritize voice and data traffic), the relative traffic load of each forwarding class versus the output link speed gives rise to changing contention for the output link, resulting in heterogeneous jitter.
Figure C-1 shows the relationship between the loading (utilization) of a
link and the amount of jitter experienced on the delay. Note that lower
speed links have generally higher jitter at all values of utilization, and that
lower speed links also show inflated jitter at lower loading than do high-
speed links. This reflects statistical smoothing of the traffic for high-speed
links.
Network jitter
Network jitter refers to the jitter in a core network, which generally uses high-speed links (>= 10 Mb/s). Jitter is no longer significant above 10 Mb/s, provided the post-90% loading asymptote is not reached. In situations where the statistical multiplexer output link loading is less than 90%, jitter is bounded to a few milliseconds and has negligible or no impact on voice quality. Where statistical multiplexer output link loading is not controlled (that is, there is no admission control or under-provisioning), loading is unbounded (>90% to 100%) and, therefore, jitter is unbounded; the delay
can rise asymptotically. Voice quality becomes unpredictable and unstable, especially as packets are being dropped. For every 10 ms of additional jitter, voice quality degrades by 0.5R at a delay of 150 ms, 1R at 200 ms, and 1.3R at 250 ms (delay here is the one-way mouth-to-ear delay).
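The quoted degradation slopes can be turned into a rough calculator. Holding the slope constant between the stated delay points is an assumption; the text gives only the three anchor values.

```python
# R-factor loss per 10 ms of jitter at each quoted mouth-to-ear delay.
SLOPES = [(150, 0.5), (200, 1.0), (250, 1.3)]  # (delay ms, R per 10 ms)

def r_degradation(jitter_ms, delay_ms):
    """Approximate R lost to `jitter_ms` of jitter at a given one-way
    delay, using the slope for the highest quoted delay point reached."""
    slope = SLOPES[0][1]
    for d, s in SLOPES:
        if delay_ms >= d:
            slope = s
    return (jitter_ms / 10.0) * slope

print(r_degradation(20, 150))  # 1.0 R
print(r_degradation(20, 250))  # 2.6 R
```

The same jitter is thus twice as costly on a 250 ms path as on a 150 ms path, which is why jitter budgets must tighten as end-to-end delay grows.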
The potentially deleterious effects of homogeneous jitter and packet loss are only adequately resolved by bounding the % load, using network
Access/source jitter
Access jitter refers to the jitter in the access network that generally uses
low-speed links (< 10 Mb/s). As the data loading increases relative to the
voice, the probability that a data packet is in the process of transmission
increases, and the voice jitter increases, even with strict priority of voice
over data. For a given relative voice/data loading, as link speed drops, long
data packets take more serialization time, scaling the voice jitter. Jitter in
all low-speed packet access networks (cable, Enterprise, xDSL) can dwarf
network jitter in high-speed networks by several orders of magnitude.
Depending on the access link speed, the potentially deleterious effects of
heterogeneous jitter may be bounded by limiting data load, and segmenting
and/or preempting long data packets.
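The serialization effect described here is easy to quantify. The packet and link sizes below are illustrative; 256 kb/s matches the DSL simulation conditions shown in the figures that follow.

```python
def serialization_ms(packet_bytes, link_kbps):
    """Milliseconds a packet occupies the link. Even with strict voice
    priority, a voice packet can wait this long behind one data packet
    that is already being transmitted."""
    return packet_bytes * 8 / link_kbps

print(serialization_ms(1500, 256))     # ~47 ms: full-size frame on 256 kb/s DSL
print(serialization_ms(1500, 10_000))  # 1.2 ms on a 10 Mb/s link
```

This is why segmenting or preempting long data packets matters on low-speed access links but not on high-speed core links.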
[Figure: jitter distribution as a function of changing % voice load; voice traffic through a single-queue statistical multiplexer (E/D/1), 10 ms G.729 voice packets with silence suppression. Instantaneous source jitter delay reaches 94 ms. Simulation conditions: 256K DSL links, 24 calls on AAL2, CU = 1 ms]
Figure C-2: Jitter delay distribution on a congested link (> 90% average
voice load)
[Figure: the same simulation (256K DSL links, 24 calls on AAL2, CU = 1 ms, G.729 10 ms with silence suppression) with a congestion control mechanism in place; jitter drops to 3 ms]
[Figure: top/down approach. In the QoE space, define the service QoE performance metrics and targets; in the network architecture (QoS) space, identify the network-level contributing factors and dependencies affecting the QoE metrics, then determine the QoS-enabled network architecture requirements and configuration. Iterate until the QoE targets are met, at which point the service QoE requirements are validated as provided by the QoS-enabled solution]
Figure C-5: Process for determining user level (QoE) and network level
(QoS) requirements
Performance metrics and targets defining QoE for the different telecom
services are in different states of development. The requirements for
interactive voice services are more or less completely understood, while the
requirements for browsing and remote applications remain undetermined.
Voice service users are interested in experiencing clear, noise-free, and
echo-free conversations. All the parameters contributing to the QoE of an
ordinary voice call have been combined in an industry standard model
(ITU-G.107, the E-Model). In order to provide an estimate of voice quality
based upon The E-Model, a set of sixteen input parameters is required to
generate its output factor—the transmission rating (R). Some of these
parameters depend on underlying packet network behavior, and various
methods assist in deriving estimates of these parameters using analytic or
simulation tools such as OPNET* Modeler.
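As a sketch of how R can be estimated when most inputs are left at their defaults, the following uses a widely cited simplification of the E-Model (R = 93.2 - Id - Ie_eff, with Id a piecewise-linear function of delay). Treat it as illustrative only, not as a substitute for the full sixteen-parameter model.

```python
def delay_impairment(d_ms):
    """Id: impairment from one-way mouth-to-ear delay (ms), using the
    common piecewise-linear simplification of ITU-T G.107."""
    extra = 0.11 * (d_ms - 177.3) if d_ms > 177.3 else 0.0
    return 0.024 * d_ms + extra

def r_factor(delay_ms, ie_eff=0.0):
    """Transmission rating R with all other E-Model inputs at defaults."""
    return 93.2 - delay_impairment(delay_ms) - ie_eff

print(round(r_factor(100), 1))             # ~90.8: short-delay, low-impairment call
print(round(r_factor(250, ie_eff=11), 1))  # ~68.2: long-delay path, low-rate codec
```

The knee at 177.3 ms is what makes delay above roughly 150-200 ms so costly: beyond it, every additional millisecond erodes R more than four times faster.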
Table C-1 shows the voice quality performance targets derived from known user experience of PSTN calls. These targets are what we should aim for to provide quality equivalent to the PSTN network. Targets are
presented for both the A-side and B-side listener. Note that local and
regional calls have no ECAN. ECANs are active in National/International
and mobile calls. It has been determined that a difference of 3R is not
noticeable by typical users and, therefore, packet networks could be
engineered within this margin in order to provide an equivalent
replacement technology. A difference of 3-7R might be noticeable but
most likely acceptable. Larger R degradations (greater than 7R) are more
likely to be noticeable.
Table C-1: Voice quality performance targets based upon known user experience. Note that PBXs have a better loss plan and echo control, resulting in a better R. Also, local and regional calls have no ECAN active, contrary to national and international calls.
[1] Should be engineered to meet E2E delay, 10ms is recommended - based upon Succession Voice Quality & Bearer Interworking Accreditation v3.3 (Blouin & Bruckheimer),
[2] sufficiently low that loss occurs only randomly, as packet loss concealment algorithms are not efficient on bursty losses
[3] Succession_UA-AAL-1_VQ_ECAN_Planning_Report (F. Blouin, M Armstrong, R. Britt)
[4] J. G. Gruber and L. Strawczynski, "Subjective effects of variable delay and speech clipping in dynamically managed voice systems," IEEE Transactions on Communications, vol. COM-33, pp. 801--808,
[5] Performance of Analogue Fax over IP (Brueckheimer)
[6] Based upon a 20,000 km international call, 100 ms was allocated for propagation delay
[7] 20ms includes 10ms for propagation and 10ms for message processing. This budget allows for 4-6 messages @ 20ms to complete a transaction
[8] IEEE INFOCOM 2002 1 Perceived Quality of Packet Audio under Bursty Losses Wenyu Jiang, Henning Schulzrinne
[9] Impact of Network Outages on Voice Quality (F Blouin. L.Thorpe)
[10] PMO: Present mode of operation, that is the existing infrastructure, most likely TDM. FMO: Future Mode of Operation, that is, the new packet/replacement solution
[11] assumes most of the core delay will be due to propagation delay - should be budgeted to accommodate international calls of 15,000-20,000 km
[12] depends on the signalling protocol - H.323, SIP…
[13] not specified in the standard, implementation based, default is set to 1.6 sec
[14] 10^-5 was obtained from the 10^-6 max BER in G.1010 and the packet size (20 ms)
Table C-2: PSTN wireline conversational voice, voice band data and call control
performance
1. “Context” refers to things like the order in which the test cases are presented in the experiment, the
range of quality between the worst and best test cases used in the experiment, and whether the
subjects are asked to do a task before making a rating. If an experiment is repeated exactly (with different subjects), similar scores will be obtained within a known margin of error. This is not the
case from one experiment to another. Consistency from test to test is found in the pattern of scores,
not in the absolute value of the scores. For example, the MOS-LQS for G.711 may be 4.1 in one study,
3.9 in another, and 4.3 in a third, but whatever the value obtained, we expect to see a higher score for
G.711 than G.729, and G.729 and G.726 (32 kb/s) to be about equal.
[Figure: Objective measurement of listening quality. The original signal and the output of the system under test are compared by a perceptual difference model whose result is interpreted by a cognitive model.]
2. Many MOS-LQOs have been defined; aside from PESQ, the best-known are PSQM (Perceptual Speech
Quality Measure), and PAMS (Perceptual Analysis Measurement System). As the standard, PESQ
should be used in preference to the older measures.
Figure C-7:The normal operating range of R, along with regions under the
curve shown with their interpretations for network planning.
Generally, relative comparisons of R are used to show changes
expected in a shift from an existing to a new network, or
differences between one proposed network and another. The
descriptions given here can be useful in the interpretation of
absolute values of R where that is necessary.
Appendix D
Additional Information about IPv6
Concepts covered
Routing in IPv6
Additional Details of Network Control in IPv6
Application Programming Interfaces in IPv6
Detailed Descriptions of Tunnelling Mechanisms
More on Interworking between IPv4 and IPv6
Introduction
This appendix extends the chapter on IPv6 technology in the main body of
the book, covering concepts and functionality that are generally not related
to the performance or design of networks for real-time applications. The
information is presented here because it is necessary to fully understand
how an IPv6-enabled network works.
Link-local FE80::/10
IPv6 introduced the concept of address scopes in order to avoid the ad hoc
mechanisms based on the Private Addressing Scheme (address 10.x.x.x
etc) [RFC1918] and using IPv4 Network Address Translators to
circumvent the shortage of IPv4 globally unique addresses.
Addresses with a particular scope are not valid outside that scope and
should not be propagated outside that scope. The link-local scope
(addresses only valid on the wire to which the interface is connected) and
global scope are clearly defined and have been fully accepted. However,
unicast site-local scope addresses have been extremely contentious for a
number of reasons, including the following:
the difficulties of defining the bounds of a site;
ensuring that any addresses which leak outside the site do not result
in ambiguous routing or loops; and
supporting the merging of sites.
There are a few changes in terminology for the parts of IPv6 addresses. In
IPv4, the address is logically split into a ‘network part’ and a ‘host part’.
Originally, the boundary was at a multiple of eight bits depending on the
class of the address (‘classful addressing’). Classless Inter-Domain
Routing (CIDR) modified this so the boundary could be at any bit position,
removing the concept of address classes.
‘Host part’ is really a misnomer because the address applies to an interface
rather than the whole host; this has been remedied in IPv6. All the
interfaces sharing the same network part make up an IPv4 ‘subnet’. In IPv4
networks, each interface can only have a single IP address. Consequently,
the subnet is usually identified with the physical link to which the
interfaces are connected – in some cases techniques are used to bridge
several physical links into a single virtual link and the subnet then spans the
whole virtual link.
The possibility that interfaces can have more than one IP address is a major
difference between IPv4 and IPv6. As a result, the identification between
subnets and links disappears from IPv6 and each link can support several
subnets.
In IPv6, each interface has an Interface Identifier (IID) replacing the ‘host
part’. The rest of an IPv6 address is the Subnet Identifier corresponding to
the IPv4 ‘network part’. As in IPv4, a contiguous set of bits starting from
the left-hand end of the address is known as an (Address) Prefix. The count
of significant bits in the Prefix is the Prefix Length. The same notation used
in IPv4 CIDR is used to specify prefixes.
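The split between Subnet Identifier and IID, and the shared CIDR prefix notation, can be illustrated with Python's standard ipaddress module; the address below is an arbitrary documentation-range example, not one from the text:

```python
import ipaddress

# A /64 prefix: 64 significant bits (the Subnet Identifier) followed by
# a 64-bit Interface Identifier (IID), written in the same CIDR notation
# that IPv4 uses.
iface = ipaddress.IPv6Interface("2001:db8:1234:5678:0211:22ff:fe33:4455/64")

print(iface.network)            # the subnet: 2001:db8:1234:5678::/64
print(iface.network.prefixlen)  # the prefix length: 64

# The low 64 bits of the address are the Interface Identifier.
iid = int(iface.ip) & ((1 << 64) - 1)
print(hex(iid))                 # 0x21122fffe334455
```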
Although the term ‘subnet’ is extensively used in IP networks, it is actually
not defined at all for IPv6 and is not very well defined in IPv4. The
interfaces belonging to an IPv4 subnet share a host part size and the value
of the network part. The logic for forwarding packets in IPv4 assumes that
addresses with the same network part are connected to the same link as the
sending interface and so can be sent directly without involving a router. A
subnet in IPv6 is a set of interfaces that share a Subnet Identifier. Typically,
they will all be attached to a single link but, as in IPv4, it is possible for
them to be distributed on several links. Forwarding of packets directed to
multilink subnets is more complicated than for single link subnets and they
should probably be avoided, if possible. However, they can be used for very
simple networks that want to avoid setting up any routing at all but, for
example, need to use several Ethernet segments.
An IPv6 node cannot determine whether the destination interface for a
packet is connected to the same link just by inspecting the Subnet
Identifier. Instead, the node normally relies on information from the routers
on the link to determine which Subnet Identifiers are ‘on link’ (see
“Neighbor Discovery and Stateless Auto-Configuration” for further
details).
Each interface has a link local address that is created and used as part of the
node startup process. The link local prefix can be thought of as identifying
a ‘link subnet’ that encompasses all the interfaces connected to the link –
other subnets derived from other sorts of addresses may cover only a subset
of those interfaces. The link local subnet identifier is reused on each link,
but this doesn’t cause problems because packets using these addresses are
never forwarded beyond the local link.
The original proposal for site-local addresses allocated a single prefix to be
used on all sites in a similar way to IPv4 ‘private addresses’, which are
extensively used in Enterprises with NATs. This proposal has been very
contentious. The same site-local addresses would have been used in many
sites, which makes it very difficult to merge addressing schemes when
companies are reorganized and can lead to routing ambiguities if routes to
these addresses leak out to the global routing system.
Site-locals have now been replaced by globally unique ‘local use’
addresses [I-D.ietf-ipv6-unique-local-addr]. This scheme allows for a set of
essentially unique site prefixes to be created either by acquiring one from a
central source or generating a random prefix locally using cryptographic
hashing techniques. Such prefixes have a very low probability of clashing
with any other local use prefix. The prefixes would not normally be used
outside the sites owning them or associated sites that agree to co-operate,
but would cause fewer problems if they did ‘escape’. They will
significantly reduce the problems of merging two sites, as it would be
extremely unlikely that the sites had the same local use prefix.
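As an illustration of the locally generated option, the following sketch derives a pseudo-random local-use prefix by hashing the current time with an interface's EUI-64, along the lines of the algorithm in [I-D.ietf-ipv6-unique-local-addr]; the exact field layout here is a simplification:

```python
import hashlib
import time

def generate_local_prefix(eui64: bytes) -> str:
    """Return a pseudo-random fdxx:xxxx:xxxx::/48 local-use site prefix."""
    assert len(eui64) == 8
    # Hash the current time together with an interface EUI-64 and keep
    # the least-significant 40 bits of the digest as the Global ID.
    key = int(time.time() * 2**32).to_bytes(8, "big") + eui64
    global_id = int.from_bytes(hashlib.sha1(key).digest()[-5:], "big")
    # 8-bit prefix 0xfd plus the 40-bit Global ID gives a 48-bit prefix.
    prefix = (0xFD << 40) | global_id
    return "{:04x}:{:04x}:{:04x}::/48".format(
        prefix >> 32, (prefix >> 16) & 0xFFFF, prefix & 0xFFFF)
```

Two sites generating prefixes this way have a vanishingly small chance of choosing the same one, which is what makes later mergers painless.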
The aggregatable global unicast addresses are usually constructed by
adding a 64 bit prefix to a 64 bit IID. Often the IID will be a globally
unique number that can be combined with any address prefix to make an
address. The IID can be derived from the MAC address of the interface,
manually configured or generated cryptographically. At present, the
address prefix would be delegated from a provider that would also provide
routing of IPv6 traffic to and from nodes using this address prefix. This
‘provider addressing’ (PA) is designed to support the policy of ‘strict
aggregation’ described in “Unicast Routing and Addressing” .
The IPv6 address architecture allows for two types of addresses:
Unique stable IPv6 addresses: Assigned though manual
configuration, a DHCP server or auto-configuration using the IID
derived from a MAC address.
Temporary transient IPv6 addresses: Assigned using a random
number for the IID.
Transient addresses can be generated cryptographically and altered from
time to time to address security and privacy concerns as described in
[RFC3041]. Cryptographically generated addresses can also be used to
help secure the process of neighbor discovery (see “Neighbor Discovery
and Stateless Auto-Configuration” ).
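A minimal sketch of how such a transient IID can be derived, in the spirit of [RFC3041]: hash a stored history value together with the stable IID and split the digest. The function name is hypothetical and details are simplified from the RFC:

```python
import hashlib

def next_temporary_iid(history: bytes, stable_iid: bytes) -> tuple:
    """Return (temporary IID, next history value), each 8 bytes."""
    digest = hashlib.md5(history + stable_iid).digest()
    iid = bytearray(digest[:8])
    iid[0] &= ~0x02  # clear the universal/local bit: locally administered
    # The other half of the digest seeds the next iteration.
    return bytes(iid), digest[8:]
```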
it would effectively have limited the number of top level Internet providers
to 8192.
The basis of ‘strict aggregation’ is that a network should acquire address
space for its network from a provider – Provider Addressing (PA). The
provider delegates authority for this part of the address space to the
customer and, in turn, will provide the default route for traffic to and from
the customer using these addresses. The addresses come from a larger
block delegated to the provider by a ‘larger’ provider who in turn will route
traffic to and from all of their customer providers. Address delegation is
repeated up to the point where the highest level providers get address space
from the regional Internet Routing Registries such as APNIC, ARIN*,
LACNIC and RIPE. These registries, in turn, have address space delegated
to them by ICANN*/IANA.
The essence of the Internet is global connectivity, so a provider needs to be
able to route traffic from customers to any other destination. Providers at
the top level generally do this by connecting to all the other top level
providers through Internet exchange points and private peering
connections. All the others use a default connection through their parent
provider to route traffic not addressed to customers of peer networks with
which they connect directly.
Providers can build as many connections as they find economically and
technically expedient between their networks and other providers at any
level in the address delegation tree to carry traffic between their customers.
If this scheme is correctly implemented, the number of routes that a
provider has to deal with will be limited by the number of providers that it
peers with, rather than the number of customers who want to advertise
individual routes, as now happens in IPv4. Strict aggregation seeks to
prevent traffic for third parties being carried across links between provider
peers: providers will normally filter incoming and outgoing traffic to
eliminate packets that are not going to or from their customers.
It is now left up to the registries to define policies on what size of address
block they will delegate to providers with particular sizes of customer base,
and to provide recommendations on how this space should be further
divided up by the customers.
A typical small Enterprise or home network might expect to get a ‘/48’
address prefix, whereas a large Enterprise or a medium-sized provider
might get a /32 or /35 address prefix. This gives a lot more scope to
network managers to produce creative addressing plans as compared with
the very restrictive allocations now being given out for IPv4. For example,
the ‘sparse addressing’ plan can be used; instead of packing allocations as
tightly as possible and with a minimal allowance for growth, as has usually
been done with limited IPv4 allocations, it may be desirable to allocate
each new subnet from the middle of the available spare space. This may
seem wasteful, but it will give maximum scope for growth without the
nuisance of having to renumber. Even with a /48 allocation an Enterprise could configure more than 65,000 sub-networks with /64 prefixes using 64-bit IIDs, and there will still be something like 1,500 addresses per square meter of the Earth’s surface.
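The arithmetic behind these allocation sizes is straightforward:

```python
# A site with a /48 prefix and 64-bit IIDs has 64 - 48 = 16 bits of
# subnet identifier to play with.
subnet_bits = 64 - 48
print(2 ** subnet_bits)   # 65536 possible /64 subnets within a /48

# A /32 provider allocation can in turn delegate 2**(48 - 32) /48 sites.
print(2 ** (48 - 32))     # 65536 /48 delegations within a /32
```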
Anycast
Anycast is a new facility in IPv6, intended to ease the implementation of
services that should be available from any node but may be implemented
on more than one server. In many cases, this is useful just because it is
more efficient and more scalable to access the ‘nearest’ server rather than
centralizing the service; but, it is also convenient if the information
delivered should be different depending on where the request is made from.
At the application level, this might be because the information relates to the
geographic or topological location from which the request is made. One
example might be the network information delivered by ‘two-faced’ DNS
implementations, which restrict the publicly accessible information to
protect privacy but provide full information to insiders.
An anycast address is indistinguishable from an ordinary unicast address—
the distinction lies in what the destination does. At present, anycast is only
implemented on routers. The router is responsible for knowing the correct
server corresponding to an anycast address and forwarding the message
accordingly.
One application that has been suggested for anycast is locating a DNS
server without having to have the address configured or supplied by DHCP.
Multihoming in IPv6
The strict aggregation rules for Provider Addressing provide a major
problem for IPv6 traffic management that has not yet been resolved at the
time of writing. Traffic using a destination address delegated from a
provider normally has to be routed through that provider. Similarly,
outgoing traffic using a delegated address as source address has to be
routed through the delegating provider. This makes it extremely difficult to
provide redundant connectivity for an IPv6 network through the
multihoming techniques developed for IPv4. Connections between peers in
the delegation tree do not get around this because they only handle traffic
between customers of the connected peers.
In IPv4, a ‘more specific’ (that is, longer) prefix using the address space
provided by the network’s main IP service provider or as part of a ‘provider
independent’ allocation could be advertised via the BGP routing protocol
through an agreeable alternative provider. In this case, if the main provider
suffered a breakdown and had to withdraw its main route, the BGP routing
system would automatically reroute traffic through the alternate provider.
The cost to the IP routing system is the large number of long prefixes that
need to be managed by routers in the core of the network. This is one of the
causes of the explosion in the number of routes processed by BGP in the
core of the network – now more than 100,000 and still growing.
IPv6 wishes to avoid this problem but must still provide a multi-homing
solution to meet customer expectations for resilience and robustness. A
number of solutions have been discussed at length in the IPv6 and IPv6
Multihoming Working Groups in IETF; but, as of yet, there is (in mid
2004) no real solution in sight. This is a major problem for the deployment
of production IPv6 networks. It is not clear what problems the eventual
solution will pose for real-time networks beyond the problems of having
Multicast in IPv6
IP Multicast has a much more fundamental role in IPv6 than it did in IPv4.
This is partly because the support protocols that are used, for example, to resolve the linkage between Layer 3 IP addresses and Layer 2 MAC or link addresses were not part of the IP protocol suite. Many of these
protocols, such as the Ethernet Address Resolution Protocol (ARP), relied
on link layer broadcast capabilities to determine the interface associated
with an IP address.
The functions of the various link layer specific support protocols have been
integrated into a uniform framework at the IP layer in IPv6 (see “ICMPv6
and IPv6 Network Configuration” ). Many of these functions rely on the
ability to send messages to groups of interfaces on a link according to their
roles (for example, to all nodes or to all routers) even before the sending
interface knows what other nodes are attached to the link.
As a result, IPv6 is heavily dependent on multicast packet delivery at least
at the level of one link. Well-known link scope multicast addresses are used
for a number of purposes during node startup.
IPv6 does not use broadcast mechanisms at the IP layer at all and the
broadcast address available on each subnet in IPv4 has no analog in IPv6.
In order to optimize the delivery of multicast, nodes have to inform routers
about the multicast groups to which they are listening (that is, the multicast
addresses for which they will accept packets). Routers then only need to
propagate multicast packets onto links where there is at least one listener.
The information needed is maintained by exchanges using the Multicast
Listener Discovery protocol (MLD v1 [RFC2710] or v2 [RFC3810]),
which replaces the IGMP used for the same purpose in IPv4.
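Two of the well-known multicast mappings used at link scope can be sketched in Python. The formats themselves (the solicited-node address built from the low 24 bits of a unicast address, and the 33:33 Ethernet MAC mapping built from the low 32 bits of a multicast address) are standard; the example addresses are arbitrary:

```python
import ipaddress

def solicited_node(addr: str) -> str:
    """Solicited-node multicast address: ff02::1:ff + low 24 bits."""
    low24 = int(ipaddress.IPv6Address(addr)) & 0xFFFFFF
    return str(ipaddress.IPv6Address((0xFF02 << 112) | (0x01FF << 24) | low24))

def multicast_mac(group: str) -> str:
    """Ethernet MAC for an IPv6 multicast group: 33:33 + low 32 bits."""
    low32 = int(ipaddress.IPv6Address(group)) & 0xFFFFFFFF
    return "33:33:" + ":".join(f"{(low32 >> s) & 0xFF:02x}" for s in (24, 16, 8, 0))

print(solicited_node("2001:db8::0211:22ff:fe33:4455"))  # ff02::1:ff33:4455
print(multicast_mac("ff02::1:ff33:4455"))               # 33:33:ff:33:44:55
```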
Multicast Routing
After a long period of development, the IETF has settled on a small number
of protocols for routing multicast IP packets. The latest generation of
multicast routing protocols is relatively independent of the underlying
addressing and unicast routing infrastructure, and the protocols are
specified for both IPv4 and IPv6.
Multicasting with link scope is needed for the correct operation of all IPv6
networks and does not need any dynamic routing protocols, but networks
that use multicast at larger scopes require one or more of the multicast
routing protocols. The multicast routing protocols are still mostly under
development. Currently, an IPv6 network might provide the following:
[Figure: IPv6 address auto-configuration flow. The node generates a link-local address for the interface from its Interface Identifier (IID) and runs Duplicate Address Detection (DAD). If the link-local address is not unique, the fallback is manual configuration if the IID did not come from a unique link layer address; otherwise the interface is disabled. Once the link-local address is unique, the node performs Router Discovery: it issues a Router Solicitation and receives Router Advertisement(s), then either uses the router information to generate an address from each advertised prefix or obtains address prefixes from DHCPv6 and generates an address from each, taking other configuration from DHCPv6 in either case. DAD is then run on each generated address; non-unique addresses are discarded and the unique addresses are assigned to the interface.]
48 bit MAC address of the interface as shown in Figure D-2, but there does
not have to be a relationship between the MAC address and the IID.
[Figure D-2: Construction of the IID from a 48-bit IEEE 802 (Ethernet) MAC address, showing the Individual(0)/Group(1) and Universal(0)/Local(1) bits of the MAC address and the inversion of the Universal/Local bit to give the Local(0)/Global(1) bit of the IID.]
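A sketch of the IID construction shown in Figure D-2, assuming the standard insertion of FF FE between the two halves of the MAC address; the example MAC is arbitrary:

```python
def mac_to_iid(mac: str) -> bytes:
    """Build a modified EUI-64 IID from a colon-separated 48-bit MAC."""
    octets = bytes(int(b, 16) for b in mac.split(":"))
    assert len(octets) == 6
    # Split the MAC in half and insert FF:FE in the middle.
    eui64 = octets[:3] + b"\xff\xfe" + octets[3:]
    # Invert the Universal/Local bit of the first octet.
    return bytes([eui64[0] ^ 0x02]) + eui64[1:]

iid = mac_to_iid("00:11:22:33:44:55")
print(iid.hex())   # 021122fffe334455
```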
‘A flag’: If set, indicates that this prefix can be used for stateless
(autonomous) address auto-configuration by nodes on the link.
‘L flag’: If set, indicates that addresses using this prefix are
‘on-link’ so that packets can be sent direct rather than through a
router (it is possible that some of the addresses belonging to a
prefix might be on-link and others off-link—for example,
addresses used for Mobile IPv6 nodes would not be on-link
unless the node was ‘at home’. Packets sent to a mobile node
from another node connected to its home link need to go to the
router to be redirected if the node is not at home. Consequently,
the values of the A and L flags are completely independent.)
Lifetime, during which an on-link prefix should be considered
valid when determining if an address is on-link
Lifetime, during which an address created from an
‘autonomous’ prefix should be considered a preferred address.
Once a host has received Router Advertisements, the stateless and stateful
auto-configuration processes diverge. If a Router Advertisement has the
Managed (M) flag set, the host can use another (stateful) means to obtain
addresses usable for communication beyond the local link (see “Stateful
Auto-Configuration and DHCPv6” ). Otherwise, the host expects to find
one or more prefixes in the Router Advertisement with the ‘A flag’ set that
it can use as subnet identifiers to build global or local use addresses by
adding its IID. These prefixes have to be the exact length of the subnet
identifier—no padding or overlap is possible. In most cases, both IID and
subnet identifier are 64 bits long, but other values are possible for some
address values (see “Address Types in IPv6” ).
This process is known as ‘Stateless Auto-configuration’ because there is no
need for the router that advertises a subnet identifier to maintain any state
about which interfaces are using the subnet. The Router Advertisements
may contain additional prefixes that are marked as ‘on-link’, but are not
intended to be used for auto-configuration. Destinations with addresses that
match on-link prefixes can be sent directly without needing to be sent to a
router as the next hop (see “Packet Transmission, Address Lifetimes and
Deprecation” ).
The host should check that the addresses it creates are unique by using
DAD again; but if the previous DAD checks made for the link local address
using the same IID were successful, the new addresses can be assumed to
be unique, classed as preferred addresses and used immediately for
communications.
A host can use a combination of stateful and stateless auto-configuration if
the network administrator finds this convenient. The most recent updates to
the auto-configuration standards stress the role of administrative policy in
determining what combination of mechanisms is used for each node.
might want to join; for example, many different wireless LAN networks on
an ad hoc basis.
The Secure Neighbor Discovery (SEND) process [I-D.ietf-send-cga],
[I-D.ietf-send-ndopt] generates an IID linked to the subnet identifier, a
random modifier and a locally generated public/private key pair through a
cryptographic hashing process. Additional fields, including a digital
signature based on the key pair, are added to the ICMPv6 neighbor
discovery messages. This signature allows recipients to verify that the IPv6
address was generated for the subnet identifier in use by the originating
node and that the link layer address supplied is correctly associated with
this IPv6 address. Conversely, hosts receiving router advertisements can
verify, on the basis of the configured certificate(s), that they come from a
trusted source. In combination with a suitable node authentication
mechanism, this allows the nodes on a network to establish that packets are
coming from trusted sources and are being routed by a trusted router.
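The flavour of this can be conveyed by a heavily simplified sketch: hash a random modifier, the subnet identifier and the public key into the IID, so that a recipient holding the same inputs can verify the binding. The real CGA procedure in [I-D.ietf-send-cga] adds a sec parameter, hash extension and collision counting, all omitted here, and the function names are hypothetical:

```python
import hashlib

def cga_iid(modifier: bytes, subnet_prefix: bytes, public_key: bytes) -> bytes:
    """Derive a 64-bit IID bound to a subnet prefix and a public key."""
    digest = hashlib.sha1(modifier + subnet_prefix + public_key).digest()
    iid = bytearray(digest[:8])
    iid[0] &= ~0x03  # clear the universal/local and individual/group bits
    return bytes(iid)

def verify(iid: bytes, modifier: bytes, subnet_prefix: bytes,
           public_key: bytes) -> bool:
    """A recipient recomputes the hash to check the address binding."""
    return iid == cga_iid(modifier, subnet_prefix, public_key)
```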
[Figure: Next-hop determination for packet transmission. If the destination address is in the Destination Cache, the next hop address is taken from it and checked in the Neighbor Cache: a reachable entry is used, a stale entry triggers Neighbor Unreachability Detection, and an unreachable next hop leads to selection of an alternative router, or to dropping the packet when no more routers remain. If the destination is absent from the Destination Cache, the Prefix List is consulted: for an on-link destination address, Address Resolution is performed on the destination address; for an off-link address, Router Selection chooses the first router (dropping the packet if the router list is empty), and the router address is then checked in the Neighbor Cache, with Address Resolution or Neighbor Unreachability Detection applied to it as needed.]
Packet Transmission
[Figure: Address states and lifetimes over time. An address passes through the Tentative, Preferred, Deprecated and Invalid states; the Preferred Lifetime covers the Preferred state, and the Valid Lifetime extends through the Deprecated state.]
State Description
Preferred: The address has been verified as unique. A node can send and receive unicast traffic to and from a preferred address. The Router Advertisement message specifies the period of time that an address can remain in this state.
Valid: A node can send and receive unicast traffic to and from a valid address. This state covers both the preferred and deprecated states. The Router Advertisement message specifies the period of time that an address can remain in this state. The valid lifetime must be greater than or equal to the preferred lifetime.
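The lifetime rules above can be modelled with a small hypothetical helper (ages and lifetimes in seconds, measured from when the address passed DAD):

```python
def address_state(age: float, preferred_lifetime: float,
                  valid_lifetime: float) -> str:
    """Classify an address that has already passed DAD."""
    assert preferred_lifetime <= valid_lifetime
    if age < preferred_lifetime:
        return "preferred"   # usable for new and existing communication
    if age < valid_lifetime:
        return "deprecated"  # existing sessions continue; avoid for new ones
    return "invalid"         # may no longer send or receive

print(address_state(100, 600, 1800))   # preferred
print(address_state(900, 600, 1800))   # deprecated
print(address_state(3600, 600, 1800))  # invalid
```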
If an address in the Neighbor Cache is not used for a time, the cache entry
will become ‘stale’ and the node will need to refresh the information by
repeating the Address Resolution exchange. Router addresses can also
become stale, but will normally be refreshed by routers multicasting
unsolicited Router Advertisements. If a host misses an unsolicited Router
Advertisement, it can also solicit the router explicitly.
If a link layer address changes (for example, because of a change of IID for
privacy reasons or a hardware changeover) a node can multicast an
unsolicited Neighbor Advertisement to notify other nodes on the link of the
change: nodes receiving the advertisement update their Neighbor Cache to
reflect the change.
Finally, Neighbor Unreachability Detection (NUD) tracks interface failures
and nodes that leave the network. The corresponding entries in Neighbor
Caches will be removed when they become stale and there is no response to
Neighbor Solicitations.
avoid the ad hoc schemes that some operating systems provided for IPv4
(for example the use of ioctl in BSD and Linux systems). Some further
extensions exist to support particular aspects of IPv6, such as Multicast
Listener Discovery v2 [RFC3678], and more are planned, such as
interfaces to support Mobile IPv6.
Security APIs
These APIs are not fully standardized, but most implementations tend to
follow the APIs developed for the KAME project, one of the leading
developers of IPv6 software for the open source community.
The operating system for a node that supports IPsec needs to provide a
secure database of Security Associations (SADB) that can be accessed by
the IP layer during packet transmission. The usage of IPsec for individual
communication sessions is constrained by an overall security policy set by
the system administrator. The policy can typically be set to implement
varying degrees of security ranging from mandating the use of IPsec for all
incoming and outgoing connections through to allowing individual
applications a choice of whether to use IPsec on outgoing packets and
accepting both secured and unsecured incoming packets.
Typically, the SADB is implemented alongside the IP protocol stack within
the kernel of the operating system for performance and security reasons.
The operating system then has to provide APIs and libraries which:
Allow the creation and manipulation of SA data structures (KAME
provides the ipsec_set_policy library)
Allow suitably privileged processes (such as a key management
protocol daemon) access to the SADB (KAME uses socket based
communication with a specialized PF_KEY protocol family to
exchange messages between user processes and the SADB)
[RFC2367]
Allow individual applications to set the security requirements for
each communication socket that it uses within the constraints of the
overall security policy (KAME provides additional options for
setsockopt and getsockopt that allow the application to
modify the default security policy).
More information about a typical implementation can be found in the
NetBSD implementation [NetBSD-IPsec] and the KAME project webpage
has more details of ongoing work [KAME].
IPv6 Manually Configured Tunnel
Advantages: Stable and secure links for regular communication. DNS with support for IPv6 not required. Connection to 6bone¹.
Disadvantages: Tunnel between two points only. Large management overhead. No independently managed NAT².

IPv6 over IPv4 GRE Tunnel
Advantages: Stable and secure links for regular communication. Well-known standard tunnel technique. Only tunnelling method that will allow IS-IS to work through tunnels.
Disadvantages: Tunnel between two points only. Management overhead. No independently managed NAT. Cannot use to connect to 6bone.
Types of Tunnels
There need not be any IPv6 routers on the site; isolated IPv6-capable hosts can interwork without needing routers.
Teredo Tunnels
Teredo is a proprietary specification designed by Microsoft and described
in [I-D.huitema-v6ops-teredo]. It has the distinction that it can operate
through and independently of NATs and firewalls because it uses UDP
encapsulation. However, it relies on a specialized server at one end of the
tunnel and needs to use a particular address format.
Silkroad Tunnels
Silkroad is an alternative solution to the Teredo proposal for tunnels that
have to traverse NATs and firewalls, which is described in
[I-D.liumin-v6ops-silkroad]. It also requires a server to assist in
determining what kind of NAT may be present on the tunnel path and
requires some modifications to access routers which terminate the Silkroad
tunnels. Unlike Teredo, Silkroad does not need specialized addresses.
documents both the dependencies and the protocol fixes that have been
provided or, in some cases, will be needed.
endpoint, and the tunnel endpoint decapsulation boxes with IPv4 capability
on one side and IPv6 capability on the other.
References
[I-D.bound-dstm-exp] Bound, J., “Dual Stack Transition Mechanism,”
draft-bound-dstm-exp-01 (work in progress), April 2004.
[I-D.huitema-v6ops-teredo] Huitema, C., “Teredo: Tunnelling IPv6 over
UDP through NATs,” draft-huitema-v6ops-teredo-02 (work in progress),
June 2004.
[I-D.ietf-bgmp-spec] Thaler, D., “Border Gateway Multicast Protocol
(BGMP): Protocol Specification,” draft-ietf-bgmp-spec-06 (work in
progress), January 2004.
[I-D.ietf-idmr-dvmrp-v3] Pusateri, T., “Distance Vector Multicast Routing
Protocol,” draft-ietf-idmr-dvmrp-v3-11 (work in progress), December
2003.
[I-D.ietf-ipv6-unique-local-addr] Hinden, R. and B. Haberman, “Unique
Local IPv6 Unicast Addresses,” draft-ietf-ipv6-unique-local-addr-05 (work
in progress), June 2004.
[I-D.ietf-isis-ipv6] Hopps, C., “Routing IPv6 with IS-IS,” draft-ietf-isis-
ipv6-05 (work in progress), January 2003.
[I-D.ietf-msec-arch] Hardjono, T. and B. Weis, “The Multicast Security
Architecture,” draft-ietf-msec-arch-05 (work in progress), January 2004.
[I-D.ietf-ngtrans-isatap] Templin, F., Gleeson, T., Talwar, M. and D.
Thaler, “Intra-Site Automatic Tunnel Addressing Protocol (ISATAP),”
draft-ietf-ngtrans-isatap-22 (work in progress), May 2004.
[I-D.ietf-ngtrans-mech-v2] Nordmark, E. and R. Gilligan, “Transition
Mechanisms for IPv6 Hosts and Routers,” draft-ietf-ngtrans-mech-v2-00
(work in progress), July 2002.
[I-D.ietf-pim-dm-new-v2] Adams, A., Nicholas, J. and W. Siadak,
“Protocol Independent Multicast - Dense Mode (PIM-DM): Protocol
Specification (Revised),” draft-ietf-pim-dm-new-v2-05 (work in progress),
June 2004.
[I-D.ietf-pim-sm-v2-new] Fenner, B., Handley, M., Holbrook, H. and I.
Kouvelas, “Protocol Independent Multicast - Sparse Mode (PIM-SM):
Protocol Specification (Revised),” draft-ietf-pim-sm-v2-new-09 (work in
progress), February 2004.
[I-D.ietf-send-cga] Aura, T., “Cryptographically Generated Addresses
(CGA),” draft-ietf-send-cga-06 (work in progress), April 2004.
[I-D.ietf-send-ndopt] Arkko, J., Kempf, J., Sommerfeld, B., Zill, B. and P.
Nikander, “SEcure Neighbor Discovery (SEND),” draft-ietf-send-ndopt-05
(work in progress), April 2004.
[I-D.ietf-ssm-arch] Holbrook, H. and B. Cain, “Source-Specific Multicast
for IP,” draft-ietf-ssm-arch-04 (work in progress), October 2003.
Appendix E
Virtual Private Networks: Extending
the Corporate Network
[Figure: Technologies discussed in this appendix: NAT, IPsec, MPLS and L2TP in relation to IP, IPv4 and IPv6.]
Introduction
This appendix covers Virtual Private Networks (VPNs), one of the group of technologies introduced into the original IP network to increase its capabilities.
In general terms, VPNs have relatively little interaction with use of the
network for real-time applications. As with standard IP networks, effective
Quality of Service capabilities will be needed to ensure that the traffic is
delivered efficiently in the face of network congestion and that applications deliver the quality of experience that users have come to expect from
traditional TDM networks. However, VPNs typically use the same type of
mechanisms to provide QoS capabilities and the protocols used for real-
time applications run unchanged across VPNs.
Virtual Private Networks are already widely deployed in today’s Internet.
They provide a means for Enterprises to extend their internal networks
across multiple sites using leased or public infrastructure, as well as
allowing ‘road warriors’ such as salesmen to link into the corporate
network without exposing their assets to public access. A number of
different technologies are used to implement VPNs, both at Layer 2 and
Layer 3. Layer 2 solutions create virtual 'private wires' between 'points of
presence' (PoPs); increasingly, the private wires are emulated by MPLS
paths rather than using actual Layer 2 transports in the network core.
Layer 3 technologies typically involve ‘tunnelling’ the private network
packets through the public infrastructure by encapsulating the whole
1. PSTN connections typically use the Point-to-Point Protocol (PPP) [RFC1661] to carry IP packets across
the telephone network, together with the Layer 2 Tunneling Protocol (L2TP) [RFC2661],
[I-D.ietf-l2tpext-l2tp-base] across the links between the PSTN exchange and the ISP's network. Since
PSTN connections are rapidly becoming obsolete, they are not discussed any further here.
The IPsec tunnel then behaves like an extra link connected to the corporate
home network. Packets to and from the Road Warrior’s virtual address are
routed through the tunnel, and the VPN gateway and the Extranet Client
encrypt and decrypt the traffic according to the parameters in the security
association so that it cannot be subverted or intercepted. The Road
Warrior’s computer has become a logical extension of the corporate
network and the Road Warrior can now do anything that normally could be
done on the computer in the office, although it might be a long walk to the
office printer.
The Extranet Client may also prevent traffic from entering or leaving the
Road Warrior’s computer other than through the IPsec tunnel by forbidding
the use of ‘split tunnels’. A split tunnel would potentially allow packets
exchanged between the address allocated to the Road Warrior’s computer
when it connects to the Internet and addresses outside the corporate
network to reach the Road Warrior’s computer directly from the Internet
without passing through the corporate firewall. This opens a security
loophole that could be exploited to attack the corporate network by routing
packets from the Internet directly into the Road Warrior’s end of the
Extranet tunnel. If split tunnels are not allowed, the Extranet Client restricts
communication with the Internet to packets that are carrying the IPsec
tunnel traffic.
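The ‘no split tunnel’ policy can be sketched as a simple interface filter: only the IPsec tunnel traffic itself may use the physical interface, and everything else is routed into the tunnel. This is an illustrative sketch, and the gateway address below is invented.

```python
import ipaddress

# Hypothetical corporate VPN gateway address, for illustration only.
VPN_GATEWAY = ipaddress.ip_address("203.0.113.10")

def packet_allowed_on_raw_interface(dst, is_ipsec):
    """With split tunnels forbidden, the only traffic allowed directly
    on the physical interface is the IPsec tunnel traffic itself."""
    return is_ipsec and ipaddress.ip_address(dst) == VPN_GATEWAY

def route_via_tunnel(dst):
    """Everything else, corporate destination or not, must go through
    the tunnel (and hence through the corporate firewall)."""
    return not packet_allowed_on_raw_interface(dst, is_ipsec=False)
```

A packet arriving from the Internet addressed straight to the Road Warrior's end of the tunnel would fail the first test and be discarded, which closes the loophole described above.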
In addition to using Extranet Clients, Road Warriors may also be able to
use a more restricted form of VPN to access a limited set of corporate
applications. Suitably adapted applications can be accessed securely
through a web browser using the Secure Socket Layer (SSL) [SSL3]. This
kind of SSL VPN has the advantage that enabled applications can be
accessed from almost any web browser on any computer with Internet
access rather than requiring a special application on a corporate laptop. The
downside is that only the adapted applications can be accessed. One very
useful example is the web access to e-mail, which is frequently offered by
ISPs and some corporate networks.
between the PoPs, as is frequently the case, a new site must establish a
connection with each existing site—an O(n²) problem. The problem is not
so acute if a ‘hub and spoke’ model can be adopted; therefore, this might be
appropriate where traffic mostly flows between a central office and a set of
branch offices—the small amount of inter-branch traffic can be handled by
routing it via the central site.
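The scaling difference is easy to quantify: a full mesh of n sites needs n(n−1)/2 connections, while a hub-and-spoke layout needs only n−1.

```python
def full_mesh_tunnels(n):
    """Every site pairs with every other site: n*(n-1)/2 tunnels."""
    return n * (n - 1) // 2

def hub_and_spoke_tunnels(n):
    """Each branch connects only to the central site: n-1 tunnels."""
    return n - 1

# Adding a 21st site to a 20-site VPN costs 20 new tunnels in a full
# mesh, but only 1 extra tunnel in a hub-and-spoke design.
```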
A wide-area IP network infrastructure, possibly with the addition of MPLS
technology, can be used as the basis for a number of different types of VPN
that offer various advantages over the virtual circuit VPN for linking
permanent PoPs into a single company Intranet or providing business-to-
business connectivity with partner companies in an Extranet.
Current VPN offerings exploiting an IP infrastructure can be divided into:
Layer 2 VPNs, which tunnel Layer 2 frames across ‘pseudo-wires’
between PoPs, and
Layer 3 VPNs, which route Layer 3 packets across a virtual IP
network overlaid on the physical IP infrastructure.
Each category can be further divided into solutions where customer
equipment at the edge of the customer’s network does the work needed to
establish the VPN and the management burden falls on the customer
(Customer Edge – CE-based solutions); and solutions where a service
provider offers a managed service to which the customer simply connects
(Provider Edge – PE-based solutions).
[Figure: a VPN interconnecting Sites A through E]
Layer 2 VPNs
Standardization work for Layer 2 VPNs using IP infrastructure is still in
progress at the time of writing (mid 2004). Two prestandard schemes
(named for their chief advocates) have been deployed. These schemes
allow service providers to offer Layer 2 VPN services over IP/MPLS
infrastructure. They use a common encapsulation scheme, defined in
[I-D.ietf-pwe3-arch] to embed ATM, frame relay, Ethernet and PPP/HDLC
frames into MPLS packets before sending them across the Label Switched
Path (LSP) that has been preestablished between the PoPs emulating a
'private wire'. The schemes differ in the way in which the addressing
information for the VPNs is signaled between PE sites.
‘Draft-Martini’ uses the MPLS Label Distribution Protocol (LDP)
to distribute Virtual Circuit (VC) labels between PE nodes. It
requires significant manual provisioning of both ends of each VC
and, hence, retains some of the disadvantages of the basic virtual
circuit VPN schemes. It is best suited to simple point-to-point
connections and small VPNs and is frequently described as a
pseudo-wire scheme. Efforts are under way at the IETF to reduce
the provisioning load, possibly by using BGP to perform
autodiscovery of the end-points while continuing to use LDP for
LSP setup.
‘Draft-Kompella’ uses a BGP session between the PE nodes to
distribute information about the CEs connected to the PE node, and
to allow autoconfiguration and provisioning of the LSPs between
the PEs.
These prestandard proposals are now being refined into standardized
offerings for edge-to-edge pseudo-wire services emulating the traditional
Virtual Private Wire Services (VPWS) and Virtual Private LAN Services
(VPLS), which emulate a bridged LAN extended over a wide area IP/
MPLS infrastructure. The proposals are described in [I-D.ietf-l2vpn-vpls-
ldp] and [I-D.ietf-l2vpn-vpls-bgp].
On traditional IP infrastructure without MPLS, a Layer 2 VPN can also be
constructed using the Layer 2 Tunnelling Protocol (now in its third version,
L2TPv3 [I-D.ietf-l2tpext-l2tp-base]). L2TP VPNs offer similar capabilities
to ‘Draft-Martini’ but can be constructed by customers using standard IP
connectivity to support tunnels between PE nodes without special services
from the provider.
All Layer 2 VPNs offer a number of advantages:
Tunnel-based VPNs are, from the customer’s point of view,
indistinguishable from ‘traditional’ Layer 2 VPNs using physical
connections or virtual circuits. Migration from one to the other
raises few issues.
The service provider does not participate in the customer’s Layer 3
routing, which, therefore, remains totally private to the customer.
The provider does not have to do anything special to keep
individual customers’ routes separated from each other and routes
in the Internet infrastructure; there is no need to manage per-VPN
routing tables in the PE nodes.
The customers can run whatever Layer 3 protocols they choose
across the Layer 2 VPN.
Layer 3 VPNs
If the VPN traffic is exclusively IP packets, the optimum solution may be a
Layer 3 VPN, especially if customer sites are connected to the service
provider with a variety of Layer 2 technologies. The IETF provides a
document [RFC2764] which sets out a framework for all the different kinds
of Layer 3 IP-based VPNs.
If the customer wishes to manage the VPN rather than buying a service
from the provider (CE-based solution), the CE nodes can be configured to
provide IP tunnels to the remote CEs across the provider IP network. At the
simplest level, these tunnels could be IP in IP tunnels where the IP packets
originating in the corporate network, possibly using private IP addresses,
are encapsulated with an outer IP header at the tunnel ingress CE, using
globally routable addresses and routed to the egress CE across the Internet.
Of course, this offers almost no security or privacy, and so most customer-
managed VPNs use IPsec tunnel gateways as their CEs. In this case, the
corporate IP packet is encrypted and authentication data added before
encapsulation with the outer IP header.
The ability to share physical links or Layer 2 virtual circuits between many
tunnels makes the provisioning and management of a customer-managed
Layer 3 VPN slightly simpler than for a traditional VPN; but, it is still an
O(n2) problem if the VPN sites have fully-meshed connectivity and
additional equipment may be needed to support the tunnel endpoints and
IPsec encapsulation.
network to carry traffic between PEs. Some proprietary solutions use ATM
VCs; but, increasingly, MPLS is being used to provide the core VCs.
All the Layer 3 VPN solutions are variants on a theme—the provider
network implements a virtual overlay network for each VPN linking all the
PEs with attached CEs in the VPN. At each PE in the VPN, the PE has a
master routing table for the physical provider network and a virtual routing
table for each VPN that uses the PE. The master routing table is built by the
provider’s IGP running on the physical network. The virtual routing tables
and associated forwarding tables (VRFs) are built in various different ways
depending on the VPN solution. The virtual router has (real) interfaces to
the attached CEs at the PE and (virtual) interfaces linking to the other PEs
via the virtual overlay network.
Advantages of using a Layer 3 VPN include:
The customer can attach to the VPN using any Layer 2 technology
supported by the provider, and the technology used need not be
uniform across all the attachments. Layer 2 VPNs can overcome
this limitation only at the cost of losing Layer 3 independence and
being able to transport (typically) only IP packets.
A Layer 3 VPN can often handle more CEs per VPN than a Layer 2
VPN. For Layer 2 VPNs, the number is limited by how many
circuits are supported by the Layer 2 technology on each link. For
example, frame relay using two octet DLCIs would only allow a CE
to interconnect at most about a thousand other CEs in a VPN.
Providers can offer routing services as a value-added service on a
Layer 3 VPN. This can be a considerable advantage for a customer
where the network managers have limited routing expertise. For a
Layer 2 VPN, each CE router has to exchange routing
information with all the other CE routers to which it is connected
by the VPN and building the routing scheme is entirely the
customer’s problem. For a provider provisioned Layer 3 VPN, each
CE router needs only a default route to the PE router—the provider
handles the routing between the PE routers in the connected PoPs.
Because the PE routers have visibility of the IP packets, the
provider can offer classification and CoS routing as value added
services.
Service providers can also provide multicast routing, forwarding
and packet replication in PE routers. In a Layer 2 VPN, multicast
issues have to be handled by the CE routers, which may have to
replicate packets resulting in duplication of traffic passing along the
access links between CE and PE. These access links are frequently
a bottleneck and using a Layer 3 VPN would allow best use to be
made of the available bandwidth.
To make a simple VPN, one RT is used for all the routes associated with
the VPN—each site where the VPN has a PoP also has a VRF for the VPN
in the PE router, and this VRF installs all the routes using the VPN’s RT.
Using multiple RTs for a single VPN allows more complicated structures
such as 'hub and spoke' arrangements. Routes advertised by spokes (for
example, branch offices) use one Route Target and routes advertised by
hubs (for example, main offices) use a different one. The VRFs associated
with hubs only import and install routes with the spoke Route Target
attribute and vice versa; consequently, each spoke site only needs tunnels to
the hubs rather than to every other spoke site. There is a great deal of flexibility
in the system, but considerable management effort is needed in the provider
network to maintain the Route Distinguishers and Route Targets.
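The hub-and-spoke Route Target arrangement amounts to a filter applied when routes are imported into a VRF. The sketch below models this with invented RT and prefix values:

```python
# Each route advertisement carries a Route Target (RT) attribute;
# each VRF imports only routes whose RT is on its import list.
HUB_RT, SPOKE_RT = "target:65000:1", "target:65000:2"

def build_vrf(import_rts):
    return {"imports": set(import_rts), "routes": []}

def install(vrf, prefix, rt):
    """Install a route into the VRF only if its RT is imported."""
    if rt in vrf["imports"]:
        vrf["routes"].append(prefix)

# Hubs import routes advertised by spokes, and vice versa.
hub_vrf = build_vrf([SPOKE_RT])
spoke_vrf = build_vrf([HUB_RT])

install(hub_vrf, "10.1.0.0/16", SPOKE_RT)    # branch route: installed at hub
install(spoke_vrf, "10.1.0.0/16", SPOKE_RT)  # ignored: spokes do not import spoke routes
install(spoke_vrf, "10.0.0.0/16", HUB_RT)    # hub route: installed at spoke
```

Because the spoke VRF never learns other spokes' routes, inter-branch traffic is necessarily forwarded via a hub, exactly as described above.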
In the data plane, packets arriving at a PE either:
come from a CE over an 'attachment circuit', or
come from another PE over an MPLS tunnel.
In either case, if the packet is a VPN packet, it will be associated with one
of the VRFs in the PE. The VRF for an attachment circuit is configured
into the PE—any packets arriving over the attachment circuit can be
forwarded by looking up the destination address in this VRF. Packets
coming from other PEs are following a route that was advertised from this
PE. Before the advertisement is sent out, the advertising PE creates the
VPN Route Label. This is a local MPLS label that can be used to identify
packets using the route to reach the advertising PE and associate them with
the correct VRF. Since this label is only interpreted by the PE that creates
it, it need not be different from VPN Route Labels created by other PEs.
Also, unlike a standard MPLS label, the VPN Route Label does not label
an MPLS path. The VPN Route Label is carried in the advertisement as an
extra BGP attribute and is recorded in the appropriate VRFs when the route
is installed.
When a PE has to forward a VPN data packet to another PE, it identifies
the route to use from the correct VRF and then adds an MPLS header to the
packet. Two labels will be pushed onto the label stack of the packet: first
the VPN Route Label for the route, and then the label of the label switched
path (MPLS tunnel) towards the destination PE. The packet is then
dispatched down the tunnel and is switched across the backbone to the
destination PE as with any other MPLS packet. The VPN Route Label
remains at the bottom of the label stack and is not inspected until the packet
reaches the destination PE.
The VPN Route Label identifies the VRF in which the destination IP
address should be looked up, and the packet can then be dispatched either
to a local attachment circuit or to a remote PE depending on the results of
the lookup.
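The two-label forwarding just described can be modeled as simple stack operations. This is a schematic sketch with invented label values, not a packet-format implementation:

```python
def forward_to_remote_pe(ip_packet, vpn_route_label, tunnel_label):
    """Ingress PE: push the VPN Route Label first (bottom of stack),
    then the LSP tunnel label on top."""
    return [tunnel_label, vpn_route_label, ip_packet]

def receive_at_egress_pe(labeled_packet, vrf_by_route_label):
    """Egress PE: the backbone has already switched on (and removed)
    the tunnel label; the VPN Route Label selects the VRF in which
    the inner destination address is looked up."""
    route_label, ip_packet = labeled_packet[0], labeled_packet[1]
    return vrf_by_route_label[route_label], ip_packet

pkt = forward_to_remote_pe("ip:10.1.2.3", vpn_route_label=42, tunnel_label=17)
# The backbone switches on label 17 and pops it before delivery:
vrf, inner = receive_at_egress_pe(pkt[1:], {42: "vrf-customerA"})
```

Note that label 42 is meaningful only to the PE that allocated it, which is why different PEs may reuse the same VPN Route Label values independently.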
of sites per VPN to the number of VCs supported by the Layer 2—typically
in the low thousands.
For the Virtual Router Layer 3 VPN scheme, a large number of tunnels
between the VRFs on each PE has to be constructed and the number of
MPLS labels available may become a limitation.
All of the architectures discussed here can reduce the number of tunnels
between PEs by multiplexing the traffic for all the VPNs with CEs
connected to both terminal PEs onto a single tunnel. Isolation of the traffic
is maintained by suitable encapsulation, but at some cost in additional
processing at each end and additional encapsulation overhead in packets.
RFC2547 VPNs provide this optimization as part of the basic architecture;
however, other schemes need to provide it outside the basic architecture of
the scheme. In the case of Layer 2 VPNs, carriers already have extensive
experience creating and managing the VCs needed and minimizing the
overhead in the backbone network. Arguably, the scheme used in RFC2547
slightly reduces the security of the isolation between VPNs because there is
no actual tunnel associated with the VPN Route Label; it is possible that
customer routes could leak out to the wider Internet. To ensure that VPN
isolation is not compromised, the backbone P and PE routers need to
ensure that inappropriately labeled packets are not accepted from outside
the backbone; otherwise, a malicious device could insert packets that
would be routed into a VPN although not originated by the VPN.
Management and configuration complexity, coupled with control traffic
and processing overheads, are perhaps the major issues limiting the
scalability of provider provisioned Layer 3 VPNs compared with Layer 2
VPNs. Each PE has to have resources to run software, build the VRF and
maintain state for each VPN with attached CEs. In all cases, the inter-PE
virtual links have to be created and attached to the correct VRFs. For VR
VPNs, the provider may be offering a choice of routing protocol support;
but, in any case, has to configure the protocol with the correct virtual links
to other PEs and co-operate with the VPN customer to set up the virtual
router correctly either by allowing the customer access to the configuration
with all the associated security concerns or performing the configuration on
the customer's behalf. For RFC2547 VPNs, the customer choices may be
more limited—the provider may require the CE to run BGP so that the
customer has to work out the routes to export at the CE and configure BGP
accordingly. If the CE-PE routing uses an IGP, the PE has to run an
instance of the IGP as well as BGP and the provider has to configure the
extra routing protocol instance and the information exchanges between the
IGP and BGP. RFC2547 VPNs may also require additional configuration
of BGP route attributes if the customer wishes to run a single partitioned
AS across sites.
The additional configuration and management overhead of Layer 3
provider provisioned VPNs is a significant barrier to large-scale deployment.
Summary
Virtual Private Networks are a class of extensions to IP networks. VPNs
are generally a means to provide a secure extension of a common corporate
environment to geographically separated sites using common public or
leased infrastructure (road warriors and intranets) or to provide a common
environment to partners in a business or industry (extranets). As a side
effect, VPNs can allow networks that use IPv4 private addresses to be
extended across the public Internet without risking address clashes with
the global address space or with other VPNs. A number of different
techniques were discussed, suitable either for connecting individual mobile
'road warriors' or fixed sites with multiple nodes. The possibilities for fixed
sites covered both customer-provisioned and provider-provisioned VPNs
using both Layer 2 and Layer 3 connectivity. The advantages and
disadvantages of the various techniques were discussed with some
emphasis on the Layer 3 PPVPN techniques, which include the Virtual
Router (VR) and MPLS-BGP (RFC2547) schemes. The use of MPLS as a
common infrastructure for implementing both Layer 2 and Layer 3
PPVPNs was noted.
References
[I-D.ietf-l2tpext-l2tp-base] Lau, J., Townsley, M. and I. Goyret, “Layer
Two Tunnelling Protocol (Version 3),” draft-ietf-l2tpext-l2tp-base-14
(work in progress), June 2004.
[I-D.ietf-l2vpn-vpls-bgp] Kompella, K., “Virtual Private LAN Service,”
draft-ietf-l2vpn-vpls-bgp-02 (work in progress), May 2004.
[I-D.ietf-l2vpn-vpls-ldp] Lasserre, M. and V. Kompella, “Virtual Private
LAN Services over MPLS,” draft-ietf-l2vpn-vpls-ldp-03 (work in
progress), April 2004.
[I-D.ietf-l3vpn-bgp-ipv6] Clercq, J., Ooms, D., Carugi, M. and F.
Faucheur, “BGP-MPLS VPN extension for IPv6 VPN,” draft-ietf-l3vpn-
bgp-ipv6-03 (work in progress), June 2004.
[I-D.ietf-l3vpn-gre-ip-2547] Rekhter, Y. and E. Rosen, “Use of PE-PE
GRE or IP in BGP/MPLS IP VPNs,” draft-ietf-l3vpn-gre-ip-2547-02
(work in progress), April 2004.
[I-D.ietf-l3vpn-ipsec-2547] Rosen, E., Clercq, J. and C. Sargor, “Use of
PE-PE IPsec in RFC 2547 VPNs,” draft-ietf-l3vpn-ipsec-2547-02 (work in
progress), March 2004.
[I-D.ietf-l3vpn-rfc2547bis] Rosen, E., “BGP/MPLS IP VPNs,” draft-ietf-
l3vpn-rfc2547bis-01 (work in progress), September 2003.
[I-D.ietf-l3vpn-vpn-vr] Knight, P., Ould-Brahim, H. and B. Gleeson,
“Network based IP VPN Architecture using Virtual Routers,” draft-ietf-
l3vpn-vpn-vr-02 (work in progress), April 2004.
[I-D.ietf-pwe3-arch] Bryant, S. and P. Pate, “PWE3 Architecture,” draft-
ietf-pwe3-arch-07 (work in progress), March 2004.
[RFC1661] Simpson, W., “The Point-to-Point Protocol (PPP),” STD 51,
IETF, July 1994.
[RFC2547] Rosen, E. and Y. Rekhter, “BGP/MPLS VPNs,” IETF, March
1999.
[RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, G. and
B. Palter, “Layer Two Tunnelling Protocol (L2TP),” IETF, August 1999.
[RFC2764] Gleeson, B., Heinanen, J., Lin, A., Armitage, G. and A. Malis,
“A Framework for IP Based Virtual Private Networks,” IETF, February
2000.
[SSL3] A. Frier, P. Karlton, and P. Kocher, The SSL 3.0 Protocol,
Netscape Communications Corp., Nov 18, 1996.
Appendix F
IP Multicast1
IP television head-end system
Figure F-1 illustrates a block diagram of the logical component subsystems
in the IP television head-end. There are three main components in the head-
end:
the media subsystem
the application server subsystem
the web server subsystem
These three components are shown in Figure F-1 with some references to
horizontal integrated dialogs between the systems. The focus is on the
media subsystem. The other modules allow for the correlation of user
accounts and billing. These systems also provide the maintenance of
subscribers and of the actual system itself.
1. Please refer to the Disclosure Notice given in the Section VI Introduction (page 498).
Sidebar: PIM-SSM
Originally, a PIM shortest-path tree (SPT) was established as a result of
a policy-based event within the PIM-SM rendezvous point (RP) for the
given multicast group. At that point, the edge PIM router executes a
shortest-path tree build, since it now knows the unicast IP address of
the sending source from the packets it has received via the RP
shared tree path. Once the SPT is built, the edge router initiates a prune
to the RP shared tree. Further logic dictates that if the unicast IP address
of the source were known a priori, there would be no need for the RP
shared tree phase of the tree build. A PIM router could immediately
reference its unicast routing table and perform a build to the SPT. This is
the essence of PIM-SSM.
In order to co-exist with a PIM-SM/IGMPv2 deployment, it became
necessary to establish an addressing range for the single-source mode of
behavior. The range selected was the 232/8 addressing range, or Class D
address range 232.0.0.0 through to 232.255.255.255. All IGMP requests
for channels within this range are to be source-specific. The RP is not
invoked and will ignore all activity in this address range. This means that
any request for an IP multicast group within the SSM range must be
accompanied by the unicast IP address of the sending source, that is, the
(S,G) format, where S is the unicast IP address of the sending source and G
is the multicast IP address of the channel. All non–source-specific
requests within the 232/8 range that use the (*,G) format will
be ignored by a PIM-SSM router.
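A router's SSM handling of joins can be sketched as a check on the group address and join type. This is a simplified illustration with invented example addresses:

```python
import ipaddress

SSM_RANGE = ipaddress.ip_network("232.0.0.0/8")

def accept_join(group, source=None):
    """Inside 232/8 an IGMPv3 (S,G) join is required; (*,G) joins
    there are ignored rather than handed to the RP."""
    group = ipaddress.ip_address(group)
    if group in SSM_RANGE:
        return source is not None   # source-specific only
    return True                     # outside SSM: (*,G) via the RP is fine

accept_join("232.128.192.10", source="198.51.100.7")  # SSM (S,G): accepted
accept_join("232.128.192.10")                         # (*,G) in 232/8: ignored
accept_join("239.1.1.1")                              # non-SSM group: RP handles it
```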
All of this means two things: first, the requesting client must support
IGMPv3; and second, the sending source must support source-
specific service advertisements or have a method for making the unicast
address known to the client. Also implied is that in order to use PIM-
SSM, the provider may very well have to re-address its existing IP
multicast deployments. It was correctly perceived that these
requirements and limitations would hinder the introduction of PIM-
SSM. As a consequence, a number of features were introduced to ease
the adoption of the technology.
2. Residual patterns come from the last channel that the subscriber was surfing that the switch has not yet
disconnected from the subscriber.
While Source-Specific Multicast (SSM) provides protection from
a rogue source hijacking the channel (because of its source-specific
nature), it does not provide content protection capabilities for the channels
being multicast. In theory, a non–paying individual could sniff out the
information required to join the group and then use a generic decoder to
actually view the content. Specific IGMP-based filters can efficiently
provide protection from this form of piracy.
Another aspect to consider is the need to direct certain multicast flows to
certain portions of the network. The reason for this may be demographic or,
in the case of a hybrid provider (a provider who has different access
networks within the subscriber base), to assure that higher-speed channels
do not get propagated out to portions of the network that cannot handle the
traffic load.
In Figure F-7, a simple hybrid provider network is shown. One leg of the
network supports DSL access speeds with the bandwidth limitations that
were discussed earlier in this article. The other portion of the network
provides direct ETTU. In this portion of the network, the channel speed can
effectively be doubled. This obviously provides an enhanced viewing
experience for this portion of the subscriber base, but it also adds the
complexity of assuring that the high-speed channels are not forwarded over
to the DSL access portion of the network. Multicast routing policies can
effectively deal with this scenario.
true with IP-based delivery of broadcast television. The paradigm that must
be met is prevention of access to content that the customer has not paid for,
not whether they choose to record it.
The issues of recording content are complex, both from a technical and a
legal perspective. The entertainment industry has been working on this
issue for quite some time with little headway. From a technical perspective,
watermarking content is a feasible approach, but that would involve the
support of the encoder and the decoder or TV. Furthermore, watermarking
content would prevent any recording of the content, not just the content that
someone intends to resell. It would, in essence, render VCRs useless if all
content were protected in this manner. This would create a consumer outcry
that the entertainment industry would not find beneficial to their cause.
Needless to say, the issues as they stand are best not addressed by any
network-based technology. It is more appropriate for them to be addressed
by content creation, transmission and receiving technologies as well as the
entertainment and legal communities.
Given this, the role that the network plays, as stated earlier, is to assure that
the viewing audience is the paying audience. This can effectively be
provided by IGMP-based filters with the join filters, which are
implemented at the edge ingress of the network. Many providers choose to
provide channel bundles to ease the administration of this aspect of the
service. By arranging the multicast addresses to correspond to these
bundles, it becomes easy to allow or deny access based on them.
As an example, the provider may have a premium offering where all of the
channels are grouped into a common representation. In the previous
example of channel speed grouping, the third octet could be used for this
purpose. By this means, all of the channels contained within that super-
group would be allowed or denied based on the policy. So if the premium
service were represented by a .192 value in the third octet range, the
provider would have two super-groups: one for high-speed channel access
232.128.192.x, and one for low-speed channel access, 232.192.192.x. As
shown earlier, the route policies to direct the different channel speed
groups are already in place. Now it becomes a simple matter of
transmitting an allow or deny message to the 232.128.192.x range in the
case of the high-speed offering. If an aggregate of several channels (which
is the bundle) were within the addressing range, then all the channels
would be allowed or denied based on the filter policy.
Additionally, the provider could be channel-specific as well—perhaps offer
a single channel as a limited-duration promotional offer for the whole
bundle. In this instance, the super-group 232.128.192.x would be denied,
while 232.128.192.10 might be allowed for a period of thirty days.
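The super-group filtering described above can be sketched as ordered prefix matching on the multicast group address, using the example values from the text. The rule format here is invented for illustration; real deployments express this as IGMP access filters on the edge router:

```python
import ipaddress

ALLOW, DENY = [], []   # specific allows are checked before broader denies

def add_rule(rules, spec):
    rules.append(ipaddress.ip_network(spec))

def allowed(channel):
    addr = ipaddress.ip_address(channel)
    for net in ALLOW:              # e.g. the 30-day promotional channel
        if addr in net:
            return True
    for net in DENY:               # e.g. the whole premium super-group
        if addr in net:
            return False
    return False                   # default: no access

# Deny the high-speed premium super-group 232.128.192.x ...
add_rule(DENY, "232.128.192.0/24")
# ... but allow the single promotional channel within it.
add_rule(ALLOW, "232.128.192.10/32")
```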
Once the browser has downloaded the metafile, it is handed off to the
viewer’s RealPlayer client. The RealPlayer client then reads the data in the
metafile and requests the presentation from the RealServer. In later
versions of the RealPlayer client, this is accomplished via an RTSP dialog.
Other methods are available: PNA is a streaming dialog used by
earlier client versions, and HTTP provides generic embedded access via port
80.
When the RealPlayer client requests a URL that begins with rtsp://, it
sends its request to the RealServer’s port 554. Requests to pnm:// indicate a
PNA request from an older client and are directed to port 7070, whereas
http:// requests are directed to port 80 or 8080 as appropriate and are the
less efficient (nonstreaming) method of delivery. Further details can be
found in the RealNetworks Administrator guide.
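The scheme-to-port dispatch just described can be summarized as a small lookup table (the URL below is hypothetical):

```python
# URL scheme -> (protocol, default server ports), per the text above.
SCHEME_PORTS = {
    "rtsp": ("RTSP", [554]),
    "pnm":  ("PNA",  [7070]),       # older clients
    "http": ("HTTP", [80, 8080]),   # nonstreaming fallback
}

def dispatch(url):
    scheme = url.split("://", 1)[0]
    return SCHEME_PORTS[scheme]

dispatch("rtsp://media.example.com/show.rm")   # ("RTSP", [554])
```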
As a result of the request, the server (or an intermediate proxy service) will
then serve out the streaming media. RealNetworks provides two methods
for serving streaming content: the first is RealNetworks’ proprietary Real
Data Transport (RDT); the second is the industry-standard RTP/
RTCP. Note that RTP delivery requires the G2 RTP-based RealServer
and client.
Because of the tight integration between the client and the server via the
ram file, the RealNetworks system is tightly licensed. Even intermediate
network proxy functions must comply with this model and, hence, are
covered by the licensing rights. RealNetworks provides licensed proxy
services by software called RealProxy* that can be used in the network
path to provide caching delivery features at the edge. Many CDN cache
products and solutions directly license the RealProxy service as an add-on
feature. This allows for the intermediate proxy of the ram file and any
corresponding contents for the event according to the rules defined on the
proxy agent.
port 1755 for media stream control and is somewhat analogous to RTSP.
The actual media ports for audio and video are UDP and are dynamically
created. MMS media can also be handled by an intermediate proxy service.
Appendix G
QoE Engineering
In this appendix, the focus will be on real-time data applications’
performance targets and their impairments. It should be pointed out that not
all data applications require real-time treatment; but, a subset does require
real-time or quasi real-time response times to achieve the desired
interactivity level. Applications such as gaming, telnet or remote login require
response times in the millisecond range, and careful attention to the
underlying network architecture and selection of QoS mechanisms.
[Figure: maximum queuing delay (ms, logarithmic scale 0.1–1000) against link utilization (10%–90%), showing the QoE target and margin; QoE dependencies and factors include buffer size, loss rate, number of flows/users, and link size]
Figure G-5: Rate limiting characteristics of policing and shaping, showing
the allowed burst, peak rate and average rate
Figure G-6 shows an example of a two-color policing implementation
using a token bucket architecture. The token bucket accumulates tokens at
the “Committed Rate” up to the burst level; once the bucket is full, further
tokens are discarded. When the incoming packet aggregate conforms to the
“Committed Information Rate” (CIR) with bursts in line with the
“Committed Burst Size” (CBS), packets are marked as in-
profile. Otherwise, when the burst size exceeds the “Excess Burst Size” (EBS)
limit, packets are marked as out-of-profile. After packet classification, in-
profile packets will be discarded only after all out-of-profile traffic has been
dropped (differentiated dropping probabilities). Excess traffic is tagged and
may be discarded under congestion.
[Figure G-6: Token bucket policer. Tokens (1 token = credit for 1 byte) arrive at the CIR (Committed Information Rate); an arriving packet with enough credits conforms, otherwise it exceeds. CBS = Committed Burst Size, EBS = Excess Burst Size]
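The conformance test in Figure G-6 can be sketched as follows. This is a minimal two-color policer that checks only the CIR/CBS stage, ignoring the EBS refinement for brevity:

```python
class TokenBucket:
    """Two-color policer: tokens accrue at the committed rate (CIR,
    bytes/sec) up to the committed burst size (CBS, bytes); a packet
    is in-profile if the bucket holds enough credit for its length."""

    def __init__(self, cir, cbs):
        self.cir, self.cbs = cir, cbs
        self.tokens = cbs          # start with a full bucket
        self.last = 0.0

    def police(self, now, length):
        # Accumulate tokens for the elapsed time, capped at CBS.
        self.tokens = min(self.cbs, self.tokens + (now - self.last) * self.cir)
        self.last = now
        if length <= self.tokens:
            self.tokens -= length
            return "in-profile"
        return "out-of-profile"    # tagged; dropped first under congestion

tb = TokenBucket(cir=1000, cbs=1500)   # 1000 B/s, one 1500-byte burst
tb.police(0.0, 1500)   # "in-profile": the bucket starts full
tb.police(0.1, 1500)   # "out-of-profile": only ~100 bytes of credit accrued
```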
Queue management
Queue management is the function that stores packets before
transmission on a link or interface. The simplest technique for queue
management is called “tail drop”. Tail drop is a passive queue
management technique whereby a maximum length (in packets) is set for
each queue; packets are accepted until that maximum is reached, and
subsequent arrivals are rejected (dropped) until the queue decreases
because a queued packet has been transmitted. The other class of queue
management is called Active Queue Management (AQM). The basic idea
behind active queue management schemes such as WRED/RED (random early
detection) is to detect incipient congestion early enough to convey
implicit congestion notification to the end-systems, allowing them to
reduce their transmission rates before queues in the network overflow
and packets are dropped. WRED/RED detects congestion by monitoring the
queue size and starts dropping packets randomly when a queue threshold
is reached. WRED/RED offers a proactive response to congestion:
preventing synchronization of TCP timeouts and restarts
providing early feedback on congestion in the network
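The RED drop decision can be sketched as follows (a minimal illustration, not a vendor implementation; the threshold values are arbitrary). Below a minimum threshold every packet is enqueued; above a maximum threshold every packet is dropped; in between, the drop probability ramps linearly up to a configured maximum.

```python
import random

def red_drop_probability(avg_qlen, min_th, max_th, max_p=0.1):
    """Classic RED drop probability as a function of average queue length."""
    if avg_qlen < min_th:
        return 0.0                       # no congestion: always enqueue
    if avg_qlen >= max_th:
        return 1.0                       # severe congestion: always drop
    # Linear ramp between the two thresholds, capped at max_p.
    return max_p * (avg_qlen - min_th) / (max_th - min_th)

def red_enqueue(avg_qlen, min_th=20, max_th=60, max_p=0.1):
    """Randomly decide whether an arriving packet is enqueued (True) or dropped."""
    return random.random() >= red_drop_probability(avg_qlen, min_th, max_th, max_p)
```

WRED extends this by keeping separate (min threshold, max threshold, max probability) triples per drop precedence or traffic class, so higher-precedence traffic is dropped later and less aggressively.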
[Figure G-7 diagram: dropping probability (0%–100%) versus queue size, with three regions — packet enqueued, packet randomly dropped, packet dropped]
[Figure G-8 plot: 90th percentile response time (0–8) versus number of users (0–2400), with a recommended QoE operating zone marked]
Figure G-8: WRED QoE performance (90th percentile response time) against
tail drop (best effort) as a function of the number of users. This
graph compares multiple WRED configurations (minimum threshold,
maximum threshold, drop rate) against tail drop. The traffic offered
load is identical in both the best-effort and WRED-enabled solutions;
hence both QoS mechanisms are compared under the same loading
conditions. The instability of WRED is reflected in response time
variation as load changes. No optimal setting works for a wide range
of operating conditions.
Appendix H
PPP Header Overview
The Multiclass Extensions to Multilink PPP provide for Service Classes
to be specified in the Multilink PPP header. Two PPP multiclass formats
are defined: the short sequence number format provides four classes of
service, and the long sequence number format provides sixteen. The PPP
class fields are circled in Figure H-1 for both the Long and Short
Sequence Number formats.
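The two formats can be illustrated with a small parser. This is a sketch based on the bit layout defined in RFC 2686 (the function names are ours, and the header is assumed to start at the fragment's first byte): the short sequence format carries a 2-bit class and a 12-bit sequence number, the long format a 4-bit class and a 24-bit sequence number.

```python
def parse_mc_mlppp_short(hdr: bytes):
    """Parse a multiclass multilink PPP short-sequence fragment header.

    Layout (2 bytes, per RFC 2686): B | E | 2-bit class | 12-bit sequence,
    giving four classes of service.
    """
    b = (hdr[0] >> 7) & 1                  # beginning-of-fragment bit
    e = (hdr[0] >> 6) & 1                  # end-of-fragment bit
    cls = (hdr[0] >> 4) & 0x3              # 2-bit class: 4 service classes
    seq = ((hdr[0] & 0x0F) << 8) | hdr[1]  # 12-bit sequence number
    return b, e, cls, seq

def parse_mc_mlppp_long(hdr: bytes):
    """Long-sequence format (4 bytes): B | E | 00 | 4-bit class | 24-bit
    sequence, giving sixteen classes of service."""
    b = (hdr[0] >> 7) & 1
    e = (hdr[0] >> 6) & 1
    cls = hdr[0] & 0x0F                    # 4-bit class: 16 service classes
    seq = (hdr[1] << 16) | (hdr[2] << 8) | hdr[3]
    return b, e, cls, seq
```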
Glossary
1xEV-DO 1.25 MHz Evolution, Data Only
1xEV-DV 1.25 MHz Evolution, Data & Voice
1xRTT Single carrier (1x) Radio Transmission Technology - third
generation wireless technology for CDMA, also called
CDMA2000
2.5G Enhanced Second Generation wireless technology - adds some
data networking functionality to 2G systems. An example is
GSM GPRS.
2G Second Generation wireless technology - 2G wireless uses
digital voice transmission across the radio channel.
3G Third Generation wireless technology - 3G uses redefined
channels to allow transport of digital voice and data services.
5-tuple Combination of values for fields of IP packet header used to
specify filters: IP source and destination addresses, protocol
number, source and destination transport identifiers (port
numbers)
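As an illustration (the type and function names are ours, not from the source), a 5-tuple filter can be modelled as an exact-match classifier with optional wildcards:

```python
from typing import NamedTuple, Optional

class FiveTuple(NamedTuple):
    """The five header fields that identify an IP flow. In a filter,
    None in any position acts as a wildcard."""
    src_ip: Optional[str]
    dst_ip: Optional[str]
    protocol: Optional[int]    # IP protocol number: 6 = TCP, 17 = UDP
    src_port: Optional[int]
    dst_port: Optional[int]

def matches(packet: FiveTuple, flt: FiveTuple) -> bool:
    """True if every non-wildcard filter field equals the packet field."""
    return all(f is None or f == p for p, f in zip(packet, flt))
```

For example, the filter FiveTuple(None, None, 6, None, 80) selects all TCP traffic destined to port 80, regardless of addresses or source port.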
6bone An experimental IPv6 network now being wound down
6over4 An obsolete tunnelling technique for carrying IPv6 across IPv4
6to4 An automatic tunnelling technique for carrying IPv6 across IPv4
802.1p A 3-bit field in the IEEE 802.1Q extension to the Ethernet header
used to identify classes of service over Ethernet
802.1Q IEEE standard that defines the operation of VLAN Bridges that
permit the definition, operation, and administration of Virtual
LAN topologies within a Bridged LAN infrastructure. It defines
four additional bytes to the Ethernet frame header providing a
VLAN ID and 802.1p user priority field used for QoS.
A6 Experimental DNS record used to return IPv6 addresses.
AAAA Quadruple A DNS record used to return IPv6 addresses
AAL ATM Adaptation Layer
ABR Available Bit Rate
Automatic Protection Switching (APS) Generally associated with OSI
Layer 1 (the SONET layer). The protection port is pre-planned; upon a
failure indication, the switch immediately activates the spare port and
switches traffic to it.
Autonomous System A collection of IP addresses/networks that is under
the control of a single entity. Each router within an autonomous system
contains a full copy of the routing table.
Autonomous System Boundary Router (ASBR) An ASBR is attached at the
edge of an OSPF Autonomous System and runs an interdomain routing
protocol such as BGP.
B-Frame A bidirectional differential frame which is one type of
compressed video frame. The B-Frame contains motion vectors
and the residual information needed to reconstruct parts of the
image uncovered by the displacement of a moving object.
Backhaul Route traffic “out of its way” to reach the destination. Done to
reach special equipment (such as a satellite ground station), to
reduce cost, or to avoid congested routes.
Back-to-back user agent Logical entity that receives a request and
processes it as a user agent server (UAS). To determine how the request
should be answered, it acts as a user agent client (UAC) and generates
requests. Unlike a proxy server, it maintains dialog state and must
participate in all requests sent on the dialogs it has established.
Since it is a concatenation of a UAC and UAS, no explicit definitions
are needed for its behavior.
Bandwidth (1) For analog signals, the difference between upper and lower
frequency limits; said of a signal, a channel, or a filter. (2) The
amount of data that can be put through a given channel in a given
time. The term is used to refer to either the maximum data rate
that a channel can carry or the minimum rate required for a
particular signal. This use of the term is derived from the
relationship between the frequency bandwidth of an analog
carrier and the maximum rate that the carrier can be modulated to
signal one bit of information. The broader the bandwidth, the
faster the maximum modulation rate, and so the more bits can be
sent per unit time.
BGMP Border Gateway Multicast (routing) Protocol
BGP Border Gateway Protocol
BGP/MPLS VPN Layer 3 VPN implemented using MPLS Label Switched Paths
(LSPs) to carry traffic between PoPs and an extension of BGP to
route traffic. Originally documented in RFC 2547.
Call Admission Control (CAC) A mechanism that ensures the network has
sufficient capacity to provide service to a user before admitting the
session. Admission criteria usually require that the user being
admitted will receive adequate performance and the added session will
not cause other users to experience quality degradation.
Call Routing Table A table maintained in the switch that provides an
ordered and conditional list of all possible next hop routings to reach
a given telephone number from that switch (the order is based on the
conditions).
Call Volume The integration of number of calls and the duration of each call.
Capex Capital Expenses
CBR Constant Bit Rate - an ATM service category
CBS Committed Burst Size - the size up to which packets will be
delivered while meeting the service class performance.
CCITT Former name of the ITU-T
CCS Centum Call Seconds (Hundred Call Seconds)
CDMA Code Division Multiple Access - used for digital cellular access.
CDMA splits each “bit” into a binary sequence of smaller units
called chips. A particular pattern of chips (the code) is assigned
to each user. The receiver uses the same code to extract its
intended signal; signals based on other codes appear to that
receiver as noise. CDMA is the basis of the US IS-95 2G system
as well as 3G UMTS and CDMA2000. Compare FDMA,
TDMA.
CDV Cell Delay Variation - in ATM, the variation in time-of-arrival of
cells at the receive end, analogous to IP packet delay variation
(jitter).
CDVT CDV Tolerance - in ATM, the upper limit on the Cell Delay
Variation. Specified for all ATM service categories.
CE Customer Edge
Central Office The term for the local telephone switch where customers’ lines
connect to the phone company. Also called the CO.
CEPT European Conference of Postal and Telecommunications
Administrations. The ECC (Electronic Communications
Committee) under the CEPT handles radio and
telecommunications matters.
CER Cell Error Ratio - in ATM, a measure of the ratio of cells with
errors to the total number of transmitted cells.
CES Circuit Emulator Service
Channel coding Error protection encoding for a signal transmitted over a wireless
or cellular radio link that is subject to Rayleigh fading and other
interference. Protection might include a checksum, forward error
correction, interleaving, and/or redundant information. Channel
coding is done on the bit stream output of the source codec
(compare source coding).
Chrominance The intensity of color in a television signal relative to a standard
color. Adding white reduces the color intensity.
CIDR Classless Inter-Domain Routing
CIR Committed Information Rate - the rate up to which packets will
be delivered while meeting the service class performance.
Class Selector PHB group A DiffServ PHB designed to support legacy
routers that only support the older form of IP QoS called IP
Precedence. The Class Selector PHB can support either eight priority
classes similar to IP Precedence or can be configured to inherit the
EF, AF and DF PHBs.
CLP Cell Loss Priority - in ATM, a traffic management parameter that
specifies whether cells may be discarded if the network is
congested.
CLR Cell Loss Ratio - in ATM, the ratio of cells that are lost
compared to the number of cells originally sent; a required
parameter for some ATM service categories.
CMR Cell Misinsertion Rate - in ATM, the ratio of cells received at an
endpoint that were not originally transmitted by a given source
compared to the total number of cells transmitted from the
source.
CNG, comfort noise generation A DSP device that replaces background
noise in a signal where the background noise has been removed by a
voice switch, the non-linear processor in an echo canceller, or by a
DTX feature. A simple CNG will fill in with white or filtered Gaussian
noise. More sophisticated CNG designs may try to model the noise. In
some cases, information from the sending end is used to reconstruct the
noise to better match any noise present in the speech.
Co-channel interference Radio frequency interference broadcast on, and
intended for, the same channel as is being received.
End Office A term for the local telephone switch where customer lines
connect to the phone company, also called the Central Office
(CO).
Endpoint In H.323, a terminal, Gateway, or MCU. An endpoint can call
and be called. It generates and/or terminates information streams.
Equal Cost Multipath Protects against link failure and is best used on
the links between high availability routers where load sharing and
quick recovery from a failure are required. ECMP allows a router
running OSPF to distribute traffic across multiple, equal-cost routed
paths.
Erlang A unit of voice traffic volume. One Erlang is a call volume
sufficient, if all segments were concatenated, to occupy one 64
kb/s trunk for one hour.
Erlang Tables Probability tables that yield the number of trunks required
between two switches to provide a specified level of call
blocking given a calling volume between those switches.
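Such tables are generated from the Erlang B formula. A sketch using the standard numerically stable recursion follows (function names are illustrative): B(0) = 1 and B(n) = A·B(n−1) / (n + A·B(n−1)), where A is the offered traffic in Erlangs and n the number of trunks.

```python
def erlang_b(traffic_erlangs: float, trunks: int) -> float:
    """Blocking probability from the Erlang B formula, computed with the
    recursion B(0) = 1; B(n) = A*B(n-1) / (n + A*B(n-1))."""
    b = 1.0
    for n in range(1, trunks + 1):
        b = traffic_erlangs * b / (n + traffic_erlangs * b)
    return b

def trunks_required(traffic_erlangs: float, blocking_target: float) -> int:
    """Smallest trunk count whose blocking probability meets the target."""
    n = 1
    while erlang_b(traffic_erlangs, n) > blocking_target:
        n += 1
    return n
```

For example, offering 10 Erlangs with a 1% blocking target requires 18 trunks, matching standard Erlang B tables.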
ESP Encrypted Security Payload
Ethernet Virtual Private LAN A logical “broadcast” domain, containing
traffic to certain group members at Layer 2.
Excess Information Rate (EIR) Sometimes known as the burstable
component; the amount of data accepted by the network but marked
Discard Eligible (DE). Note: sometimes EIR is referred to as Extended
Information Rate. In either case EIR = Be/T.
Expedited Forwarding PHB A DiffServ PHB best suited for low latency,
low loss, and low jitter real-time services.
Explicit Congestion Notification A two-bit field in the DiffServ field
used by routers to indicate to neighboring routers that they are
experiencing congestion.
Explicit Route The path taken by an LSP is explicitly specified. This
means that the route is established by a means other than normal IP
routing.
Fast Reroute Techniques used to repair LSP tunnels locally when a node or
link along the LSP path fails
FastStart A term related to H.323 setup procedures with an abbreviated
sequence that allows call setup and connection setup to occur in
one round trip.
FCH Fundamental Channel
FCOT Fiber Control Office Terminal - a generic term for a fiber
terminal, which can be configured as fully digital, fully analog, or
mixed.
FDMA Frequency Division Multiple Access - a wireless access
technique in which the wireless RF band is shared by dividing it
into narrower bands, each of which carries data for a separate
channel. Compare CDMA, TDMA.
FEC In MPLS, Forwarding Equivalence Class indicating Route +
Class of Service. In digital communications, Forward Error
Correction, a method of detecting and correcting data errors
without the need for retransmission.
FERF Far End Receive Failure - a signal to indicate to the transmit site
that a failure has occurred at the receive site.
Field In video, information representing one of the two scan patterns
that make up a frame in an interlaced television system. Two
fields constitute one frame.
File format A standard method of parsing digital data and any information
needed to read and display it.
FoIP Fax over IP
Foreign agent PDSN serving the local Base Station Controller where the user is
currently connected
FOTS Fiber Optic Transmission System, such as SONET or SDH.
FRAD Frame Relay Access Device
Frame Speech/audio: a segment of speech operated on by a frame-based
codec. Video: a single image from a series displayed sequentially
to simulate motion. Data: a segment of data to be parsed
according to a rule specified by the data format.
Frame roll An effect caused by the receiver and camera not being
synchronized; it has the appearance of sequential pictures moving
vertically through the screen.
FRF Frame Relay Forum, now the MPLS and Frame Relay Alliance
FrNNI Frame Relay Network to Network Interface
FrUNI Frame Relay User to Network Interface
FT1 Fractional T1 is nothing more than a T1 with only some of the
DS0s being used.
FTP File Transfer Protocol
Home agent In a 2G wireless system, the Packet Data Serving Node where the
user maintains a full time presence and has a gateway to other
networks
HTML HyperText Markup Language
HTTP HyperText Transfer Protocol
HTTPS Secure version of HTTP
Hue The position of a particular colour within the visible spectrum of
colours.
HyperText Transfer Protocol (HTTP) An application-level protocol with
the lightness and speed necessary for distributed, collaborative,
hypermedia information systems.
I Frame Intra Frame - one of the frame types used in MPEG which
contains complete information for one complete frame.
IANA Internet Assigned Numbers Authority
ICANN Internet Corporation for Assigned Names and Numbers
ICMP Internet Control Message Protocol
IETF Internet Engineering Task Force - a standards body governing
Internet operation
IGMP Internet Group Management Protocol - IGMP is the edge session
protocol for IP multicast. It is supported in both L3 (router side)
and L2 (client side). Router side implementations of IGMP work
in tandem with the L3 multicast routing protocol (DVMRP or
PIM). In many instances the IGMP router side process is part of
the multicast routing process meaning that IGMP does not have
to be enabled as a separate process. On the client side it is
embedded into the Operating System of the device. There are
three versions of IGMP. IGMPv1 and 2 are similar in both
protocol primitives and group representation. IGMPv2 mainly
introduces an explicit ‘leave’ message to enhance
the edge router performance profile. IGMPv3 uses newer
primitives and embeds group membership into the IGMP
message in such a way as to break traditional methods of IGMP
L2 snooping at the edge. This issue is being investigated.
IGP Interior Gateway Protocol
IID Interface IDentifier
iIS-IS Integrated IS-IS
IKE Internet Key Exchange
Proxy server An intermediary entity that acts as both a server and a client for
the purpose of making requests on behalf of other clients. A
proxy server primarily plays the role of routing, which means its
job is to ensure that a request is sent to another entity “closer” to
the targeted user. Proxies are also useful for enforcing policy (for
example, making sure a user is allowed to make a call). A proxy
interprets, and, if necessary, rewrites specific parts of a request
message before forwarding it.
PS Packet data Services
PSTN Public Switched Telephone Network
PTI Payload Type Indicator
PVC Permanent Virtual Circuit - point-to-point circuit that maintains
connection even when not in use
QAM Quadrature Amplitude Modulation - QAM is a relatively simple
technique for carrying digital information from the television
operator's broadcast center to the cable subscriber. This form of
modulation modifies the amplitude and phase of a signal to
transmit the MPEG2 transport stream. QAM is the preferred
modulation method for the cable provider companies because it
can achieve high transfer rates of up to 40 Mb/s.
QDU, Quantization Distortion Unit One QDU is the quantization
distortion associated with encoding into G.711 at 64 kb/s; the coding
impairment of other codecs is sometimes characterized by the number of
QDUs associated with the encoding/transcoding. The QDU concept assumes
an additive model, so that the quality impairment from successive
encodings can be estimated by adding together the QDUs for each
encoding. This model tends to break down when applied to speech
compression codecs and other non-linear devices.
QoE Quality Of Experience - the user's perception of service or
application quality.
QoS Quality of Service - a set of mechanisms and protocols intended
to ensure efficient use of the network resources.
QPSK Quadrature Phase Shift Keying - QPSK is more immune to noise
than QAM and consequently is typically used as the preferred
modulation for the satellite environment or on the return signaling
path for a CATV network. QPSK works on the principle of shifting the
digital signal so that it is out of phase with the incoming carrier
signal. QPSK improves the robustness of a network; however, this
modulation scheme has practical limits of around 10 Mb/s.
Signal Transfer Points (STP) This equipment forwards messages from one
signal switching point on to another signal switching point. Just like
a router in an IP network provides packet forwarding, the STP provides
message forwarding.
Signaling Point (SP) A signaling point is any node that originates or
terminates signaling messages.
SIIT Stateless IP/ICMP Translation Algorithm (used in NAT-PT)
Silence suppression Transmission of the voice path data only when
speech is present. On a voice channel, this can reduce the long-term
average data volume by about 40%. The benefit of silence suppression is
achieved in high capacity links, since the peak data rate for an
individual channel remains the same. The more channels we combine, the
less variability in the overall average data rate, allowing link
capacity to be trimmed close to the predicted average. Silence
suppression is also referred to as DTX or VAD. Implementations usually
include comfort noise generation (see CNG).
Silkroad IPv6 tunnelling mechanism for traversing NATs
SIP Session Initiation Protocol - a peer level call control protocol,
developed as an open standard by IETF, and is a direct
competitor of H.323. In contrast to H.323, it is based on Web
principles and has a simple, modular design that is easily
extensible beyond telephony applications. SIP is enjoying rapid
momentum in the industry at both the system and device level.
SIP-based call control has excellent potential for smart phone
applications, with some devices already appearing on the market.
SIP is also appropriate as the peer-level interface for call servers
and either standalone or decomposed gateways.
SIP-T SIP for Telephones is SIP with tunnelled ISUP messages
SLA Service Level Agreement
Slip An overflow (deletion) or underflow (repetition) of one frame of
a signal in a receiving buffer.
SNMP Simple Network Management Protocol
SOCKS Apparently not an acronym: a proxying technology extended to
provide IPv4-to-IPv6 translation
SOCKS64 The SOCKS IPv4-IPv6 translation proxy
SOHO Small Office – Home Office
T (time interval) The time interval used in calculating frame relay CIR and EIR
T1 Trunk Level 1 - a North American TDM transmission link
carrying 24 voice channels; equivalent to DS1.
T1X1 Subcommittee A committee within the ECSA that specifies SONET
optical interface rates and formats.
Tandem encoding Encoding and decoding of a signal two or more times through a
codec. Tandem encoding can cause significant degradation (see
asynchronous tandeming, synchronous tandeming).
TCLw Weighted Terminal Coupling Loss - a measure of the amount of
signal that a telephone end device allows to cross over from the
receive path to the send path. The weighting factor adjusts the
contribution of the various audio frequencies to the perceived
loudness.
TCP Transmission Control Protocol - defined by IETF RFC 793
TDM Time Division Multiplexing
TDMA Time Division Multiple Access - a wireless access technique in
which an RF frequency band is shared by dividing it into short
time slots, each of which carries data for a separate channel.
GSM and North American IS-54 are TDMA-based technologies.
Compare CDMA, FDMA.
TE Traffic Engineering - proactive traffic management. Two TE
protocols are available:
RSVP-TE: Resource Reservation Protocol TE
OSPF-TE: Open Shortest Path First TE
TELR Talker Echo Loudness Rating - a measure of the level of echo
present on an interactive voice call
Teredo IPv6 tunnelling mechanism for traversing NATs (Teredo is the
name of the shipworm – a marine organism that bores holes through
timber underwater and caused the demise of many wooden ships)
Terminal An H.323 Terminal is an endpoint on the network which
provides for real-time, two-way communications with another
H.323 terminal, Gateway, or Multipoint Control Unit. This
communication consists of control, indications, audio, moving
color video pictures, and/or data between the two terminals. A
terminal may provide speech only, speech and data, speech and
video, or speech, data and video.