
ATSC DTV System Video Compression Guidelines

Downloaded from Digital Engineering Library @ McGraw-Hill (www.digitalengineeringlibrary.com)
Copyright © 2004 The McGraw-Hill Companies. All rights reserved.
Any use is subject to the Terms of Use as given at the website.

Transmitting the quantizer matrices costs bits in the compressed data stream. If sent with
every picture in the 60 f/s progressive mode, the matrices consume 0.3 percent of the channel
bandwidth. This modest amount of overhead can be reduced by updating the quantizer matrix
less frequently, or only when the difference between the desired quantizer matrix and the
prevailing quantizer matrix becomes significant.
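The 0.3 percent figure can be reproduced with a rough calculation. The sketch below is illustrative only: it assumes two 8 × 8 matrices per picture (intra and non-intra) with 8-bit entries and a channel payload of about 19.39 Mbits/s, none of which is stated in the passage above.

```python
# Rough overhead estimate for sending quantizer matrices with every picture.
# Assumed values (not given in the text): two 8x8 matrices per picture,
# 8 bits per entry, and a channel payload of roughly 19.39 Mbits/s.

def matrix_overhead_percent(pictures_per_second=60,
                            matrices_per_picture=2,
                            entries_per_matrix=64,
                            bits_per_entry=8,
                            channel_bits_per_second=19.39e6):
    """Return the share of channel capacity used by the matrices, in percent."""
    bits_per_picture = matrices_per_picture * entries_per_matrix * bits_per_entry
    bits_per_second = bits_per_picture * pictures_per_second
    return 100.0 * bits_per_second / channel_bits_per_second


overhead = matrix_overhead_percent()
print(f"{overhead:.2f} percent of the channel")
```

Under these assumptions the result lands close to the quoted 0.3 percent, and halving the update rate halves the overhead.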
Sufficient compression cannot be achieved unless a large fraction of the DCT coefficients
are dropped and therefore not selected for transmission. The coefficients which are not selected
are assumed to have zero value in the decoder.
The dc coefficients are coded differently to take advantage of high spatial correlation.
For example, when intra-coded, the first dc coefficient in a slice is sent absolutely; the
following dc coefficients are sent as differences.
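A minimal sketch of this absolute-then-differential scheme follows. It treats a single run of dc values from one slice and one color component; the function names are hypothetical, and the predictor resets and variable-length codes used by a real encoder are left out.

```python
def encode_dc(dc_values):
    """Send the first dc value as-is and each later one as a difference."""
    if not dc_values:
        return []
    coded = [dc_values[0]]
    for previous, current in zip(dc_values, dc_values[1:]):
        coded.append(current - previous)
    return coded


def decode_dc(coded):
    """Rebuild the dc values by accumulating the differences."""
    values = []
    running = 0
    for index, item in enumerate(coded):
        running = item if index == 0 else running + item
        values.append(running)
    return values


slice_dc = [120, 122, 121, 121, 125]
print(encode_dc(slice_dc))
```

Because neighboring blocks tend to have similar dc values, the differences cluster near zero and are cheaper to entropy-code than the absolute values.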
11.4.2o Entropy Coding of Video Data
Quantization creates an efficient, discrete representation for the data to be transmitted [1]. Code
word assignment takes the quantized values and produces a digital bit stream for transmission.
Hypothetically, the quantized values could be simply represented using uniform- or fixed-length
code words. Under this approach, every quantized value would be represented with the same
number of bits. Greater efficiency, in terms of bit rate, can be achieved with entropy coding.
Entropy coding attempts to exploit the statistical properties of the signal to be encoded. A
signal, whether it is a pixel value or a transform coefficient, has a certain amount of
information, or entropy, based on the probability of the different possible values or events
occurring. For example, an event that occurs infrequently conveys much more new
information than one that occurs often. The fact that some events occur more frequently than
others can be used to reduce the average bit rate.
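The link between probability and information can be made concrete with Shannon's entropy formula, H = -sum(p * log2(p)). The sketch below uses an invented four-symbol distribution (an assumption for illustration, not data from the text) and compares the entropy with the cost of fixed-length code words.

```python
import math


def entropy_bits(probabilities):
    """Average information per event, in bits, for a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def fixed_length_bits(symbol_count):
    """Bits per symbol when every symbol gets a code word of the same length."""
    return math.ceil(math.log2(symbol_count))


# Hypothetical distribution: one value is far more common than the others.
skewed = [0.5, 0.25, 0.125, 0.125]
print(entropy_bits(skewed), fixed_length_bits(len(skewed)))
```

For a uniform distribution the two numbers coincide; the more skewed the distribution, the larger the potential saving from entropy coding.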
Huffman Coding
Huffman coding, which is utilized in the ATSC DTV video-compression system, is one of the
most common entropy-coding schemes [1]. In Huffman coding, a code book is generated that
can approach the minimum average description length (in bits) of events, given the probability
distribution of all the events. Events that are more likely to occur are assigned shorter-length
code words, and those less likely to occur are assigned longer-length code words.
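The code book construction can be sketched with the classic Huffman procedure of repeatedly merging the two least likely entries. This is an illustration under stated assumptions: the probability table is invented, the function names are hypothetical, and a practical MPEG-2 coder relies on predefined variable-length code tables rather than building a code for each stream.

```python
import heapq
from itertools import count


def build_huffman_code(probabilities):
    """Map each symbol to a bit string; likelier symbols get shorter strings."""
    tie_breaker = count()  # unique integers keep the heap from comparing dicts
    heap = [(p, next(tie_breaker), {symbol: ""})
            for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {symbol: "0" for symbol in probabilities}
    while len(heap) > 1:
        p_low, _, codes_low = heapq.heappop(heap)
        p_high, _, codes_high = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes_low.items()}
        merged.update({s: "1" + c for s, c in codes_high.items()})
        heapq.heappush(heap, (p_low + p_high, next(tie_breaker), merged))
    return heap[0][2]


def average_length(code, probabilities):
    """Expected code word length in bits per symbol."""
    return sum(probabilities[s] * len(bits) for s, bits in code.items())


# Hypothetical event probabilities, chosen only for illustration.
table = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = build_huffman_code(table)
print(code, average_length(code, table))
```

For this particular table the average code word length equals the entropy of the distribution, which is the best any symbol-by-symbol code can do.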
"un #engt$ Coding
In video compression, most of the transform coefficients frequently are quantized to zero [1].
There may be a few non-zero low-frequency coefficients and a sparse scattering of non-zero
high-frequency coefficients, but most of the coefficients typically have been quantized to zero.
To exploit this phenomenon, the 2-dimensional array of transform coefficients is reformatted
and prioritized into a 1-dimensional sequence through either a zigzag- or alternate-scanning
process. This results in most of the important non-zero coefficients (in terms of energy and visual
perception) being grouped together early in the sequence. They will be followed by long runs of
coefficients that are quantized to zero. These zero-value coefficients can be efficiently
represented through run length encoding.
In run length encoding, the number (run) of consecutive zero coefficients before a non-zero
coefficient is encoded, followed by the non-zero coefficient value. The run length and the
coefficient value can be entropy-coded, either separately or jointly. The scanning separates most of
the zero and the non-zero coefficients into groups, thereby enhancing the efficiency of the
run length encoding process. Also, a special end-of-block (EOB) marker is used to signify
when all of the remaining coefficients in the sequence are equal to zero. This approach can be
extremely efficient, yielding a significant degree of compression.

Figure 11.4.6 Scanning of coefficient blocks: (a) alternate scanning of coefficients, (b) zigzag
scanning of coefficients. (From [1]. Used with permission.)
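The run/value pairing and the EOB marker described above can be sketched as follows. This is a simplified illustration: the names are hypothetical, the EOB marker is modeled as a plain string, and the entropy coding of the resulting pairs is omitted.

```python
EOB = "EOB"  # stand-in for the end-of-block marker


def run_length_encode(scanned):
    """Turn a 1-D coefficient sequence into (run, value) pairs plus EOB."""
    pairs = []
    run = 0
    for value in scanned:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    pairs.append(EOB)  # any trailing zeros are implied by the marker
    return pairs


def run_length_decode(pairs, length=64):
    """Expand (run, value) pairs back into a sequence of the given length."""
    out = []
    for item in pairs:
        if item == EOB:
            break
        run, value = item
        out.extend([0] * run)
        out.append(value)
    out.extend([0] * (length - len(out)))
    return out


# A scanned 8 x 8 block with only four non-zero coefficients.
scanned = [12, 5, 0, 0, -3, 0, 1] + [0] * 57
print(run_length_encode(scanned))
```

Here 64 coefficients collapse to four pairs and one marker, which is why grouping the non-zero values early in the scan matters so much.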
In the alternate-/zigzag-scan technique, the array of 64 DCT coefficients is arranged in a
1-dimensional vector before run length/amplitude code word assignment. Two different
1-dimensional arrangements, or scan types, are allowed, generally referred to as zigzag scan (shown in
Figure 11.4.6a) and alternate scan (shown in Figure 11.4.6b). The scan type is specified before
coding each picture and is permitted to vary from picture to picture.
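The zigzag ordering can be generated by walking the anti-diagonals of the block and reversing direction on each one, as in the sketch below. The function names are hypothetical. The alternate scan follows a fixed table defined in the MPEG-2 standard and is not reproduced here.

```python
def zigzag_order(size=8):
    """Return (row, col) positions of a size x size block in zigzag order."""
    positions = [(r, c) for r in range(size) for c in range(size)]
    # Positions on the same anti-diagonal share r + c. Odd diagonals are
    # walked top to bottom (by row), even diagonals bottom to top (by column).
    return sorted(
        positions,
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def scan_block(block, order):
    """Flatten a 2-D block into a 1-D list following the given scan order."""
    return [block[r][c] for r, c in order]


order = zigzag_order()
print(order[:6])
```

The dc coefficient at (0, 0) always comes first, followed by the lowest horizontal and vertical frequencies, so the high-frequency zeros end up at the tail of the sequence.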
Channel Buffer
Whenever entropy coding is employed, the bit rate produced by the encoder is variable and is a
function of the video statistics [1]. Because the bit rate permitted by the transmission system is
less than the peak bit rate that may be produced by the variable-length coder, a channel buffer is
necessary at the decoder. This buffering system must be carefully designed. The buffer
controller must allow efficient allocation of bits to encode the video and also ensure that no
overflow or underflow occurs.
Buffer control typically involves a feedback mechanism to the compression algorithm
whereby the amplitude resolution (quantization) and/or spatial, temporal, and color resolution
may be varied in accordance with the instantaneous bit-rate requirements. If the bit rate
decreases significantly, a finer quantization can be performed to increase it.
Figure 11.4.7 ATSC DTV video system decoder functional block diagram. (From [1]. Used
with permission.)
Blocks shown: channel buffer, VLD, inverse quantizer, IDCT, adder, motion compensator, anchor
frame storage (2). Input: coded video bitstream. Output: decoded video.
Signal labels:
C Prediction error DCT coefficients in quantized form
D Quantized prediction error DCT coefficients in standard form
E Pixel-by-pixel prediction errors, degraded by quantization
F Reconstructed pixel values, degraded by quantization
G Motion compensated predicted pixel values
H Motion vectors

The ATSC DTV standard specifies a channel buffer size of 8 Mbits. The model buffer is
defined in the DTV video-coding system as a reference for manufacturers of both encoders and
decoders to ensure interoperability. To prevent overflow or underflow of the model buffer, an
encoder may maintain measures of buffer occupancy and scene complexity. When the encoder
needs to reduce the number of bits produced, it can do so by increasing the general value of the
quantizer scale, which will increase picture degradation. When it is able to produce more bits, it
can decrease the quantizer scale, thereby decreasing picture degradation.
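The feedback described above can be sketched as a simple threshold rule. Everything specific in this example is an assumption for illustration: the function name, the fullness thresholds, the step of one, and the 1 to 31 scale range. The fullness value stands for whatever occupancy measure the encoder keeps to judge that it is producing too many bits; practical rate controllers also weigh scene complexity.

```python
def adjust_quantizer_scale(scale, fullness, low=0.3, high=0.7,
                           min_scale=1, max_scale=31):
    """Nudge the quantizer scale based on a buffer occupancy measure.

    fullness is a fraction from 0.0 to 1.0; higher values mean the encoder
    is producing bits faster than the channel can carry them.
    """
    if fullness > high:
        scale += 1   # coarser quantization: fewer bits, more degradation
    elif fullness < low:
        scale -= 1   # finer quantization: more bits, less degradation
    return max(min_scale, min(max_scale, scale))


scale = 8
for fullness in (0.75, 0.80, 0.72, 0.50, 0.20):
    scale = adjust_quantizer_scale(scale, fullness)
    print(fullness, scale)
```

The dead band between the two thresholds keeps the scale from oscillating on every picture.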
The bit stream produced by the video encoder is passed to the transport encoding system for
multiplexing with audio and ancillary data, "lip-synch", and scheduling for delivery.
Decoder Block Diagram
As shown in Figure 11.4.7, the ATSC DTV video decoder contains elements that invert, or
undo, the processing performed in the encoder [1]. The incoming coded video bit stream is
placed in the channel buffer, and bits are removed by a variable length decoder (VLD).
The VLD reconstructs 8 × 8 arrays of quantized DCT coefficients by decoding run length/
amplitude codes and appropriately distributing the coefficients according to the scan type used.
These coefficients are dequantized and transformed by the IDCT to obtain pixel values or
prediction errors.
In the case of interframe prediction, the decoder uses the received motion vectors to perform
the same prediction operation that took place in the encoder. The prediction errors are summed
with the results of motion-compensated prediction to produce pixel values.
Frame Store for Decoded Pictures
As described previously, pixel values are decoded from the incoming bit stream [1]. In the case
of decoded anchor frames (I- or P-frames) these values must be stored in a frame buffer for
subsequent use as prediction references. When B-frames are used, the anchor frame
storage also allows for the necessary frame re-ordering for display.
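The re-ordering role of the anchor frame store can be sketched as follows. This is a simplified model with hypothetical names: each frame is a (type, label) tuple, B-frames are displayed as soon as they are decoded, and an anchor frame is held back until the next anchor arrives.

```python
def display_order(coded_frames):
    """Convert frames from transmission order to display order.

    coded_frames is a list of (frame_type, label) tuples where frame_type
    is "I", "P", or "B".
    """
    shown = []
    held_anchor = None
    for frame in coded_frames:
        if frame[0] == "B":
            shown.append(frame)            # B-frames go straight to display
        else:
            if held_anchor is not None:
                shown.append(held_anchor)  # release the previous anchor
            held_anchor = frame            # hold the new one as a reference
    if held_anchor is not None:
        shown.append(held_anchor)          # flush at end of stream
    return shown


# Labels give the intended display position of each frame.
coded = [("I", 0), ("P", 3), ("B", 1), ("B", 2), ("P", 6), ("B", 4), ("B", 5)]
print(display_order(coded))
```

The P-frame labeled 3 is transmitted before the B-frames labeled 1 and 2 because both B-frames need it as a reference, yet it is displayed after them.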
11.4.2p Concatenated Sequences
The MPEG-2 standard, which underlies the ATSC DTV standard, clearly specifies the behavior
of a compliant video decoder when processing a single video sequence [1]. A coded video
sequence commences with a sequence header, may contain some repeated sequence headers and
one or more coded pictures, and is terminated by an end-of-sequence code. A number of
parameters specified in the sequence header are required to remain constant throughout the duration of
the sequence. The sequence-level parameters include, but are not limited to:
• Horizontal and vertical resolution
• Frame rate
• Aspect ratio
• Chroma format
• Profile and level
• All-progressive indicator
• Video buffering verifier (VBV) size
• Maximum bit rate
It is a common requirement for coded bit streams to be spliced for editing, insertion of
commercial advertisements, and other purposes in the video production and distribution chain. If
one or more of the sequence-level parameters differ between the two bit streams to be spliced,
then an end-of-sequence code must be inserted to terminate the first bit stream, and a new
sequence header must exist at the start of the second bit stream. Thus, the situation of
concatenated video sequences arises.
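The splice rule above amounts to comparing the sequence-level parameters of the two streams. The sketch below models that comparison with a hypothetical record type; the field names and sample values are assumptions for illustration and do not mirror the actual sequence header syntax.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class SequenceParams:
    """Hypothetical container for the sequence-level parameters listed above."""
    width: int
    height: int
    frame_rate: float
    aspect_ratio: str
    chroma_format: str
    profile_level: str
    all_progressive: bool
    vbv_size_bits: int
    max_bit_rate: int


def differing_params(first, second):
    """Names of the parameters that change across the splice."""
    return [f.name for f in fields(first)
            if getattr(first, f.name) != getattr(second, f.name)]


def needs_sequence_boundary(first, second):
    """True if an end-of-sequence code and a new sequence header are needed."""
    return bool(differing_params(first, second))


# Illustrative values only.
program = SequenceParams(1920, 1080, 29.97, "16:9", "4:2:0", "MP@HL",
                         False, 8_000_000, 19_400_000)
commercial = replace(program, width=704, height=480, aspect_ratio="4:3")
print(differing_params(program, commercial))
```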
Although the MPEG-2 standard specifies the behavior of video decoders for the processing
of a single sequence, it does not place any requirements on the handling of concatenated
sequences. Specification of the decoding behavior in the former case is feasible because the
MPEG-2 standard places constraints on the construction and coding of individual sequences.
These constraints prohibit channel buffer overflow and coding of the same field parity for two
consecutive fields. The MPEG-2 standard does not prohibit these situations at the junction
between two coded sequences; likewise, it does not specify the behavior of decoders in this
case.
Although it is recommended, the DTV standard does not require the production of
well-constrained concatenated sequences. Well-constrained concatenated sequences are defined as
having the following characteristics:
• The extended decoder buffer never overflows and may underflow only in the case of
low-delay bit streams. Here, "extended decoder buffer" refers to the natural extension of the
MPEG-2 decoder buffer model to the case of continuous decoding of concatenated
sequences.
• When field parity is specified in two coded sequences that are concatenated, the parity of the
first field in the second sequence is opposite that of the last field in the first sequence.
• Whenever a progressive sequence is inserted between two interlaced sequences, the exact
number of progressive frames is such that the parity of the interlaced sequences is preserved
as if no concatenation had occurred.
11.4.2q Guidelines for Refreshing
Although the DTV standard does not require refreshing at less than the intraframe-coded
macroblock refresh rate (defined in IEC/ISO 13818-2), the following general guidelines are
recommended [1]:
• In a system that uses periodic transmission of I-frames for refreshing, the frequency of
occurrence of I-frames will determine the channel-change time performance of the system.
In this case, it is recommended that I-frames be sent at least once every 0.5 second for
acceptable channel-change performance. It also is recommended that sequence-layer
information be sent before every I-frame.
• To spatially localize errors resulting from transmission, intraframe-coded slices should
contain fewer macroblocks than the maximum number allowed by the standard. It is
recommended that there be four to eight slices in a horizontal row of intraframe-coded
macroblocks for the intraframe-coded slices in the I-frame refresh case, as well as for the
intraframe-coded regions in the progressive refresh case. Nonintraframe-coded slices
can be larger than intraframe-coded slices.
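The two guidelines translate into simple arithmetic. The sketch below is a rough aid with hypothetical function names; it assumes 16-pel-wide macroblocks and rounds conservatively.

```python
import math


def max_i_frame_spacing(frame_rate, max_interval_seconds=0.5):
    """Largest number of frames from one I-frame to the next within the interval."""
    return max(1, math.floor(frame_rate * max_interval_seconds))


def macroblocks_per_slice(picture_width, slices_per_row, macroblock_size=16):
    """Upper bound on macroblocks in each slice of a macroblock row."""
    macroblocks_per_row = math.ceil(picture_width / macroblock_size)
    return math.ceil(macroblocks_per_row / slices_per_row)


print(max_i_frame_spacing(29.97))       # frames between I-frames at 29.97 f/s
print(macroblocks_per_slice(1920, 4))   # four slices per row
print(macroblocks_per_slice(1920, 8))   # eight slices per row
```

A 1920-pel-wide picture has 120 macroblocks per row, so the four-to-eight-slice guideline corresponds to slices of 30 down to 15 macroblocks.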
11.4.3 References
1. ATSC: "Guide to the Use of the ATSC Digital Television Standard," Advanced Television
Systems Committee, Washington, D.C., doc. A/54, Oct. 4, 1995.
2. Cadzow, James A.: Discrete Time Systems, Prentice-Hall, Inc., Englewood Cliffs, N.J.,
1973.
3. "IEEE Standard Specifications for the Implementation of 8 × 8 Inverse Discrete Cosine
Transform," std. 1180-1990, Dec. 6, 1990.
Source: Standard Handbook of Video and Television Engineering
Chapter 11.5
Compression System Constraints and Performance Issues
Jerry C. Whitaker, Editor-in-Chief
11.5.1 Introduction
As with any new technology, compression systems have, generally speaking, experienced a few
growing pains. Certain requirements and tradeoffs, not anticipated during initial design of a
given device, system, or standard, inevitably must be resolved while the technology is being
implemented. Such issues include concatenation, encoding optimization, and bit stream
splicing.
11.5.2 Concatenation
The production of a video segment or program is a serial process: multiple modifications must
be made to the original material to yield a finished product. This serial process demands many
steps where compression and decompression could take place. Compression and decompression
within the same format is not normally considered concatenation. Rather, concatenation
involves changing the values of the data, forcing the compression technology to once again
compress the signal.
Compressing video is not, generally speaking, a completely lossless process; lossless bit-rate
reduction is practical only at the lowest compression ratios. It should be understood, however,
that lossless compression is possible; in fact, it is used for critical applications such as medical
imaging. Such systems, however, are relatively inefficient in terms of bit usage.
For common video applications, concatenation results in artifacts and coding problems when
different compression schemes are cascaded and/or when recompression is required. Multiple
generations of coding and decoding are practical, but not particularly desirable. In general, the
fewer generations, the better.
Using the same compression algorithm repeatedly (MPEG-2, for example) within a chain
(multiple generations, if you will) should not present problems, as long as the pictures are not
manipulated (which would force the signal to be recompressed). If, on the other hand, different
compression algorithms are cascaded, all bets are off. A detailed mathematical analysis will
reveal that such concatenation can result in artifacts ranging from insignificant and unnoticeable
to considerable and objectionable, depending on a number of variables, including the following:
• The types of compression systems used
• The compression ratios of the individual systems
• The order or sequence of the compression schemes
• The number of successive coding/decoding steps
• The input signals themselves
The last point merits some additional discussion. Artifacts from concatenation are most
likely during viewing of scenes that are difficult to code in the first place, such as those
containing rapid movement of objects or noisy signals. Many video engineers are familiar with
test tapes containing scenes that are intended specifically to point out the weaknesses of a given
compression scheme or a particular implementation of that scheme. To the extent that such
scenes represent real-world conditions, these "compression-killer" images represent a real
threat to picture quality when subjected to concatenation of systems.
11.5.3 The Video Encoding Process
The function of any video compression device or system is to provide for efficient storage
and/or transmission of information from one location or device to another. The encoding
process, naturally, is the beginning point of this chain. Like any chain, video encoding
represents not just a single link but many interconnected and interdependent links. The
bottom line in video and audio encoding is to ensure that the compressed signal or data stream
represents the information required for recording and/or transmission, and only that information.
If there is additional information of any nature remaining in the data stream, it will take bits to
store and/or transmit, which will result in fewer bits being available for the required data.
Surplus information is irrelevant because the intended recipient(s) do not require it and can
make no use of it.
11.5.3a Encoding Tools
In the migration to digital video technologies, the encoding process has taken on a new and
pivotal role. Like any technical advance, however, encoding presents both challenges and rewards.
The challenge involves assimilating new tools and new skills. The quality of the final
compressed video is dependent upon the compression system used to perform the encoding, the
tools provided by the system, and the skill of the person operating the system.
Beyond the automated procedures of encoding lies an interactive process that can
considerably enhance the finished video output. These "human-assisted" procedures can make the
difference between high-quality images and mediocre ones, and the difference between the
efficient use of media and wasted bandwidth.
The goal of intelligent encoding is to minimize the impact of encoding artifacts, rendering
them inconspicuous or even invisible. Success is in the eye of the viewer and, thus, the process
involves many subjective visual and aesthetic judgments. It is reasonable to conclude, then, that
automatic encoding can go only so far. It cannot substitute for the trained eye of the video
professional.
In this sense, human-assisted encoding is analogous to the telecine process. In the telecine
application, a skilled professional, the colorist, uses techniques such as color correction,
filtering, and noise reduction to ensure that the video version of a motion picture or other
film-based material is as true to the original as technically possible. This work requires a
combination of technical expertise and video artistry. Like the telecine, human-assisted
encoding is an iterative process. The operator (compressionist, if you will) sets the encoding
parameters, views the impact of the settings on the scene, and then further modifies the
parameters of the encoder until the desired visual result is achieved for the scene or segment.
11.5.3b Signal Conditioning
Correctly used, signal conditioning can provide a remarkable increase in coding efficiency and
ultimate picture quality. Encoding equipment available today incorporates many different types
of filters targeted at different types of artifacts. The benefits of appropriate conditioning are
twofold:
• Because the artifacts are unwanted, there is a clear advantage in avoiding the allocation of
bits to transmit them.
• Because the artifacts do not "belong," they generally violate the rules or assumptions of the
compression system. For this reason, artifacts do not compress well and use a
disproportionately high number of bits to transmit and/or store.
Filtering prior to encoding can be used to selectively screen out image information that
might otherwise result in unwanted artifacts. Spatial filtering applies within a particular frame,
and can be used to screen out higher frequencies, removing fine texture noise and softening
sharp edges. The resulting picture may have a softer appearance, but this is often preferable to
a blocking or ringing artifact. Similarly, temporal (recursive) filtering, applied from frame to
frame, can be employed to remove temporal noise caused, for example, by grain-based
irregularities in film.
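A first-order recursive filter is one simple form the temporal filtering mentioned above could take. The sketch below is an assumption-laden illustration: frames are modeled as flat lists of pixel values, the function name and the blending weight alpha are hypothetical, and production noise reducers are typically motion-adaptive to avoid smearing moving objects.

```python
def temporal_recursive_filter(frames, alpha=0.5):
    """Blend each frame with the previous filtered frame.

    Implements y[n] = alpha * x[n] + (1 - alpha) * y[n - 1] per pixel.
    Smaller alpha values give stronger smoothing over time.
    """
    filtered = []
    previous = None
    for frame in frames:
        if previous is None:
            current = [float(p) for p in frame]   # first frame passes through
        else:
            current = [alpha * p + (1.0 - alpha) * q
                       for p, q in zip(frame, previous)]
        filtered.append(current)
        previous = current
    return filtered


# Two pixels of a static scene with grain-like noise in the middle frame.
noisy = [[100, 100], [110, 90], [100, 100]]
print(temporal_recursive_filter(noisy))
```

The noise excursion of 10 levels in the middle frame is halved in the output, at the cost of a small residue carried into the following frame.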
Color correction can be used in much the same manner as filtering. Color correction can
smooth out uneven areas of color, reducing the amount of data the compression algorithm will
have to contend with, thus eliminating artifacts. Likewise, adjustments in contrast and
brightness can serve to mask artifacts, achieving some image quality enhancements without
noticeably altering the video content.
For decades, the phrase garbage-in, garbage-out has been the watchword of the data
processing industry. If the input to some process is flawed, the output will invariably be
flawed. Such is the case for video compression. Unless proper attention is paid to the entire
encoding process, degradations will occur. In general, consider the following encoding
checklist:
• Begin with the best. If the source material is film, use the highest-quality print or negative
available. If the source is video, use the highest quality, fewest-generation tape.
• Clean up the source material before attempting to compress it. Perform whatever noise
reduction, color correction, scratch removal, and other artifact-elimination steps that are
possible before attempting to send the signal to the encoder. There are some defects that the
encoding process may hide. Noise, scratches, and color errors are not among them. Encoding
will only make them worse.
• Decide on an aspect ratio conversion strategy (if necessary). Keep in mind that once
information is discarded, it cannot be reclaimed.
• Treat the encoding process like a high-quality telecine transfer. Start with the default
compression settings and adjust as needed to achieve the desired result. Document the settings
with an encoding decision list (EDL) so that the choices made can be reproduced at a later
date, if necessary.
The encoding process is much more of an artistic exercise than it is a technical one. In the area
of video encoding, there is no substitute for training and experience.
11.5.3c SMPTE RP 202
Among the tools developed to optimize the coding process is SMPTE Recommended Practice
202. Equipment conforming to this practice will minimize artifacts in multiple generations of
encoding and decoding by optimizing macroblock alignment [3]. As MPEG-2 becomes
pervasive in emission, contribution, and distribution of video content, multiple compression and
decompression (codec) cycles will be required. Concatenation of codecs may be needed for
production, post-production, transcoding, or format conversion. Any time video transitions to or
from the coefficient domain of MPEG-2 are performed, care must be exercised in alignment of
the video, both horizontally and vertically, as it is coded from the raster format or decoded and
placed in the raster format.
The first problem is shifting the video horizontally and vertically. Over multiple
compression and decompression cycles, this could substantially distort the image. Less
obvious, but just as important, is the need for macroblock alignment to reduce artifacts between
encoders and decoders from various equipment vendors. If concatenated encoders do not share
common macroblock boundaries, then additional quantization noise, motion estimation errors,
and poor mode decisions may result. Likewise, encoding decisions that may be carried
through the production and post-production process with recoding data present will rely upon
macroblock alignment. Decoders must also exercise caution in placement of the active video in
the scanning format so that the downstream encoder does not receive an offset image.
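The shared-boundary condition can be expressed as a check that the picture offset between two coding generations is a whole number of macroblocks. The sketch below assumes 16 × 16 macroblocks and uses hypothetical function names; it does not reproduce the specific coding ranges of RP 202, and interlaced material may need the vertical check applied per field.

```python
MACROBLOCK_SIZE = 16  # pels and lines per macroblock (assumed)


def is_macroblock_aligned(h_offset, v_offset, size=MACROBLOCK_SIZE):
    """True if an image shifted by these offsets keeps the same macroblock grid."""
    return h_offset % size == 0 and v_offset % size == 0


def realignment_shift(offset, size=MACROBLOCK_SIZE):
    """Smallest shift (possibly negative) that puts offset back on the grid."""
    remainder = offset % size
    return -remainder if remainder <= size // 2 else size - remainder


print(is_macroblock_aligned(0, 32))   # grid preserved
print(is_macroblock_aligned(3, 0))    # three-pel horizontal offset
print(realignment_shift(3), realignment_shift(13))
```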
With these issues in mind, RP 202 specifies the spatial alignment for MPEG-2 video
encoders and decoders. Both standard definition and high-definition video formats for
production, distribution, and emission systems are addressed. Table 11.5.1 gives the
recommended coding ranges for MPEG-2 encoders and decoders.
11.5.4 MPEG Bit Stream Splicing
In the typical editing environment, audio and video segments are spliced into or onto existing
material. In the uncompressed domain, this is a simple procedure [1]. It is relatively easy to
synchronize two or more video/audio streams. Vertical intervals occur regularly, allowing switches
to be performed as required (Figure 11.5.1). Digital audio is similar to video in this regard, and
analog audio is even easier because it requires no synchronization whatsoever. However, in the
compressed domain of an MPEG bit stream, several factors must be considered. Among them
are:
• The varying number of bits per frame
• The use of motion prediction
• The fact that frames may not be sent in the order in which they will be displayed
Table 11.5.1 Recommended MPEG-2 Coding Ranges for Various Video Formats (After [3].)

Format  Resolution       Coded    Coded Lines                        MPEG-2 Profile
        (Pels × Lines)   Pels     Field 1    Field 2     Frame       and Level
480I    720 × 480        0-719    23-262     286-525                 MP@ML
480P    720 × 480        0-719                           46-525      MP@HL
512I    720 × 512        0-719    7-262      270-525                 422P@ML
512P    720 × 512        0-719                           14-525      422P@HL
576I    720 × 576        0-719    23-310     336-623                 MP@ML
608I    720 × 608        0-719    7-310      320-623                 422P@ML
720P    1280 × 720       0-1279                          26-745      MP@HL
720P    1280 × 720       0-1279                          26-745      422P@HL
1080I   1920 × 1088*     0-1919   21-560     584-1123                MP@HL
1080I   1920 × 1088*     0-1919   21-560     584-1123                422P@HL
1080P   1920 × 1088*     0-1919                          42-1121     MP@HL
1080P   1920 × 1088*     0-1919                          42-1121     422P@HL

* The active image only occupies the first 1080 lines.

I- or P-frames that are displayed after B-frames need to be sent before the B-frames so that the
B-frames can be properly assembled. (See Figure 11.5.2.)
Because the number of bits per frame varies, it is virtually impossible to synchronize two
MPEG bit streams. However, bit streams can be loaded into RAM, and memory pointers then
can be manipulated. If two bit streams were loaded into RAM, the pointer used to read the data
could be shifted such that after one stream is output, it is followed immediately by a section of
the other bit stream. This process is much like edits performed by many nonlinear desktop
editing systems, except that in the editing systems the data is not read from RAM, but from a hard
drive using pointers that are essentially lists of frames and their locations.
Many nonlinear editors use JPEG compression, which is comparable to an MPEG bit stream
composed entirely of I-frames. Jumping from the end of one I-frame to the beginning of another
is relatively simple. Bit streams made up of all I-frames are fine for editing, but inefficient
when it comes to storage or transmission. Bit streams that are far more efficient to store and
transport make considerable use of P- and B-frames, but these elements complicate the editing
process.
Considerable work has been done to develop tools within the MPEG structure that will allow
for accurate bit stream splicing in the compressed domain. One tool thought to be needed was
an encoder that would mark potential splice points within the stream. One requirement of a
splice point would be that the first frame after the splice be an I-frame. Among other things, an
I-frame would ensure that no previous frames were needed for proper decoding. A second
requirement would be that the last frame before a splice point be an I-frame or a P-frame,
guaranteeing that all the needed B-frames could be decoded.
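The two frame-type requirements can be checked mechanically. The sketch below is a simplified illustration with a hypothetical function name: each stream is reduced to a list of frame types, and buffer state and time stamps are ignored.

```python
def is_candidate_splice_point(outgoing_types, incoming_types):
    """Check the frame-type conditions at a proposed splice.

    outgoing_types: frame types of the first stream up to the splice.
    incoming_types: frame types of the second stream from the splice onward.
    """
    if not outgoing_types or not incoming_types:
        return False
    last_before = outgoing_types[-1]
    first_after = incoming_types[0]
    # The outgoing stream must end on an anchor frame and the incoming
    # stream must start with a frame that needs no earlier reference.
    return last_before in ("I", "P") and first_after == "I"


print(is_candidate_splice_point(["I", "B", "B", "P"], ["I", "B", "B", "P"]))
print(is_candidate_splice_point(["I", "B", "B", "P", "B"], ["I", "B", "B", "P"]))
print(is_candidate_splice_point(["I", "B", "B", "P"], ["P", "B", "B", "I"]))
```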
Another requirement of a splice point involves the state of the buffer in the decoder. This
state could be anything from nearly full but emptying out to nearly empty but filling up.
Decoder buffer fullness is a dynamic parameter of the encoding process, and as long as the
splicing is within a particular bit stream, it is not likely to cause a problem (Figure 11.5.3).
However, splicing one bit stream onto another could cause the receiver's buffer to underflow or
overflow. This could occur if the stream to be switched from leaves the buffer fairly empty, and
the stream to be switched into assumes a nearly full buffer and expects to empty it shortly,
resulting in a buffer underflow, as illustrated in Figure 11.5.4. Flushing the decoder buffer is one
way to deal with the problem, but this probably would result in display disruption on the
viewer's screen. Another splicing method is to constrain splice points so that they occur only
when the decoder buffer is 50 percent full. However, this could make potential splice points few
and far between.

Figure 11.5.1 Splicing procedure for uncompressed video. (After [2].)
Figure 11.5.2 Splice point considerations for the MPEG bit stream. (After [1].)
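The underflow scenario described above can be reproduced with a toy model of the decoder buffer. All numbers and names in this sketch are assumptions for illustration: the channel delivers a fixed number of bits per frame period, and each coded picture is removed from the buffer in one step at its decode time.

```python
def simulate_decoder_buffer(frame_sizes, bits_per_frame_period,
                            start_bits, buffer_size):
    """Track decoder buffer occupancy picture by picture.

    Returns ("ok", None), ("underflow", index), or ("overflow", index).
    """
    occupancy = start_bits
    for index, frame_bits in enumerate(frame_sizes):
        if frame_bits > occupancy:
            return ("underflow", index)    # picture not fully received in time
        occupancy -= frame_bits            # decoder removes one coded picture
        occupancy += bits_per_frame_period # channel keeps delivering data
        if occupancy > buffer_size:
            return ("overflow", index)
    return ("ok", None)


# Incoming stream starts with a large I-frame and expects a nearly full buffer.
incoming = [3_000_000, 400_000, 400_000, 400_000]
print(simulate_decoder_buffer(incoming, 640_000, 7_000_000, 8_000_000))
print(simulate_decoder_buffer(incoming, 640_000, 1_000_000, 8_000_000))
```

The same incoming stream decodes cleanly when the buffer starts nearly full but underflows on its first picture when the outgoing stream has left the buffer nearly empty.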
Meeting the previously mentioned requirements would go a long way toward solving the
splicing problem, but it would not be a complete solution. Within the data stream are variables
such as time stamps that must be updated to prevent problems at the decoder. Additional
data-stream processing is required to update these variables properly.
Figure 11.5.3 Typical operational loading of the buffer. (After [2].)
Up to this point, only video splicing has been discussed. Within an MPEG program or transport stream, the audio and video are sent as separate packets. Because the packets are sent serially in a single stream, audio packets end up being sent before or after the video packets with which they are associated. To resynchronize the audio and video, each packet has a presentation time stamp (PTS) that allows the decoder to present the various audio and video packets in a synchronized manner. Both the audio and video signal paths include buffers. However, because of the amount of calculation required to reassemble the video, the video buffer is much larger than the audio buffer. To some extent, the larger the buffer, the larger the signal delay. Because of the additional delay in the video buffer, a bit-stream splice that contains "old" audio and video before the splice and "new" video and audio after the splice probably will be presented to the viewer as two separate splices. The first splice will affect the audio and, because of the longer buffer, the second splice will affect the video. This process is illustrated in Figure 11.5.5.
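The two-splice effect described above comes down to simple arithmetic. The following is a minimal sketch, assuming fixed and purely hypothetical buffer delays for the audio and video paths (the function names and the example figures are illustrative, not taken from any standard); it computes when a single splice in the multiplexed stream reaches the viewer on each path.

```python
def presented_splice_times(splice_arrival_s, video_delay_s, audio_delay_s):
    """Return (audio_time, video_time) at which one splice reaches the viewer.

    splice_arrival_s: time the splice point arrives at the decoder input.
    video_delay_s, audio_delay_s: assumed (hypothetical) buffer delays of
    the video and audio signal paths, in seconds.
    """
    audio_time = splice_arrival_s + audio_delay_s
    video_time = splice_arrival_s + video_delay_s
    return audio_time, video_time


def splice_separation_s(video_delay_s, audio_delay_s):
    """Time between the audible splice and the visible splice."""
    return video_delay_s - audio_delay_s
```

With an assumed video delay of 0.5 s and an audio delay of 0.1 s, a splice arriving at t = 10 s would be heard before it is seen, with the two events about 0.4 s apart.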
11.5.4a Splice Flags
One solution to the MPEG bit stream splicing problem is the insertion of splice flags [2]. These flags are inserted during encoding at defined points of buffer occupancy. As illustrated in Figure 11.5.6, the flags identify allowable switch points in the MPEG bit stream. The benefits of this approach include:
• No artifacts from the switching process
• Continuity of video and audio at the receiver display
• No unpredictable behavior of the video decoder
In addition to conventional cuts editing, splice flag-based operation allows for switching between progressive and interlaced images and from one picture resolution to another. The drawback of this approach is that the splice points must first be identified at the origination center. This requirement limits the usefulness of the process somewhat. However, for most network-to-affiliate feed operations, the required splice/insertion points are well known and clearly defined.
Figure 11.5.4 MPEG decoder buffer state issues for bit stream switching. (After [1].)
Figure 11.5.5 Relative timing of video and audio splice points and the end result. (After [1].)
11.5.4b SMPTE 312M
In response to the problems posed by the bit stream splicing issue, the SMPTE examined possible solutions. The result of this work was SMPTE 312M, which defines constraints on the encoding of and syntax for MPEG-2 transport streams such that they may be spliced without modifying the PES (packetized elementary stream) payload [4]. Generic MPEG-2 transport streams that do not comply with the constraints in the standard may require more sophisticated techniques for splicing.
The constraints specified are applied individually to programs within transport streams, a program being defined as a collection of video, audio, and data streams that share a common timebase. The presence of a video component is not assumed. The standard enables splicing of programs within a multiprogram transport stream either simultaneously or independently. Splice points in different programs may be presentation-time-coincident, but do not have to be. The standard also may be used with single-program transport streams.
SMPTE 312M specifies constraints for both seamless and nonseamless splice points. Seamless splice points must adhere to all the stated constraints; nonseamless splice points adhere to a simplified subset of constraints.
Figure 11.5.6 The use of splice flags to facilitate the switching of MPEG bit streams. (After [2].)
In addition to constraints for creating spliceable bit streams, the standard specifies the technique for carrying notification of upcoming splice points in the transport stream. A splice information table is defined for notifying downstream devices of splice events, such as a network break or return from a network break. The splice information table that pertains to a given program is carried in a separate PID (packet identifier) stream referred to by the program map table (PMT). In this way, splice event notification can pass through transport stream remultiplexers without the need for special processing.
Buffer Issues
As addressed previously, the splicing of MPEG bit streams requires careful management of buffer fullness. When MPEG bit streams are encoded, there is an inherent buffer occupancy at every point in time [4]. The buffer fullness corresponds to a delay, the amount of time that a byte spends in the buffer. When splicing two separately encoded bit streams, the delay at the splice point usually will not match. This mismatch in delay can cause the buffer to overflow or underflow. The seamless splicing method requires that the MPEG encoder match the delay at splicing points to a given value. The nonseamless method does not require the encoder to match the delay. Instead, the splicing device is responsible for matching the delay of the new material and the old material as well as it can. In some cases, this will result in a controlled decoder buffer underflow. This underflow can be masked in the decoder by holding the last frame of the outgoing video and muting the audio until the first access unit of the new stream has been decoded. In the worst case, this underflow may last for a few frames.
Both splicing methods may cause an underflow of the audio buffer, and consequently a gap in the presentation of audio at the receiver. The perceived quality of the splice in both cases benefits from audio decoders that can handle such a gap in audio data gracefully.
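The link between buffer fullness and delay, and the mismatch that arises at a splice, can be sketched in a few lines. This is a minimal illustration that assumes a constant channel bit rate, so that delay is simply occupancy divided by rate; the function names and the tolerance parameter are hypothetical and are not part of SMPTE 312M.

```python
def buffer_delay_s(occupancy_bits, bit_rate_bps):
    """Delay represented by a buffer occupancy at a constant bit rate."""
    return occupancy_bits / bit_rate_bps


def splice_risk(old_delay_s, new_delay_s, tolerance_s=0.0):
    """Classify the delay mismatch at a splice point.

    old_delay_s: delay the outgoing stream leaves in the decoder buffer.
    new_delay_s: delay the incoming stream was encoded to expect.
    If the incoming stream expects more buffered data than is present,
    the decoder risks underflow; if it expects less, it risks overflow.
    """
    difference = new_delay_s - old_delay_s
    if difference > tolerance_s:
        return "underflow"
    if difference < -tolerance_s:
        return "overflow"
    return "matched"
```

For example, an outgoing stream that leaves 0.05 s of data in the buffer, spliced to a stream that assumes 0.40 s, is classified as an underflow risk. This is the case a nonseamless splicer would mask by holding the last frame.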
Splice Points
To enable the splicing of compressed bit streams, SMPTE 312M defines splice points [4]. Splice points in an MPEG-2 transport stream provide opportunities to switch from one program to another. They indicate a safe place to switch: a place in the bit stream where a switch can be made and result in good visual and audio quality. In this way, they are analogous to the vertical interval used to switch uncompressed video. Unlike uncompressed video, frame boundaries in an MPEG-2 bit stream are not evenly spaced. Therefore, the syntax of the transport packet itself is used to convey where these splice points occur. Transport streams are created by multiplexing PID streams. Two types of splice points for PID streams are defined:
• In Points: places in the bit streams where it is safe to enter and start decoding the data
• Out Points: places where it is safe to exit the bit stream
Methods are defined that can be used to group In Points of individual PID streams into Program In Points to enable the switching of entire programs (video with audio). Program Out Points for exiting a program also are defined.
Out Points and In Points are imaginary benchmarks in the bit stream located between two transport stream packets. An Out Point and an In Point may be co-located; that is, a single packet boundary may serve as both a safe place to leave a bit stream and a safe place to enter it.
Frequency of Splice Points
The frequency of splice points is not specified by SMPTE 312M [4]. In some applications, such as a studio environment where low delay and flexibility in switching are important, splice points might occur as frequently as every frame (in an all I-frame environment). In a distribution environment, splice points might occur at regular intervals during normal program playout and more frequently surrounding break times. Because out points may be specified at either I or P frame boundaries (in presentation order), they may occur more frequently than in points (which may only occur preceding I frames).
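The difference in frequency between in points and out points can be seen by enumerating candidate boundaries for a given picture sequence. The sketch below is illustrative only: it takes picture types in presentation order as a string, and it assumes that an out point boundary immediately follows an I or P picture while an in point boundary immediately precedes an I picture. The function name and the boundary-numbering convention are hypothetical; SMPTE 312M imposes further constraints that are not modeled here.

```python
def candidate_splice_points(frame_types):
    """Return (in_points, out_points) as lists of boundary indexes.

    frame_types: picture types in presentation order, e.g. "IBBPBBI".
    Boundary i lies immediately before picture i; boundary len(frame_types)
    lies after the last picture.
    """
    # Assumption: an in point may only precede an I picture.
    in_points = [i for i, t in enumerate(frame_types) if t == "I"]
    # Assumption: an out point may follow either an I or a P picture.
    out_points = [i + 1 for i, t in enumerate(frame_types) if t in ("I", "P")]
    return in_points, out_points
```

For the sequence "IBBPBBI" this yields two candidate in points and three candidate out points. For an all I-frame sequence every picture boundary is a candidate of both kinds, which matches the studio case described above.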
11.5.4c Transition Clip Generator
The transition clip generator (TCG) is another approach to bit stream splicing (Sarnoff and Silicon Graphics). TCG is a suite of software tools designed for use in a video server environment. The TCG creates a transition clip, a new sequence of MPEG transport stream packets that replace (in the resulting spliced video stream) a portion of each of the MPEG transport streams of the video being spliced together (clip 1 and clip 2) around the point where the splice is to be made. Creation of this transition clip involves a small amount of decoding and subsequent re-encoding of compressed video frames (from the streams being spliced), but only of frames from those regions being replaced.
The fact that only a few frames need to be decoded and then re-encoded to create a seamless splice is one of the significant benefits of the TCG approach. A detailed discussion of the TCG system can be found in [5].
11.5.4d SMPTE 328M
Yet another approach is SMPTE 328M, which defines the MPEG video elementary stream (ES) information to facilitate seamless edits under defined circumstances [6]. The video ES, as defined by the MPEG standards, is supplemented with additional information for professional studio applications. This supplementary information is carried within the sequence header and the user data area of the video ES. SMPTE 328M defines the data to be carried and the location of that data.
Seamless, frame-accurate editing of compressed video is most easily accomplished with the use of short GOP structures. Longer GOP structures can be edited by decoding and reencoding, by transcoding to shorter GOP structures, or (with more involved processing) edited directly. The best approach is determined by a range of application-specific considerations.
ISO/IEC 13818-2 does not define the repetition frequency of the sequence header. To be compliant with SMPTE 328M, the sequence header must exist at every I frame.
As specified in SMPTE 328M, the following syntax elements and functional descriptions are inserted in the MPEG ES in the user data area:
• V/H coding phase. The basic implementation of MPEG does not specify the horizontal and vertical coding phase. SMPTE 328M requires that the vertical and horizontal coding phase be known in order for decoding and peripheral equipment to correctly process the signal. V and H coding information are included only for SDTV signals where the coding phase is not compliant with SMPTE RP 202; for HDTV signals, H/V coding phase information is likewise defined by SMPTE RP 202.
• Time code. Provision is made for the insertion of two time codes complying with SMPTE 12M. At least one time code, the reference date time stamp (as defined in SMPTE 326M), is carried as a means of maintaining synchronization with other content or metadata streams. Carriage of a second time code is optional. Compliant decoders must have the capability to decode both time codes.
• Picture order. Picture order information specifies the picture duration and is the equivalent to the PTS/DTS present in the MPEG transport stream. The picture order value is counted by field units. In some cases, the latency of the system will be minimized using the picture order information.
• Video index. Video index, as defined by SMPTE RP 186, is carried (if present) on the baseband signal. Information carried by the video index should be preserved during any coding, recoding, editing, or transcoding process. It was envisioned that the data described in the SMPTE metadata dictionary (SMPTE 335M) would be handled by the transport mechanism described in SMPTE 326M. These parametric data include all of the parameters currently coded in the video index, although the data representation of some items may be different.
• Ancillary data. Data that is carried in the vertical interval of the baseband signal should be preserved. Ancillary data may consist of more than 23 consecutive zeros. To prevent this condition, a marker is inserted every 22 bits.
• History data. History data, consisting of original and subsequent encoding parameters that may be useful in transcoding or reencoding, can be carried by the bit stream. SMPTE 327M defines the content of the history data information. History data may consist of more than 23 consecutive zeros. To prevent this condition, a marker is inserted every 22 bits.
• User data. User data is defined by ISO/IEC 13818-2.
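The marker mechanism mentioned for ancillary data and history data in the list above can be illustrated with a short sketch. An MPEG start-code prefix consists of 23 zero bits followed by a one, so long runs of zeros in carried data must be broken up. The code below assumes a marker bit of value 1 inserted after every 22 payload bits; the interval is a parameter, and the function names and bit-list representation are hypothetical conveniences rather than the syntax defined in SMPTE 328M.

```python
def insert_markers(bits, interval=22):
    """Insert a '1' marker bit after every `interval` payload bits."""
    out = []
    for i, bit in enumerate(bits):
        out.append(bit)
        if (i + 1) % interval == 0:
            out.append(1)  # marker bit breaks up any run of zeros
    return out


def remove_markers(stuffed, interval=22):
    """Drop every (interval + 1)-th bit, which is a marker."""
    return [b for i, b in enumerate(stuffed) if (i + 1) % (interval + 1) != 0]


def longest_zero_run(bits):
    """Length of the longest run of consecutive zero bits."""
    longest = run = 0
    for bit in bits:
        run = run + 1 if bit == 0 else 0
        longest = max(longest, run)
    return longest
```

Even for an all-zero payload, the stuffed output never contains more than 22 consecutive zeros, and removing the markers recovers the original bits.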
11.5.5 MPEG-2 Recoding
The MPEG-2 video recoding data set allows a full description of the MPEG-2 parameter set that characterizes any MPEG-2 encoding process. According to this recoding data set, it is accepted (from a theoretical point of view) that any MPEG-2 encoding equipment could generate an identical MPEG-2 bitstream from a given digital video signal. However, the MPEG-2 recoding data set may be used in practical recoding applications where the environment can introduce additional, unpredictable constraints. For example, the bit rate of any recoding stage may or may not differ from that of the previous encoding stage. This means that the recoding data set will not necessarily be fully reused at any further stage.
Another critical aspect of the recoding process is the bandwidth available for transport of the MPEG-2 recoding data set. In many practical applications, a reduced set of the generic MPEG-2 recoding data set must be addressed.
The issue of MPEG-2 recoding is an important one because of the common need to modify previously encoded programs. To address this issue, the SMPTE developed a suite of tools that includes the following standards:
• SMPTE 327M, MPEG-2 Video Recoding Data Set
• SMPTE 329M, MPEG-2 Video Recoding Data Set: Compressed Stream Format
• SMPTE 319M, Transporting MPEG-2 Recoding Information Through 4:2:2 Component Digital Interfaces
• SMPTE 351M, Transporting MPEG-2 Recoding Information through High-Definition Digital Interfaces
• SMPTE 353M, Transport of MPEG-2 Recoding Information as Ancillary Data Packets
These standards are discussed in the following sections.
11.5.5a SMPTE 327M
SMPTE 327M specifies the content of the picture-related recoding data set for the representation of ISO/IEC 13818-2 MPEG coding information for the purpose of optimally cascading decoders and recoders at any bit rate or GOP structure [7]. The coding information is derived from an ISO/IEC 13818-2-compliant MPEG bit stream during the picture decoding process, as described in ISO/IEC 13818-2. The scope and operation of this standard are the definition of the content of a sufficient recoding data set that may be derived in decoders complying with ISO/IEC 13818-2, including all nonscalable profiles defined in ISO/IEC 13818-2.
To allow the resynchronization of the video and its associated audio or data after processing, a mechanism using some additional information derived from ISO/IEC 13818-1 is also included in SMPTE 327M. This sufficient data set can be transported by various means (defined in other SMPTE standards).
The principal application of this standard is to preserve the quality of the video signal when cascading MPEG-2 decoders and coders (including transcoding) by feeding forward previous coding decisions. The MPEG-2 recoding data set is described as sufficient when it contains the data required that, in combination with an MPEG-2 decoded or partially decoded picture, allows bit-accurate recreation of the previously coded bit stream. The information required in the sufficient MPEG-2 recoding data set can be broken down into three parts:
• The picture rate information
• Macroblock rate information
• Additional housekeeping data
11.5.5b SMPTE 329M
SMPTE 329M specifies the stream format of the MPEG-2 recoding data set for the representation of compressed ISO/IEC 13818-2 MPEG coding information, as used in applications requiring transport systems of reduced data capacity [8]. The coding information is derived from an ISO/IEC 13818-2 compliant MPEG bit stream during the decoding process, as described in ISO/IEC 13818-2. The information based on this stream format can be transported by various means; for example, the elementary stream format defined in SMPTE 328M.
There are applications in which the transmission of all the recoding data set is not possible. Some legacy equipment may have restricted capacity for the transmission of the recoding data. This limitation has an impact on subsequent compression stages that can make use of the MPEG-2 recoding process. In order to decrease the bit rate for the recoding data set, the MPEG-2 recoding data set is converted into an MPEG-like stream, which is called the compressed stream format of the MPEG-2 recoding set. SMPTE 329M defines this stream format. The stream format is independent of application, and all the transport information in the reduced bandwidth recoding data transportation system is based on this stream. The transport mechanism depends on the application, which is defined in other SMPTE standards documents.
11.5.5c SMPTE 319M
SMPTE 319M specifies an embedded transport mechanism for the MPEG-2 recoding data set as defined in SMPTE 327M for the representation of MPEG-2 recoding information in ITU-R BT.656, 4:2:2 component digital interfaces [9]. The recoding data set is derived from an ISO/IEC 13818-1 and 13818-2 compliant MPEG bit stream during the decoding process, as described in ISO/IEC 13818-1 and 13818-2.
For the minimum operation of this standard, the MPEG-2 recoding data set is spatially and temporally aligned to each decoded macroblock mapped into an ITU-R BT.656 interface. The standard specifies the spatially and temporally aligned transport of the MPEG-2 recoding data set within the active picture area on ITU-R BT.656 interfaces for equipment that complies with ISO/IEC 13818-1 and 13818-2, including 4:2:2P@ML and MP@ML for both the 625/50 and 525/60 video standards.
The information contained in the MPEG-2 recoding data set is defined in SMPTE 327M. This recoding information is temporally locked to the decoded (or partially decoded) video to the nearest MPEG-2 frame or field depending on the picture structure of the coded MPEG-2 bit stream. It is also spatially locked with the decoded video to the nearest MPEG-2 macroblock within the decoded frame/field. It is necessary for the recoding information to be aligned with the decoded MPEG macroblocks in the decoded pictures, both spatially and temporally.
11.5.5d SMPTE 351M
SMPTE 351M specifies an embedded transport mechanism for the MPEG-2 recoding data set as defined in SMPTE 327M for the representation of MPEG-2 recoding information on a SMPTE 274M interface and subsequently upon a SMPTE 292M bit-serial digital interface [10]. The recoding data set is derived from an ISO/IEC 13818-1/2 compliant MPEG bitstream during the decoding process, as described in the ISO/IEC 13818-1/2 standards. For the minimum operation of this standard, the MPEG-2 recoding data set is spatially and temporally aligned to each decoded macroblock mapped into a SMPTE 274M/292M interface.
The standard specifies the spatially and temporally aligned transport of the MPEG-2 recoding data set within the active picture area on SMPTE 274M/292M interfaces for equipment that complies with the ISO/IEC 13818-1/2 standards, including 4:2:2P@HL and MP@HL for 60- and 50-Hz interlaced and 60-, 30-, 25-, and 24-Hz progressive video standards.
The recoding information is temporally locked to the decoded (or partially decoded) video to the nearest MPEG-2 frame or field depending on the picture structure of the coded MPEG-2 bitstream. It is also spatially locked with the decoded video to the nearest MPEG-2 macroblock within the decoded frame/field. To accrue the full benefits of the recoding information when cascading via a digital baseband interface, the following recommendations must be adhered to:
• The transport mechanism must preserve at least the 8 most significant bits of active video. The mechanism outlined uses the least significant bit of each 10-bit chrominance sample to transmit the data through the SMPTE 274M or 292M interface.
• The recoding information must be aligned with the decoded MPEG macroblocks in the decoded pictures, both spatially and temporally.
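The least-significant-bit carriage described in the first recommendation above can be sketched as follows. This is a simplified illustration that assumes one data bit per 10-bit chrominance sample, written in sample order; the function names are hypothetical, and the actual sample mapping, ordering, and protection defined in SMPTE 351M are not reproduced here.

```python
def embed_bits(chroma_samples, data_bits):
    """Write one data bit into the LSB of each 10-bit chroma sample."""
    if len(data_bits) > len(chroma_samples):
        raise ValueError("not enough chroma samples to carry the data")
    out = list(chroma_samples)
    for i, bit in enumerate(data_bits):
        # Clear bit 0 of the 10-bit word, then set it to the data bit.
        out[i] = (out[i] & 0x3FE) | (bit & 1)
    return out


def extract_bits(chroma_samples, count):
    """Read back the LSB of the first `count` chroma samples."""
    return [sample & 1 for sample in chroma_samples[:count]]


def top_8_bits(sample):
    """The 8 most significant bits of a 10-bit sample."""
    return (sample >> 2) & 0xFF
```

Because only bit 0 is modified, the 8 most significant bits of every sample are unchanged, so a downstream 8-bit video path still sees the original picture while a 10-bit path can recover the embedded data.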
SMPTE 351M is based on producing a SMPTE 274M/292M compliant output to cover HD-MPEG bitstreams up to and including 4:2:2P@HL and MP@HL for 1920 × 1080 60 (59.94) or 50 2:1; 1920 × 1080 30 (29.97), 25, or 24 (23.98) 1:1; and 1280 × 720 60 (59.94) 1:1 systems.
11.5.5e SMPTE 353M
SMPTE 353M specifies the mechanism for the transport of MPEG-2 video recoding information as ancillary data packets in an ancillary data space, for example, through ITU-R BT.656/SMPTE 259M interfaces [11]. The video recoding information transported through this mechanism is used to preserve picture quality at re-encoding stages when cascading MPEG-2 decoders and encoders. The transport mechanism specified in the standard has been designed so that it can work with digital video systems in which operation is limited to 8-bit resolution.
Principal parameters of SMPTE 353M include:
• The transported MPEG-2 video recoding information is compliant with the MPEG-2 video recoding data set as defined in SMPTE 327M
• The data set is formatted according to the stream format defined in SMPTE 329M
• The formatted data set is transported in the form of ancillary (ANC) data packets as specified in SMPTE 291M
• The transport mechanism specified in the standard is compliant with SMPTE 291M
Parts of both the vertical blanking ancillary data space (V-ANC) and the horizontal blanking ancillary data space (H-ANC) are used. The V-ANC space carries picture rate information only; this is the most basic, highest priority element of the recoding data set. For low bit rate, long GOP applications, this typically brings the greatest picture quality improvement; further refinements are achieved when more information is available. The H-ANC space is used to carry the other part of the recoding data set. Use of the reduced bandwidth indicator, as specified in SMPTE 329M, allows more or less of this part of the recoding data set to be transmitted, depending upon the transmission capacity available in the H-ANC space.
11.5.6 MPEG-2 Operating Ranges
The extensive use of compression in professional video applications has imposed unique, and sometimes considerable, demands on the various elements of the production chain. Some digital video devices have a great deal of flexibility to deal with bursts of data, as might happen when coding a difficult video scene [12]. These devices use a variable data rate by increasing data rates when necessary to preserve quality, and decreasing data rates when processing easier content to improve efficiency. These devices, therefore, are sometimes referred to as providing constant-quality operation. Other devices inherently operate with the data rate constrained to a constant value. When data rate is fixed, there will be some picture quality variation, which will be a function of the picture complexity. If data rates are sufficiently high, these variations can be imperceptible. The ease of processing constant bit rate streams is, therefore, attractive in some applications.
Regardless of the type of compression, all practical systems need some limits on allowable bit rate variations. To address this, MPEG-2 specifies a buffer model for both compression and transport (as discussed previously). It is the responsibility of the compression encoder to manage the data rate, through varying quantization granularity, to avoid buffer overflow or underflow.
With clear applications for both variable bit rate (VBR) and constant bit rate (CBR) in the professional domain, the potential exists for interface issues.
11.5.6a Production System Data Flow
Figure 11.5.7 illustrates the various strategies available to broadcasters when designing a compressed video production flow process [12]. In this example, the split between long GOP and I-frame-only systems at 30 Mbits/s reflects the point of approximately equal performance for long GOP systems at 30 Mbits/s and I-frame-only systems at 50 Mbits/s. Specific elements of the production process include the following:
• Acquisition. The program may be captured in I-frame only or using long GOP MPEG-2 to provide higher storage capacities (principally camcorders).
• Contribution. Where there is a need to electronically transmit the captured information over a network (satellite/telco/wireless camera), the long GOP MPEG-2 format is more likely to be used as this provides for higher transmission efficiency.
• Source storage. Ideally, the signal should be stored in its received format, that is, I-frame-only for I-frame-only systems and long GOP for long GOP systems, to maintain the highest possible quality. It is also possible to transcode the incoming feed to permit a standard native storage format. However, care must be taken to ensure that the quality is not degraded.
• Editing. Cuts-only editing is simple to perform in the MPEG-2 domain in the case of I-frame-only VTRs or servers. Where more complex editing is required, the signal must be processed at video baseband. This can be achieved using either I-frame or long GOP MPEG-2 provided that the system has sufficient headroom. Where this headroom does not exist, it is necessary to use the MPEG-2 recoding parameters as defined in SMPTE 327M if MPEG-2 concatenation artifacts are to be avoided.
• Program storage. Program storage should also be made in the editing format or in the transmission format so that the number of transcoding stages is reduced to a minimum. The key options available to the program maker can therefore be summarized as 1) an I-frame-only or long GOP system with a sufficiently high data rate to allow multiple naive decoding/recoding processes, and 2) a long GOP system using lower data rates but passing forward recoding information as described in SMPTE 327M to minimize MPEG-2 concatenation artifacts.
Figure 11.5.7 Current practice for standard-definition compressed video process flow. (From [12]. Used with permission.)
Diagram labels: vertical axis, video bit rate (0 to 50); horizontal axis, process flow (acquisition, contribution, tape storage, disk storage and server, editing, programme storage and server, transcode to emission format). Regions: I-frame only 4:2:2 (I-frame-only editing and VTR; transcode to I-frame 4:2:2) and any GOP 4:2:2 or 4:2:0 (independent long GOP editing with recoding data; transcode to 4:2:2).
11.5.6b SMPTE RP 213
SMPTE Recommended Practice 213 was developed to address the issues outlined in the previous section. The document specifies the structure and parameters of the data for interfacing MPEG-2 4:2:2 profile and digital audio in the professional environment [13]. The purpose of this RP is to facilitate video and audio bitstream interchange between MPEG-2 compliant equipment.
The combination of RP 213 and related documents is intended to assist the design and application of MPEG-2-based professional television equipment that facilitates bitstream interchange among different applications and over a wide set of user requirements. The RP is limited to the video and audio parameters of such a system.
RP 213 also specifies MPEG-2 operating ranges that are defined to be subsets of ISO/MPEG profiles and levels. It defines two operating ranges for standard-definition television and three operating ranges for high-definition television. All of the MPEG-2 data structures addressed in this practice are ISO/IEC 13818-2 Amendment 2 4:2:2 profile compliant and as such are decodable by MPEG-2 4:2:2 profile compliant stand-alone decoders at the appropriate level. Inasmuch as the 4:2:2 profile also requires stand-alone decoders to decode main profile structures (4:2:0), existing main profile sources can be accommodated.
Application of RP 213
The flexibility of MPEG-2 compression allows MPEG-2-based equipment to meet the diverse operational requirements of a broad range of professional television applications [13]. Although some applications might be served by choosing a specific operating point, different users have different constraints and objectives, and may choose different specific operating parameters. Cognizant of these considerations, RP 213 specifies the following:
• Operating ranges, including constrained bit rates and GOP structures
• Operating ranges created for random access and editing capability
• Spatial alignment of coded images
• Use of 48-kHz sampled digital audio
This practice describes parameter choices available in MPEG-2 and the factors to be taken into account when defining an MPEG-2-based system. Specific operating parameter choices will depend on the individual application requirements, including editing capability, storage capacity, contribution feeds, and distribution/emission bandwidth.
In making this selection for a given application environment, it is further recognized that tradeoffs among many different parameters must be considered. Such considerations include the bitstream overhead imposed by various operating range constraints, the required degree of bitstream interoperability among various types of broadcast equipment, and overall system complexity.
For audio, no single worldwide compressed standard has been adopted; various transmission systems are in use depending upon geographic area. Global audio interchange can, therefore, only be achieved by specifying a noncompressed audio format.
5,EG-2 Video ,rmeters
Within professional applications of MPEG-2, including the HDTV extensions to MPEG-2 as
defined by SMPTE 308M, five operating ranges are defined by RP 213 (Figure 11.5.8).
Separate long- and short-GOP ranges are defined for both main level and high level systems.
Additional operating ranges may be added as required to meet future HDTV requirements.
Operating ranges 1 and 2 cover the MPEG-2 4:2:2P@ML options including the standard 525-line
and 625-line SDTV formats.
Operating ranges 3 and 4 cover the MPEG-2 4:2:2@HL including:
• 480-line progressive scan
• 576-line progressive scan
• 720-line progressive scan
• 1080-line interlaced scan
• 1080-line progressive scan (up to 30-Hz frame rate)
Figure 11.5.8 SMPTE operating ranges specified in RP 213, plotted as bit rate against GOP
structure. (From [13]. Used with permission.) The ranges shown are:
  Operating range 1:  SDTV, up to 50 Mbit/s, any GOP structure
  Operating range 2:  SDTV, up to 50 Mbit/s, I-only coding
  Operating range 3A: HDTV, up to 80 Mbit/s, any GOP structure
  Operating range 3B: HDTV, up to 175 Mbit/s, any GOP structure
  Operating range 4:  HDTV, up to 300 Mbit/s, I-only coding
Relationships among different operating ranges are illustrated in Figure 11.5.9. Operating range
2 is a subset of operating ranges 1 and 4. Operating range 1 is a subset of operating ranges 3A
and 3B. Operating range 3A is a subset of operating range 3B.
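The subset relationships stated above can be captured in a small lookup, sketched here in Python. The sketch assumes that "subset" means a stream conforming to the smaller range also conforms to every range that contains it, and that the relation can be followed transitively; both are readings of the text, not statements taken from RP 213. The names `DIRECT_SUPERSETS` and `containing_ranges` are hypothetical.

```python
# Direct "is a subset of" links as stated in the text (range -> ranges containing it).
DIRECT_SUPERSETS = {
    "2": {"1", "4"},
    "1": {"3A", "3B"},
    "3A": {"3B"},
    "3B": set(),
    "4": set(),
}


def containing_ranges(name):
    """Return `name` plus every range reachable by following subset links.

    Assumption: the subset relation is transitive.
    """
    found = {name}
    pending = [name]
    while pending:
        current = pending.pop()
        for parent in DIRECT_SUPERSETS[current]:
            if parent not in found:
                found.add(parent)
                pending.append(parent)
    return found


if __name__ == "__main__":
    for operating_range in ("1", "2", "3A", "3B", "4"):
        print(operating_range, "->", sorted(containing_ranges(operating_range)))
```

Under these assumptions a range 2 stream (SDTV, I-only) falls within every other range, while ranges 3B and 4 are contained only in themselves.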
11.5.7 SMPTE Documents Relating to MPEG-2
The following sections list the primary SMPTE standards relating to MPEG-2. For additional
information, visit the SMPTE Web site at http://www.smpte.org.
SMPTE 302M: Linear PCM Digital Audio in an MPEG-2 Transport Stream
This standard specifies the transport of uncompressed (linear PCM) digital audio in an MPEG-2
transport system. Some applications may require linear PCM (pulse code modulated) digital
audio in conjunction with compressed video specified in the MPEG-2 4:2:2 profile. The MPEG
audio standard defines compressed audio, but does not define uncompressed audio for carriage
in an MPEG-2 transport system. This standard augments the MPEG standards to address the
requirement for linear PCM digital audio.
Figure 11.5.9 Relationship among operating ranges in SMPTE RP 213. (From [13]. Used with
permission.)
SMPTE 308M: MPEG-2 4:2:2 Profile at High Level
ISO/IEC 13818-2, commonly known as MPEG-2 video, includes specification of the MPEG-2
4:2:2 profile. Based on ISO/IEC 13818-2, this standard provides additional specification for the
MPEG-2 4:2:2 profile at high level. It is intended for use in high-definition television
production, contribution, and distribution applications. As in ISO/IEC 13818-2, this standard
defines bitstreams, including their syntax and semantics, together with the requirements for a
compliant decoder for 4:2:2 profile at high level, but does not specify particular encoder
operating parameters.
SMPTE 310M: Synchronous Serial Interface for MPEG-2 Digital Transport Stream
This standard describes the physical interface and modulation characteristics for a synchronous
serial interface to carry MPEG-2 transport bit streams at rates up to 40 Mbits/s. It is a
point-to-point interface intended for use in a low-noise environment. The low-noise environment
is defined as a noise level that would corrupt no more than one MPEG-2 data packet per day at
the transport clock rate. When other transmission systems (e.g., studio-to-transmitter microwave
links) are interposed between devices employing this interface, higher noise levels may be
encountered. In such cases, it is recommended that appropriate error correcting methods be
used.
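To give a feel for how demanding "one corrupted packet per day" is, the following Python sketch converts it into a packet error ratio at the 40 Mbits/s upper rate. It assumes standard 188-byte MPEG-2 transport packets and a link running continuously at the full rate; the function names are hypothetical and the result is illustrative, not a figure quoted from SMPTE 310M.

```python
TS_PACKET_BITS = 188 * 8          # one MPEG-2 transport packet is 188 bytes
SECONDS_PER_DAY = 24 * 60 * 60


def packets_per_day(bit_rate_bps):
    """Number of transport packets carried in one day at a constant bit rate."""
    return bit_rate_bps * SECONDS_PER_DAY / TS_PACKET_BITS


def max_packet_error_ratio(bit_rate_bps, corrupted_per_day=1):
    """Largest packet error ratio that still meets the per-day corruption limit."""
    return corrupted_per_day / packets_per_day(bit_rate_bps)


if __name__ == "__main__":
    rate = 40e6  # upper transport rate given in the text, in bits per second
    print(f"packets per day:        {packets_per_day(rate):.3e}")
    print(f"max packet error ratio: {max_packet_error_ratio(rate):.3e}")
```

At 40 Mbits/s this works out to roughly 2.3 billion packets per day, so the low-noise condition corresponds to a packet error ratio on the order of 4 in 10 billion.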
SMPTE RP 202: Video Alignment for MPEG-2 Coding
Equipment conforming to this practice will minimize artifacts in multiple generations of
encoding and decoding by optimizing macroblock alignment. As MPEG-2 becomes pervasive in
emission, contribution, and distribution of video content, multiple compression and
decompression (codec) cycles will be required. Concatenation of codecs may be needed for
production, postproduction, transcoding, or format conversion. Any time video transitions to or
from the coefficient domain of MPEG-2 are performed, care must be exercised in alignment of
the video both horizontally and vertically as it is coded from the raster format or decoded and
placed in the raster format.
SMPTE RP 204: SDTI-CP MPEG Decoder Templates
This practice defines decoder templates for the encoding of SDTI content packages (SDTI-CP)
with MPEG coded picture streams.
SMPTE RP 213: MPEG-2 Operating Ranges
This practice specifies the structure and parameters of the data for interfacing MPEG-2 4:2:2
profile and digital audio in the professional environment. The purpose of this practice is to
facilitate video and audio bitstream interchange between MPEG-2 compliant equipment.
SMPTE RP 217: Nonsynchronized Mapping of KLV Packets into MPEG-2 Systems Streams
This practice describes a means for mapping SMPTE metadata and other data essence, encoded
in the SMPTE KLV protocol, into MPEG-2 systems streams. Use of synchronized streams and
their syntax and semantics is beyond the scope of this practice.
SMPTE EG 38: MPEG-2 Operating Range Applications
The aim of this document is to provide practical guidelines to users of MPEG-2 in studio and in
other professional applications. This guideline provides a system overview, detailing the
elements to be considered when choosing an MPEG-2 operating range. This guideline describes
how the structure and parameters defined in SMPTE RP 213 may be configured to meet a
selected operating point. This is achieved by giving specific, but representative,
implementation examples planned or in use around the world.
11.5.7a MPEG-2 Editing and Splicing
SMPTE 312M: Splice Points for MPEG-2 Transport Streams
This standard defines constraints on the encoding of and syntax for MPEG-2 transport streams
such that they may be spliced without modifying the PES packet payload. Generic MPEG-2
transport streams, which do not comply with the constraints in this standard, may require more
sophisticated techniques for splicing.
SMPTE 328M: MPEG-2 Video Elementary Stream Editing Information
This standard defines the MPEG video elementary stream (ES) information to facilitate
seamless edits under defined circumstances. The video ES, as defined by the MPEG standards, is
supplemented with additional information for professional studio applications. Supplementary
information will be carried within the sequence header and the user data area of the video ES.
This standard defines the data to be carried and the location of the data.
11.5.7b MPEG-2 Recoding
SMPTE 319M: Transporting MPEG-2 Recoding Information through 4:2:2 Component Digital
Interfaces
This standard specifies an embedded transport mechanism for the MPEG-2 recoding data set as
defined in SMPTE 327M for the representation of MPEG-2 recoding information in ITU-R
BT.656, 4:2:2 component digital interfaces.
SMPTE 327M: MPEG-2 Video Recoding Data Set
This standard specifies the content of the picture related recoding data set for the representation
of ISO/IEC 13818-2 MPEG coding information for the purpose of optimally cascading
decoders and recoders at any bit rate or GOP structure. The coding information is as derived
from an ISO/IEC 13818 compliant MPEG bit stream during the picture decoding process, as
described in ISO/IEC 13818-2.
SMPTE 329M: MPEG-2 Video Recoding Data Set - Compressed Stream Format
This standard specifies the stream format of the MPEG-2 recoding data set for the
representation of compressed ISO/IEC 13818-2 MPEG coding information, as used in
applications requiring transport systems of reduced data capacity.
SMPTE 351M: Television - Transporting MPEG-2 Recoding Information through High-Definition
Digital Interfaces
This standard specifies an embedded transport mechanism for the MPEG-2 recoding data set as
defined in SMPTE 327M for the representation of MPEG-2 recoding information on a SMPTE
274M interface and subsequently upon a SMPTE 292M bit-serial digital interface. The recoding
data set is derived from an ISO/IEC 13818-1/2 compliant MPEG bitstream during the decoding
process, as described in the ISO/IEC 13818-1/2 standards.
SMPTE 353M: Television - Transport of MPEG-2 Recoding Information as Ancillary Data Packets
This standard specifies the mechanism for the transport of MPEG-2 video recoding information
as ancillary data packets in an ancillary data space, for example, through ITU-R BT.656 /
SMPTE 259M interfaces. The video recoding information transported through this mechanism
is for the purpose of preserving picture quality at re-encoding stages when cascading MPEG-2
decoders and encoders. Although the specified mechanism operates on 10-bit digital video
interfaces, it is by design transparent to systems limited to 8-bit operation.
11.5.8 References
1. Epstein, Steve: "Editing MPEG Bitstreams," Broadcast Engineering, Intertec Publishing,
Overland Park, Kan., pp. 37–42, October 1997.
2. Cugnini, Aldo G.: "MPEG-2 Bitstream Splicing," Proceedings of the Digital Television '97
Conference, Intertec Publishing, Overland Park, Kan., December 1997.
3. SMPTE Recommended Practice: RP 202, "Video Alignment for MPEG-2 Coding,"
SMPTE, White Plains, N.Y., 2000.
4. SMPTE Standard: SMPTE 312M, "Splice Points for MPEG-2 Transport Streams," SMPTE,
White Plains, N.Y., 2001.
5. Ward, Christopher, C. Pecota, X. Lee, and G. Hughes: "Seamless Splicing for MPEG-2
Transport Stream Video Servers," Proceedings, 33rd SMPTE Advanced Motion Imaging
Conference, SMPTE, White Plains, N.Y., 2000.
6. SMPTE Standard: SMPTE 328M-2000, "MPEG-2 Video Elementary Stream Editing
Information," SMPTE, White Plains, N.Y., 2000.
7. SMPTE 327M-2000, "MPEG-2 Video Recoding Data Set," SMPTE, White Plains, N.Y.,
2000.
8. SMPTE 329M-2000, "MPEG-2 Video Recoding Data Set - Compressed Stream Format,"
SMPTE, White Plains, N.Y., 2000.
9. SMPTE 319M-2000, "Transporting MPEG-2 Recoding Information Through 4:2:2
Component Digital Interfaces," SMPTE, White Plains, N.Y., 2000.
10. SMPTE 351M, "Transporting MPEG-2 Recoding Information through High-Definition
Digital Interfaces," SMPTE, White Plains, N.Y., 2000.
11. SMPTE 353M, "Transport of MPEG-2 Recoding Information as Ancillary Data Packets,"
SMPTE, White Plains, N.Y., 2000.
12. SMPTE Engineering Guideline: EG 38, "MPEG-2 Operating Range Applications,"
Society of Motion Picture and Television Engineers, White Plains, N.Y., 2001.
13. SMPTE Recommended Practice: RP 213, "MPEG-2 Operating Ranges," Society of
Motion Picture and Television Engineers, White Plains, N.Y., 2001.
Source: Standard Handbook of Video and Television Engineering
Chapter 11.6
Audio Compression Systems
Fred Wylie
Jerry C. Whitaker, Editor-in-Chief
11.6.1 Introduction
As with video, high on the list of priorities for the professional audio industry is to refine and
extend the range of digital equipment capable of the capture, storage, post production,
exchange, distribution, and transmission of high-quality audio, be it mono, stereo, or 5.1
channel AC-3 [1]. This demand is being driven by end-users, broadcasters, film makers, and the
recording industry alike, who are moving rapidly towards a "tapeless" environment. Over the
last two decades, there have been continuing advances in DSP technology, which have supported
research engineers in their endeavors to produce the necessary hardware, particularly in the
field of digital audio data compression or, as it is often referred to, bit-rate reduction. There
exist a number of real-time (in reality, near instantaneous) compression coding algorithms.
These can significantly lower the circuit bandwidth and storage requirements for the
transmission, distribution, and exchange of high-quality audio.
The introduction in 1983 of the compact disc (CD) digital audio format set a quality
benchmark that the manufacturers of subsequent professional audio equipment strive to match
or improve upon. The discerning consumer now expects the same quality from radio and
television receivers. This leaves the broadcaster with an enormous challenge.
11.6.1a PCM Versus Compression
It can be an expensive and complex technical exercise to fully implement a linear pulse code
modulation (PCM) infrastructure, except over very short distances and within studio areas [1].
To demonstrate the advantages of distributing compressed digital audio over wireless or wired
systems and networks, consider again the CD format as a reference. The CD is a 16 bit linear
PCM process, but has one major handicap: the amount of circuit bandwidth the digital signal
occupies in a transmission system. A stereo CD transfers information (data) at 1.411 Mbits/s,
which would require a circuit with a bandwidth of approximately 700 kHz to avoid distortion of
the digital signal. In practice, additional bits are added to the signal for channel coding,
synchronization, and error correction; this increases the bandwidth demands yet again. 1.5 MHz
is the commonly quoted bandwidth figure for a circuit capable of carrying a CD or similarly
coded linear PCM digital stereo signal. This can be compared with the 20 kHz needed for each
of two circuits to distribute the same stereo audio in the analog format, a 75-fold increase in
bandwidth requirements.
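The arithmetic behind these figures can be checked with a short Python sketch. It assumes that the "approximately 700 kHz" figure is the minimum bandwidth for binary signalling, taken as half the bit rate; the text does not state how the figure was derived. The function names are hypothetical.

```python
CD_BIT_RATE_BPS = 1.411e6                  # stereo CD data rate from the text
QUOTED_PCM_BANDWIDTH_HZ = 1.5e6            # commonly quoted circuit bandwidth
ANALOG_BANDWIDTH_PER_CIRCUIT_HZ = 20e3     # one analog audio circuit


def minimum_binary_bandwidth_hz(bit_rate_bps):
    """Minimum bandwidth for binary signalling, assumed to be half the bit rate."""
    return bit_rate_bps / 2.0


def bandwidth_ratio(digital_hz, analog_hz):
    """How many times wider the digital circuit is than the analog one."""
    return digital_hz / analog_hz


if __name__ == "__main__":
    print(minimum_binary_bandwidth_hz(CD_BIT_RATE_BPS))              # 705500.0
    print(bandwidth_ratio(QUOTED_PCM_BANDWIDTH_HZ,
                          ANALOG_BANDWIDTH_PER_CIRCUIT_HZ))          # 75.0
    print(bandwidth_ratio(QUOTED_PCM_BANDWIDTH_HZ,
                          2 * ANALOG_BANDWIDTH_PER_CIRCUIT_HZ))      # 37.5
```

The 75-fold figure compares the 1.5 MHz digital circuit with a single 20 kHz analog circuit. Measured against both analog circuits together (40 kHz), the ratio is 37.5.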
11.6.1b Audio Bit Rate Reduction
In general, analog audio transmission requires fixed input and output bandwidths [2]. This
condition implies that in a real-time compression system, the quality, bandwidth, and
distortion/noise level of both the original and the decoded output sound should not be
subjectively different, thus giving the appearance of a lossless and real-time process.
In a technical sense, all practical real-time bit-rate-reduction systems can be referred to as
"lossy." In other words, the digital audio signal at the output is not identical to the input signal
data stream. However, some compression algorithms are, for all intents and purposes, lossless;
they lose as little as 2 percent of the original signal. Others remove approximately 80 percent of
the original signal.
Redundancy and Irrelevancy
A complex audio signal contains a great deal of information, some of which, because the human
ear cannot hear it, is deemed irrelevant [2]. The same signal, depending on its complexity, also
contains information that is highly predictable and, therefore, can be made redundant.
Redundancy, measurable and quantifiable, can be removed in the coder and replaced in the
decoder; this process often is referred to as statistical compression. Irrelevancy, on the other
hand, referred to as perceptual coding, once removed from the signal cannot be replaced and is
lost, irretrievably. This is entirely a subjective process, with each proprietary algorithm using a
different psychoacoustic model.
Critically perceived signals, such as pure tones, are high in redundancy and low in
irrelevancy. They compress quite easily, almost totally a statistical compression process.
Conversely, noncritically perceived signals, such as complex audio or noisy signals, are low in
redundancy and high in irrelevancy. These compress easily in the perceptual coder, but with the
total loss of all the irrelevancy content.
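The claim that a pure tone is high in redundancy can be illustrated with a minimal Python sketch of linear prediction. It is an illustration only, not any particular coder: it assumes a 1 kHz tone sampled at 48 kHz and a fixed second-order predictor whose coefficient is matched to that known tone frequency (a practical coder would have to estimate or adapt it). All names are hypothetical.

```python
import math
import random


def prediction_residual(samples, coeff):
    """Residual of the predictor x[n] ~ coeff * x[n-1] - x[n-2]."""
    return [samples[n] - (coeff * samples[n - 1] - samples[n - 2])
            for n in range(2, len(samples))]


def energy(values):
    """Sum of squared sample values."""
    return sum(v * v for v in values)


SAMPLE_RATE_HZ = 48_000   # assumed sampling frequency
TONE_HZ = 1_000           # assumed test tone
omega = 2 * math.pi * TONE_HZ / SAMPLE_RATE_HZ
coeff = 2 * math.cos(omega)   # exact recurrence coefficient for a sinusoid

tone = [math.sin(omega * n) for n in range(1000)]
rng = random.Random(0)        # fixed seed so the run is repeatable
noise = [rng.uniform(-1.0, 1.0) for _ in range(1000)]

# Fraction of the signal energy left over after prediction.
tone_ratio = energy(prediction_residual(tone, coeff)) / energy(tone[2:])
noise_ratio = energy(prediction_residual(noise, coeff)) / energy(noise[2:])

if __name__ == "__main__":
    print(f"tone residual / signal energy:  {tone_ratio:.3e}")
    print(f"noise residual / signal energy: {noise_ratio:.3f}")
```

The tone is predicted almost perfectly, so nearly nothing remains to transmit. The noise-like signal is not predictable by this predictor at all (the residual is larger than the signal), which is why such material depends on the perceptual coder instead.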
Human Auditory System
The sensitivity of the human ear is biased toward the lower end of the audible frequency
spectrum, around 3 kHz [2]. At 50 Hz, the bottom end of the spectrum, and at 17 kHz at the top
end, the sensitivity of the ear is down by approximately 50 dB relative to its sensitivity at 3 kHz
(Figure 11.6.1). Additionally, very few audio signals, whether music- or speech-based, carry
fundamental frequencies above 4 kHz. Taking advantage of these characteristics of the ear, the
structure of audible sounds, and the redundancy content of the PCM signal is the basis used by
the designers of the predictive range of compression algorithms.
Figure 11.6.1 Generalized frequency response of the human ear. Note how the PCM process
captures signals that the ear cannot distinguish. (From [2]. Used with permission.)
Another well-known feature of the hearing process is that loud sounds mask out quieter
sounds at a similar or nearby frequency. This compares with the action of an automatic gain
control, turning the gain down when subjected to loud sounds, thus making quieter sounds less
likely to be heard. For example, as illustrated in Figure 11.6.2, if we assume a 1 kHz tone at a
level of 70 dBu, levels of greater than 40 dBu at 750 Hz and 2 kHz would be required for those
frequencies to be heard. The ear also exercises a degree of temporal masking, being
exceptionally tolerant of sharp transient sounds.
It is by mimicking these additional psychoacoustic features of the human ear and identifying
the irrelevancy content of the input signal that the transform range of low bit-rate algorithms
operate, adopting the principle that if the ear is unable to hear the sound then there is no point in
transmitting it in the first place.
Quantization
Quantization is the process of converting an analog signal to its representative digital format or,
as in the case with compression, the requantizing of an already converted signal [2]. This
process is the limiting of a finite level measurement of a signal sample to a specific preset
integer value. This means that the actual level of the sample may be greater or smaller than the
preset reference level it is being compared with. The difference between these two levels, called
the quantization error, is compounded in the decoded signal as quantization noise.
Quantization noise, therefore, will be injected into the audio signal after each A/D and D/A
conversion, the level of that noise being governed by the bit allocation associated with the
coding process (i.e., the number of bits allocated to represent the level of each sample taken of
the analog signal). For linear PCM, the bit allocation is commonly 16. The level of each audio
sample, therefore, will be compared with one of 2^16, or 65,536, discrete levels or steps.
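A minimal Python sketch of this process follows. It assumes a uniform quantizer over a normalized range of -1.0 to +1.0 with rounding to the nearest level, which is one common arrangement and not necessarily the one used in any given converter. The function names are hypothetical.

```python
def quantization_step(bits):
    """Spacing between adjacent levels for a normalized -1.0..+1.0 range."""
    return 2.0 / (2 ** bits)


def quantize(sample, bits):
    """Round a sample to the nearest of 2**bits uniformly spaced levels."""
    levels = 2 ** bits
    step = quantization_step(bits)
    index = round(sample / step)
    # Clamp to the representable range of signed level indices.
    index = max(-levels // 2, min(levels // 2 - 1, index))
    return index * step


if __name__ == "__main__":
    print(2 ** 16)  # 65536 discrete levels for 16-bit linear PCM
    sample = 0.3
    for bits in (16, 8):
        error = sample - quantize(sample, bits)
        print(f"{bits}-bit: error {error:+.3e}, "
              f"bound {quantization_step(bits) / 2:.3e}")
```

For samples inside the representable range, the quantization error never exceeds half a step, and each bit removed from the allocation doubles the step size and therefore the error bound.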
Compression or bit-rate reduction of the PCM signal leads to the requantizing of an already
quantized signal, which will unavoidably inject further quantization noise. It always has been
good operating practice to restrict the number of A/D and D/A conversions in an audio chain.
Nothing has changed in this regard, and now the number of compression stages also should be
kept to a minimum. Additionally, the bit rates of these stages should be set as high as practical;
put another way, the compression ratio should be as low as possible.
Sooner or later, after a finite number of A/D and D/A conversions and passes of compression
coding of whatever type, the accumulation of quantization noise and other unpredictable
signal degradations eventually will break through the noise/signal threshold, be interpreted as
part of the audio signal, be processed as such, and be heard by the listener.
Figure 11.6.2 Example of the masking effect of a high-level sound. (From [2]. Used with
permission.)
Sampling Frequency and Bit Rate
The bit rate of a digital signal is defined by
bit rate = sampling frequency × bit resolution × number of audio channels
The rules regarding the selection of a sampling frequency are based on Nyquist's theorem [2].
This ensures that, in particular, the lower sideband of the sampling frequency does not encroach
into the baseband audio. Objectionable and audible aliasing effects would occur if the two bands
were to overlap. In practice, the sampling rate is set slightly above twice the highest audible
frequency, which makes the filter designs less complex and less expensive.
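The overlap condition can be sketched in a few lines of Python. The lower sideband of the sampling frequency extends down to the sampling frequency minus the highest audio frequency, so aliasing occurs when that edge falls inside the baseband. The 20 kHz audio bandwidth and 44.1 kHz rate are the CD values used in this chapter; the 32 kHz case is a hypothetical counterexample, and the function names are illustrative.

```python
def lower_sideband_edge_hz(sampling_hz, max_audio_hz):
    """Lowest frequency occupied by the lower sideband of the sampling frequency."""
    return sampling_hz - max_audio_hz


def aliasing_occurs(sampling_hz, max_audio_hz):
    """True when the lower sideband overlaps the baseband audio."""
    return lower_sideband_edge_hz(sampling_hz, max_audio_hz) < max_audio_hz


def guard_band_hz(sampling_hz, max_audio_hz):
    """Gap between the top of the baseband and the bottom of the lower sideband."""
    return sampling_hz - 2 * max_audio_hz


if __name__ == "__main__":
    print(aliasing_occurs(44_100, 20_000))   # False: sideband starts at 24.1 kHz
    print(guard_band_hz(44_100, 20_000))     # 4100 Hz left for the filter roll-off
    print(aliasing_occurs(32_000, 20_000))   # True: sideband starts at 12 kHz
```

The guard band is the room available for the anti-aliasing filter to roll off, which is why a rate slightly above twice the highest audio frequency eases the filter design.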
In the case of a stereo CD with the audio signal having been sampled at 44.1 kHz, this
sampling rate produces audio bandwidths of approximately 20 kHz for each channel. The
resulting audio bit rate = 44.1 kHz × 16 × 2 = 1.411 Mbits/s, as discussed previously.
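The bit-rate definition is simple enough to express directly in Python. The CD parameters below are those given in the text. The second case, 48 kHz sampling at 16 bits in stereo, is an assumed example for comparison. The function name is hypothetical.

```python
def pcm_bit_rate_bps(sampling_hz, bit_resolution, channels):
    """Bit rate = sampling frequency x bit resolution x number of audio channels."""
    return sampling_hz * bit_resolution * channels


if __name__ == "__main__":
    cd = pcm_bit_rate_bps(44_100, 16, 2)
    print(cd, "bits/s =", cd / 1e6, "Mbits/s")       # 1411200 bits/s, about 1.411 Mbits/s
    studio = pcm_bit_rate_bps(48_000, 16, 2)         # assumed 48 kHz, 16-bit stereo case
    print(studio, "bits/s =", studio / 1e6, "Mbits/s")
```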