Universidad Politécnica de Valencia, Valencia 46071, SPAIN (jprades@dcom.upv.es)

Purdue University, West Lafayette, IN 47907-1285, USA (cook@ieee.org, ace@purdue.ecn.edu)

This work has been supported by a grant from the Secretaría de Estado de Educación y Universidades of the Spanish Government, by the program CICYT TIC-2002-02469, and by an Indiana Twenty-First Century Research and Technology Fund grant.

ABSTRACT

In this paper, we analyze the efficiency of three signal-to-noise ratio (SNR) scalable strategies for motion compensated video coders and of their non-scalable counterpart. After assuming some models and hypotheses about the signals and systems involved, we obtain the SNR of each coding strategy as a function of the decoding rate. To validate our analysis, we compare our theoretical results with data from encodings of real video sequences. The results show that our analysis describes qualitatively the performance of each scalable strategy and is therefore useful for understanding the main features of each scalable technique and the factors that influence its efficiency.

1. INTRODUCTION

Scalable video can be decoded at two or more different bit rates, each corresponding to a different level of quality. Although scalability is a desirable property when video has to be transmitted over channels with errors and bandwidth fluctuations, scalable video coders are not commonly used in practice. One of the reasons is that all scalable coders are less efficient than their non-scalable (NS) counterparts [1, 2, 3, 4, 5]. Consequently, it is important to know the main features of each scalable technique and the factors that influence its efficiency. In this paper, we present a theoretical study of the efficiency of three SNR-scalable strategies used in video coders with single-loop motion compensated prediction (MCP).

Figure 1 shows the scheme of an SNR-scalable MCP-based video coder. At the transmitter, the predicted error frames (PEF), represented by signal $e$, are encoded at a rate $R_e$ to generate the bit-stream, and decoded at the loop rate $R_l$ to provide signal $e'$ to the MCP loop. At the decoder, the bit-stream is decoded at $R_l'$ (for the MCP loop) and at the decoding rate $R$. Depending on the values of these four rates ($R_e$, $R_l$, $R_l'$, $R$) we have different coding strategies. If $R_e = R_l = R_l' = R$, then we have an NS coder, which sets the maximum performance for scalable coders. In all the SNR-scalable strategies, $R_e = R_{max}$ and the decoding rate can vary between the minimum and the maximum rate of the service ($R_{min} \le R \le R_{max}$). In Scalable encodings Below the Loop Rate (SBLR), $R_l = R_{max}$ and $R = R_l'$. This is the encoding strategy proposed in the SNR-scalable MPEG-2 standard [1]. As the transmitter and the receiver have different reference frames, prediction drift is introduced (unless $R = R_{max}$), which reduces the efficiency. In Scalable encodings Above the Loop Rate (SALR), prediction drift is avoided by setting $R_l = R_l' = R_{min}$. This is the scalable strategy used in the fine granular scalability (FGS) profile of the MPEG-4 standard [2]. In a SALR coder, the reference frames $s'$ are decoded at $R_{min}$, which limits the quality of the prediction and, therefore, the efficiency of the coder.

[Block diagram. Transmitter: intraframe texture encoder ($R_e$), intraframe decoder ($R_l$), MCP, motion estimation (MVs). Receiver: intraframe decoders ($R$) and ($R_l'$), MCP.]
Fig. 1. Scheme of an SNR-scalable MCP-based video coder.

To improve efficiency, some coders set $R_l$ between $R_{min}$ and $R_{max}$ and allow decoding both above and below $R_l$ [3, 4, 5]. In the following, we call this type Scalable encoding Above and Below the Loop Rate (SABLR). In [6], these three scalable schemes were studied considering one-dimensional signals and linear prediction. In this paper, we extend the study in [6] to video signals and motion compensated coders.

In our theoretical analysis we make some assumptions about the signals and systems involved. With respect to the intra-frame encoding, we assume that embedded quantization is used and that the quantization noise $q$ is modeled as additive white noise with variance

$$\sigma_q^2 = \sigma_e^2 \, 2^{-\beta R}, \tag{1}$$

where $\sigma_e^2$ is the power of the PEF, $\beta$ is a parameter that measures the efficiency of the intra-frame coding, and $R$ is the intra-frame encoding rate [7]. We also assume that $q$ and $e$ are uncorrelated.

The rest of the hypotheses are similar to the ones assumed in [8, 9]. With respect to the input video signal $s$, we assume that its frames constitute a stationary random field. We also assume that
the only difference between consecutive frames is a constant-in-time and uniform-in-space displacement $(d_x, d_y)$. Although these hypotheses are not accurate in real encodings (MVs change in time and space, motion can be non-translatory, and at low rates $q$ is not white and is correlated with $e$), our analysis can still be useful to study the relative performance of each scalable strategy.

In our analysis, we ignore the bits necessary to encode motion vectors (MVs). In practice, this does not introduce significant differences in the relative performance of each scalable strategy, provided that the number of bits used to encode MVs is approximately the same at all rates and is low compared to the number of bits used to encode PEF texture.

In the following, $x$ and $y$ are the spatial variables, and $t$ is the temporal variable of the video sequence. Their corresponding frequency variables are $\omega_x$, $\omega_y$ and $\omega_t$ respectively, although for simplicity $\Lambda = (\omega_x, \omega_y)$ and $\Omega = (\omega_x, \omega_y, \omega_t)$ are sometimes used. The predictor is modeled as a random linear time-invariant system whose frequency response is

$$H(\omega_x, \omega_y, \omega_t) = F(\omega_x, \omega_y)\, e^{-j(\omega_x \hat{d}_x + \omega_y \hat{d}_y + \omega_t)} \tag{2}$$

where $F(\Lambda)$ represents the spatial filtering performed in the MCP loop and $(\hat{d}_x, \hat{d}_y)$ is the estimated (random) displacement vector. In general, there is a displacement error vector $\Delta d = (\Delta d_x, \Delta d_y)$:

$$(\Delta d_x, \Delta d_y) = (d_x, d_y) - (\hat{d}_x, \hat{d}_y). \tag{3}$$

2. ANALYSIS OF THE NON-SCALABLE CODER

The block diagram of a non-scalable MCP-based video coder is shown in Figure 2. Notice that the reconstruction error $r = s'' - s$ is equal to the quantization noise $q$, and thus $\sigma_q^2 = \sigma_r^2$.

[Block diagram with signals $s$, $e$, $e'$, $\hat{s}$, $s'$, $s''$ and predictor $H(\Omega)$ at both transmitter and receiver.]
Fig. 2. Block diagram of the non-scalable coder.

The power spectral density (PSD) of the error frames is [9]:

$$S_{ee}(\Lambda) = S_{ss}(\Lambda) \left[ 1 - 2\,\mathrm{Re}\{F^*(\Lambda) P(\Lambda)\} + |F(\Lambda)|^2 \right] + |F(\Lambda)|^2 S_{qq}(\Lambda) \tag{4}$$

where $S_{ss}(\Lambda)$ and $S_{qq}(\Lambda)$ are the PSDs of the input frames and the quantization noise respectively, $\mathrm{Re}\{\cdot\}$ denotes the real part, and $P(\Lambda)$ is the 2-D Fourier transform of the probability density function $p_{\Delta d}(\Delta d)$. Then, the power of $e$ is

$$\sigma_e^2 = E_s + \sigma_q^2 E_f \tag{5}$$

where $E_s$ is

$$E_s = \frac{1}{4\pi^2} \iint_D S_{ss}(\Lambda) \left[ 1 - 2\,\mathrm{Re}\{F^*(\Lambda) P(\Lambda)\} + |F(\Lambda)|^2 \right] d\Lambda,$$

where $D = \{\Lambda : |\omega_x| < \pi, |\omega_y| < \pi\}$, and $E_f$ is

$$E_f = \frac{1}{4\pi^2} \iint_D |F(\Lambda)|^2 \, d\Lambda. \tag{6}$$

Finally, from (1) and (5), the SNR of the NS coder as a function of the decoding rate is

$$\mathrm{SNR}_{NS}(R) = \frac{\sigma_s^2}{\sigma_r^2} = \frac{\sigma_s^2}{E_s} \left( 2^{\beta R} - E_f \right). \tag{7}$$

If $R$ is large enough so that $2^{\beta R} \gg E_f$, then the SNR (in dB) of the NS coder is an affine function of $R$ with slope $3\beta$.

3. ANALYSIS OF THE SALR SCHEME

Figure 3 shows the block diagram of a SALR coder. The quantization noise $q_b$ is generated by encoding $e$ at $R_{max}$ and then decoding it at $R_l$. The quantization noise $q$ is generated by encoding $e$ at $R_{max}$ and decoding it at $R$.

[Block diagram with signals $s$, $e$, $e'$, $\hat{s}$, $s'$, $s''$, noise source $q_b$ and predictor $H(\Omega)$ at both transmitter and receiver.]
Fig. 3. Block diagram to compute the SNR of the SALR coder.

Similarly to the NS coder, $\sigma_r^2 = \sigma_q^2$, but now

$$\sigma_e^2 = E_s + \sigma_{q_b}^2 E_f \tag{8}$$

and the variance of $q_b$ is

$$\sigma_{q_b}^2 = \sigma_e^2 \, 2^{-\beta R_{min}}. \tag{9}$$

From (1), (8) and (9), the SNR of the SALR coder is

$$\mathrm{SNR}_{SALR}(R) = \mathrm{SNR}_{NS}(R_{min}) \, 2^{\beta (R - R_{min})}. \tag{10}$$

Notice there is no loss with respect to the NS coder at $R_{min}$. Above this rate, the SNR (in dB) is an affine function of $R$ with slope $3\beta$.

4. ANALYSIS OF THE SBLR CODER

In a SBLR coder, two quantization noise sources must be taken into account (Figure 4). The first one ($q_m$) is placed in the transmitter and is the result of encoding and decoding the predicted error frames at $R_{max}$. The second one ($q$) is placed in the receiver and is the result of decoding the compressed PEF at $R$.

In this case, the reconstruction error $r$ is

$$r = q_m + \Delta q * h_d \tag{11}$$

where $\Delta q = q - q_m$, $h_d$ represents the end-to-end decoder transfer function, and $*$ is the convolution operator. We assume that $E\{q_m \Delta q\} = 0$ and that $\Delta q$ is white noise, which gives

$$\sigma_r^2 = \sigma_{q_m}^2 + \sigma_{\Delta q}^2 E_d \tag{12}$$
where $\sigma_{q_m}^2$ and $\sigma_{\Delta q}^2$ are the variances of $q_m$ and $\Delta q$ respectively, and $E_d$ is

$$E_d = E\left\{ \frac{1}{8\pi^3} \iiint_{D'} |1 - H(\Omega)|^{-2} \, d\Omega \right\} \tag{13}$$

where $E\{\cdot\}$ is the expectation operator, and $D' = \{\Omega : |\omega_x| < \pi, |\omega_y| < \pi, |\omega_t| < \pi\}$. As $\sigma_{\Delta q}^2 = \sigma_q^2 - \sigma_{q_m}^2$, Expression (12) transforms into

$$\sigma_r^2 = \sigma_{q_m}^2 + \left( \sigma_q^2 - \sigma_{q_m}^2 \right) E_d = \sigma_{q_m}^2 \left[ 1 + \left( 2^{\beta (R_{max} - R)} - 1 \right) E_d \right]. \tag{14}$$

Finally, from (14) and $\sigma_{q_m}^2 = E_s / (2^{\beta R_{max}} - E_f)$, we obtain

$$\mathrm{SNR}_{SBLR}(R) = \frac{\mathrm{SNR}_{NS}(R_{max})}{1 + \left( 2^{\beta (R_{max} - R)} - 1 \right) E_d}. \tag{15}$$

The SBLR coder has no loss with respect to the NS coder at $R_{max}$. Below this rate, prediction drift is introduced. Note that if $R$ is far below $R_{max}$ so that $E_d \, 2^{\beta (R_{max} - R)} \gg 1$, the SNR of the SBLR coder (in dB) is an affine function of $R$ with slope $3\beta$.

[Block diagram with signals $s$, $e$, $e'$, $\hat{s}$, $s'$, $s''$, noise sources $q_m$ and $q$, and predictor $H(\Omega)$ at both transmitter and receiver.]
Fig. 4. Block diagram to compute the SNR of the SBLR coder.

5. ANALYSIS OF SABLR CODER

In SABLR coders, according to the decoding rate $R$, we can distinguish two operating intervals:

• the SBLR interval ($R_{min} \le R \le R_l$), where prediction drift is introduced. In this interval, the SABLR coder has a higher SNR than the SBLR coder.

• the SALR interval ($R_l \le R \le R_{max}$), where there is a loss of performance with respect to the NS coder because the prediction is based on previous frames decoded at $R_l$ instead of $R$. In this interval, the SABLR coder has a higher SNR than the SALR coder.

From Sections 3 and 4, the SNR for the SABLR coder is:

$$\mathrm{SNR}_{SABLR}(R, R_l) = \begin{cases} \dfrac{\mathrm{SNR}_{NS}(R_l)}{1 + \left( 2^{\beta (R_l - R)} - 1 \right) E_d}, & R \le R_l \\[2ex] \mathrm{SNR}_{NS}(R_l) \, 2^{\beta (R - R_l)}, & R \ge R_l \end{cases} \tag{16}$$

Notice that the SABLR coder has no loss with respect to its NS counterpart at $R_l$.

6. EXPERIMENTAL RESULTS

In this section, we compare our theoretical analysis with data from encodings of real video sequences using the MCP-based SNR-scalable SAMCoW video coder. As SAMCoW uses embedded quantization to encode the PEF [10], it can operate in any of the four coding modes (NS, SALR, SBLR and SABLR).

To obtain specific numerical simulation results, some parameters have to be set. With respect to the video signals, we assume $s$ has an isotropic PSD

$$S_{ss}(\omega_x, \omega_y) = \frac{2\pi \sigma_s^2}{\omega_0^2} \left( 1 + \frac{\omega_x^2 + \omega_y^2}{\omega_0^2} \right)^{-3/2} \tag{17}$$

where $\sigma_s^2$ is the signal power and $\omega_0$ has been set to provide an adjacent step correlation coefficient equal to 0.93 [9]. It is assumed that $\Delta d$ follows a zero-mean isotropic Gaussian distribution with $\sigma_{\Delta d}^2 = 0.2\,T^2$, where $T$ is the spatial sampling period. With respect to the coder, parameter $\beta$ has been set to 3 and, although spatial filtering is not considered, we introduce a leaky factor equal to 0.95, and then $F(\Lambda) = 0.95$. The use of a leaky factor limits the effect of prediction drift in SBLR and SABLR coders. Practical coders usually introduce some implicit or explicit spatial filtering in the MCP loop, which can be considered as a frequency-dependent leaky factor. The rate interval chosen is $R_{min} = 0.066$ bits/pixel and $R_{max} = 0.33$ bits/pixel, which for CIF sequences at 30 frames/s is equivalent to $R_{min} = 200$ kbits/s and $R_{max} = 1000$ kbits/s.

Figure 5 shows the SNR(R) function of the NS, SALR, SBLR and SABLR coders for the set of parameters previously described. In the case of the SABLR coder, three curves, corresponding to $R_l$ = 0.131, 0.197 and 0.263 bits/pixel, have been plotted. These three rates correspond to 400, 600 and 800 kbits/s respectively if CIF video sequences at 30 frames/s are used. In the SABLR curves, the $R_l$ value is the rate at which the SABLR and the NS curves intersect. The portions of the three SABLR curves where $R > R_l$ are equivalent to the curves of a SALR coder using $R_{min} = R_l$. Equivalently, the portions of the SABLR curves where $R < R_l$ can be considered SBLR curves with $R_{max} = R_l$.

In the SALR intervals of the curves in Figure 5, notice that the larger $R_{min}$ is, the lower the loss is with respect to the NS coder, but the interval of rates where decoding is possible is also reduced. In fact, if $R_{min}$ is large enough so that $2^{\beta R_{min}} \gg E_f$, the loss is insignificant. With respect to the SBLR intervals of the curves, the opposite effect is observed: the loss decreases with a decrease in $R_{max}$ (again, at the expense of reducing the interval of decoding rates). SABLR coders allow a balancing of both effects and, by setting $R_l$ properly, the mean SNR (MSNR) can be improved with respect to the SALR and the SBLR coders. For the encoding parameters of Figure 5, a maximum MSNR of 10.15 dB is achieved at $R_l = 0.162$ bits/pixel (or, equivalently, at 550.3 kbits/s with CIF sequences at 30 frames/s). With respect to the SALR and the SBLR coders, the MSNRs are 8.86 dB and 8.33 dB respectively.

To test the efficiency of the strategies in practice, we have encoded several CIF test sequences (352 × 288 pixels/frame) at 30 frames/s with SAMCoW. The quality of each encoding is measured by computing the mean PSNR (in dB) of the luminance component of 100 decoded frames. As our theoretical analysis only accounts for the steady-state performance of coders, in every encoding an initial portion of each decoded sequence containing frames with transient response was not considered. Motion estimation is performed at integer-pixel accuracy with no loop filter and, as in the theory, a leaky factor $c = 0.95$ is introduced. Figure 6 shows the SNR(R) function obtained by encoding Foreman with SAMCoW running in the four strategies. By comparing Figures 5 and 6, we
can study the differences between theory and practice. No attempt to use similar parameter values ($\beta$, $\omega_0$) in theory and practice has been made, and therefore our comparison is qualitative.

[Plot. Curves: NS, SALR, SBLR, SABLR. Horizontal axis: Decoding Rate [bits/pixel], 0.066 to 0.33. Vertical axis: SNR [dB].]
Fig. 5. Numerical simulation of the theoretical SNR(R) of the four video strategies using the assumptions outlined in Section 6.

[Plot. Curves: NS, SALR, SBLR, SABLR. Horizontal axis: Decoding Rate [kbits/s], 200 to 1000. Vertical axis: PSNR [dB].]
Fig. 6. PSNR(R) of the four video strategies using SAMCoW.

With respect to the SALR intervals of the scalable strategies, while in theory all the SALR curves have the same slope, in practice the slope decreases when $R_l$ increases. The reason is that, in practice, $\beta$ is not constant but depends on $R_l$: starting at $R_l = 0$, $\beta$ decreases rapidly as $R_l$ increases, but tends to a constant value at high $R_l$. The consequence is that, in practice, the gain obtained by increasing the value of $R_{min}$ is lower than the one obtained in theory.

With respect to the SBLR intervals of the scalable strategies, although theory and practice tend to be similar at high decoding rates, there is a great divergence at low decoding rates, where the loss in practice is higher than the theoretical one. The reason for this divergence is that, at low rates, some of our hypotheses do not hold ($\beta$ changes considerably with $R$, and $\Delta q$ and $q_m$ are correlated). We have checked that when rate intervals with higher $R_{min}$ values are used, theory and practice are much closer. Differences between theory and practice in both the SALR and SBLR intervals have two main consequences for the SABLR coder. First, $R_l$ cannot be increased much above $R_{min}$ because the improvement in the SALR interval could not compensate for the loss introduced in the SBLR interval. Second, in practice, gains with respect to the SALR coder are lower than in theory. In fact, the optimum $R_l$ value is 300 kbits/s, which provides a mean PSNR of 30.72 dB, compared to the 30.41 dB and 28.44 dB of the SALR and SBLR coders respectively.

7. CONCLUSIONS AND FUTURE WORK

In this paper, we have theoretically analyzed the performance of three sorts of MCP-based SNR-scalable video coders and have compared them to their non-scalable counterpart. Results show that the main trends in efficiency described by the theory match practical results obtained from the encoding of real video sequences. Consequently, our analysis is useful to understand the main features of each scalable strategy and what factors influence their efficiency.

Although the present work only takes into account the steady-state response of SALR and SABLR coders, we are currently extending our analysis by also considering their transient response. This will allow us to analyze the efficiency of these strategies in coders using periodic intra-frames. We are also studying the optimum values of parameters $c$ and $R_l$ when different degrees of motion estimation accuracy exist.

8. REFERENCES

[1] B. G. Haskell, A. Puri, and A. N. Netravali, Digital Video: An introduction to MPEG-2, Chapman and Hall, 1997.

[2] W. Li, “Overview of fine granularity scalability in MPEG-4 video standard,” IEEE Trans. on CSVT, vol. CSVT-11, pp. 301–317, 2001.

[3] C. Buchner, T. Stockhammer, D. Marp, G. Blatterman, and G. Heising, “Efficient fine granular scalable video coding,” in Proceedings of the ICIP, Thessaloniki, Greece, October 7–10 2001, pp. 997–1000.

[4] J. Prades-Nebot, G. Cook, and E. J. Delp, “Rate control for FFGS video coders,” in Proceedings of the SPIE VCIP, San Jose, California, 2002, vol. 4310, pp. 828–839.

[5] M. van der Schaar and H. Radha, “Adaptive motion-compensation fine-granular-scalability (AMC-FGS) for wireless video,” IEEE Transactions on CSVT, vol. 12, no. 6, pp. 360–370, June 2002.

[6] J. Prades-Nebot and G. W. Cook, “Analysis of the performance of predictive SNR scalable coders,” in Proceedings of the ICIP, Barcelona, Spain, Sept. 2003, vol. 3, pp. 861–864.

[7] P.-Y. Cheng, J. Li, and C.-C. J. Kuo, “Rate control for an embedded wavelet video coder,” IEEE Trans. on CSVT, vol. 7, pp. 696–701, 1997.

[8] B. Girod, “The efficiency of motion-compensating prediction for hybrid coding of video sequences,” IEEE Journal on SAC, vol. SAC-5, no. 7, pp. 1140–1154, 1987.

[9] B. Girod, “Motion-compensating prediction with fractional-pel accuracy,” IEEE Trans. on Communications, vol. 41, pp. 604–611, 1993.

[10] K. Shen and E. J. Delp, “Wavelet based rate scalable video compression,” IEEE Trans. on CSVT, vol. 9, pp. 109–122, 1999.
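
As a numerical companion to the closed-form expressions (7), (10), (15) and (16), the following Python sketch evaluates the four SNR(R) curves. It is only an illustration under stated assumptions, not the simulation used for Figure 5. The values of β, the leaky factor c, R_min and R_max are those quoted in Section 6, and E_f = c² follows from (6) with F(Λ) = 0.95. Two quantities are assumptions of this sketch: E_d is taken as 1/(1 − c²), which is what averaging |1 − c e^{−jθ}|^{−2} over a full period of θ gives when |F| is the constant c, and E_s = 0.1 (with σ_s² = 1) is a hypothetical placeholder that only shifts all curves by the same number of dB. The function names are likewise hypothetical.

```python
import math

def snr_ns(rate, beta, es, ef, sigma_s2=1.0):
    """Eq. (7): linear-scale SNR of the non-scalable coder at `rate` bits/pixel."""
    return (sigma_s2 / es) * (2.0 ** (beta * rate) - ef)

def snr_sablr(rate, loop_rate, beta, es, ef, ed, sigma_s2=1.0):
    """Eq. (16): SBLR branch at or below the loop rate, SALR branch above it."""
    at_loop = snr_ns(loop_rate, beta, es, ef, sigma_s2)
    if rate <= loop_rate:
        return at_loop / (1.0 + (2.0 ** (beta * (loop_rate - rate)) - 1.0) * ed)
    return at_loop * 2.0 ** (beta * (rate - loop_rate))

def to_db(x):
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)

def bpp_to_kbps(bpp, width=352, height=288, fps=30):
    """Convert bits/pixel to kbits/s for a frame size and frame rate (CIF at 30 frames/s by default)."""
    return bpp * width * height * fps / 1000.0

# Values quoted in Section 6.
BETA, C, R_MIN, R_MAX = 3.0, 0.95, 0.066, 0.33
EF = C ** 2                # Eq. (6) with constant F = c
ED = 1.0 / (1.0 - C ** 2)  # assumption: closed form of Eq. (13) for constant |F| = c
ES = 0.1                   # hypothetical placeholder for E_s, with sigma_s^2 = 1

def curves(rate, loop_rate):
    """SNR in dB of the NS, SALR, SBLR and SABLR coders at one decoding rate."""
    return {
        "NS": to_db(snr_ns(rate, BETA, ES, EF)),
        "SALR": to_db(snr_sablr(rate, R_MIN, BETA, ES, EF, ED)),  # Eq. (10): loop at R_min
        "SBLR": to_db(snr_sablr(rate, R_MAX, BETA, ES, EF, ED)),  # Eq. (15): loop at R_max
        "SABLR": to_db(snr_sablr(rate, loop_rate, BETA, ES, EF, ED)),
    }

if __name__ == "__main__":
    for rate in (0.066, 0.131, 0.197, 0.263, 0.33):
        row = curves(rate, loop_rate=0.197)
        cells = "  ".join(f"{name}={value:5.2f} dB" for name, value in row.items())
        print(f"{bpp_to_kbps(rate):7.1f} kbits/s  {cells}")
```

Because (10) and (15) are the two branches of (16) evaluated with the loop rate at R_min and R_max respectively, a single function covers all three scalable strategies. With these assumptions the scalable curves touch the NS curve only at their loop rate and lie below it elsewhere, which matches the qualitative behaviour described in Sections 3 to 5.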
