**InterDigital Communications, LLC, 9710 Scranton Rd., Suite 250, San Diego, CA 92121, USA**

ABSTRACT

In this paper, we analyze the effect of time-varying channels on the video codec buffer, especially for low-delay applications. We derive the sufficient conditions under which an encoder can design a bitstream for any time-varying channel without decoder buffer overflow or underflow. We then apply those conditions to design a bandwidth-adaptive rate control in x264 and test it under an LTE simulator. Our test results show significant improvement in delay and delay jitter over traditional leaky bucket models.

Index Terms: Leaky bucket model, HRD, VBV, Rate control, LTE, H.264/MPEG-4 AVC

1. INTRODUCTION

Thanks to advances in wireless networks and improvements in the processing and graphics capabilities of mobile devices, mobile video telephony is now becoming part of our daily lives [1]. Yet some technical challenges in the design of video phone applications still exist. On one hand, such applications require low-delay, jitter-free delivery of video; on the other hand, they are constrained by the time-varying behavior of mobile networks.

In most video coding standards, a hypothetical reference decoder (HRD) [2] or a video buffering verifier (VBV) [3] is specified to help the design of conforming encoders and bitstreams. One major role of the HRD/VBV is to ensure that there is no underflow or overflow in the decoder coded picture buffer (CPB), where underflow causes delay jitter and overflow causes packet loss. In video encoders, leaky bucket models are usually adopted to control the encoded bit rate so that it conforms to the decoder HRD or VBV. Ref. [4] proposes a generalized HRD in H.264/AVC, where several leaky bucket models are specified for a given bitstream instead of the single option in previous standards. The best leaky bucket model can be selected by the communication system according to different network connections and delay requirements.

New wireless network standards, e.g. 4G LTE, provide much faster transmission rates than previous wireless standards.
Due to multi-path fading and multi-user characteristics, the transmission bit rate fluctuates significantly, from zero to dozens of megabits per second, every transmission time interval (TTI). Rapid changes in channel behavior make the problem of low-delay rate control for video encoding very challenging. However, the generalized HRD assumes that the channel has a constant bit rate (CBR) or a piece-wise variable bit rate (VBR), i.e. that the channel capacity changes much more slowly than the frame rate. To the best of our knowledge, no prior literature rigorously proves conditions that guide the encoder in designing a bitstream that conforms to the HRD/VBV after being transmitted over a fast time-varying channel.

In this paper, we analyze the effect of time-varying channels on the buffer, especially for low-delay applications. We derive, for the first time, the sufficient conditions under which an encoder can design a bitstream for any time-varying channel without buffer overflow or underflow. We then apply those conditions to design a bandwidth-adaptive rate control in x264 and test it under an LTE simulator. Our method requires an estimate of the channel bit rate. Both perfect and imperfect channel estimation are simulated. The test results provide a reference for how much improvement in delay/delay jitter and video quality over traditional CBR/VBR leaky bucket models may be gained if channel estimation is available to the encoder.

The rest of the paper is organized as follows. Section 2 offers background information and illustrates why traditional leaky bucket models fail if the channel bit rate changes rapidly. Section 3 proves the conditions for avoiding overflow and underflow in the buffer and explains the bandwidth-adaptive rate control method. Section 4 presents experimental results. Conclusions are drawn in Section 5.

2. PROBLEM STATEMENT

Fig. 1 shows the relationship between a leaky bucket model and the encoding schedule in a CBR case. In the H.264/AVC HRD, the decoding process is assumed to be instantaneous, i.e.
the decoding schedule is a step function (or staircase function). Correspondingly, in Fig. 1 we also assume that the encoding process is instantaneous. Note that our method in this paper also applies to non-instantaneous encoding and decoding processes. In Fig. 1, the encoder buffer fullness is defined as $F_e(t) = S_e(t) - S_t(t)$, which is constrained by $B_e$, and the decoder buffer fullness is defined as $F_d(t) = S_r(t) - S_d(t)$, which is constrained by $B_d$. The leaky bucket model actually specifies a combination of $\{\max[R(t)], B_e, F_e(t_e^0)\}$ for the encoder, where $R(t)$ is the channel capacity over time. It has been proved that the

complement of the encoder buffer fullness is equal to the decoder buffer fullness in the CBR case [4]. In Ref. [4], the authors also proved that the complement of the encoder buffer fullness is a tight (achievable) upper bound on the decoder buffer fullness in the piece-wise VBR case:

$$F_d(t) \le B_e - F_e(t). \tag{1}$$

Therefore, a bitstream designed to avoid overflow and underflow in the encoder under the corresponding leaky bucket model will conform to the HRD in the decoder. As a result, in video coding standards, the HRD or VBV only specifies $\{\max[R(t)], B_d, F_d(t_d^0)\}$ for the decoder, where $F_d(t_d^0)$ is equivalent to $\delta_t^0 + \delta_d^0$.

Fig. 1. Leaky bucket model in the CBR case. $S_e(t)$, $S_t(t)$, $S_r(t)$, $S_d(t)$ show cumulative numbers of bits over time (or schedules) at the encoder, transmitter, receiver, and decoder, respectively. $B_e$ and $B_d$ denote the buffer sizes of the encoder and decoder, respectively. $L^k$ is the length of the $k$-th encoded frame, which is pushed into the encoder buffer at $t_e^k$ and pulled out of the decoder buffer at $t_d^k$. Quantities $\delta_e^k$, $\delta_t^k$, $\delta_p$, $\delta_d^k$ are delays due to encoder buffering, transmission, propagation, and decoder buffering, respectively. $\Delta^k$ is the end-to-end delay.

Fig. 2. Behavior of the leaky bucket model under a time-varying channel. $S_{t,CBR}(t)$ corresponds to the constant rate channel model assumed by the encoder, while $S_{t,VBR}(t)$ shows the actual transmission schedule. Circled regions show encoder and decoder underflows caused by the changing behavior of the channel.

However, as shown in Fig. 2, (1) fails and both underflow and overflow happen under time-varying channels, even though the encoding schedule is exactly the same as in Fig. 1. Fig. 2 shows an example of decoder underflow caused by the transmission of frame 1 (the index begins from 0). This frame was regulated by the traditional leaky bucket model in order to meet the HRD, but it can no longer be delivered at the scheduled arrival time $\hat{t}_d^1$ due to a very low instantaneous channel bit rate. So the decoder has to wait until point $t_d^1$ in order to start decoding this frame. This introduces decoder jitter, which, if it happens frequently, may be problematic for low-delay applications. Fig. 2 also shows possible decoder overflow and encoder underflow situations that can occur under time-varying channels. These examples explain the motivation for our work, which is to derive the sufficient conditions for avoiding buffer underflow and overflow under rapidly varying channels and, based on that, to design a bandwidth-adaptive rate control mechanism that can ensure HRD-compliant bitstreams.

3. ANALYSIS AND ALGORITHM

3.1. Basic constraints

From the previous discussion, it follows that in order to avoid overflow and underflow in the encoder buffer, the encoding schedule $S_e(t)$ shall satisfy

$$S_t(t) < S_e(t) < S_t(t) + B_e, \tag{2}$$

where $S_t(t)$ depends on the channel capacity and the initial encoder delay $\delta_e^0$, as shown in Fig. 1. Similarly, in order to avoid overflow and underflow in the decoder buffer, the decoding schedule shall satisfy

$$S_r(t) - B_d < S_d(t) < S_r(t), \tag{3}$$

where $S_d(t)$ is a function of the initial decoder delay $\delta_d^0$. Therefore, $\delta_e^0$ and $\delta_d^0$ have a critical impact on the HRD performance. We will see that in the experimental section. If data packets have a constant propagation delay $\delta_p$, the reception and transmission schedules become connected as follows:

$$S_r(t) = S_t(t - \delta_p). \tag{4}$$

We next produce bounds for encoding and decoding schedules for time-varying channels with a continuous function $R(t)$, which is the most general case and includes CBR and piece-wise VBR.

3.2. Constraints under time-varying channels

By $t_e^{k-}$ and $t_e^{k+}$ let us denote the time points right before and right after adding the $k$-th frame bits into the encoder buffer. It is easy to prove that a sufficient and necessary condition of (2) is

$$S_e(t_e^{k-}) > S_t(t_e^{k-}) \quad \text{and} \quad S_e(t_e^{k+}) < S_t(t_e^{k+}) + B_e, \tag{5}$$

where $S_e(t_e^{k-})$ is the cumulative number of encoded bits at time $t_e^{k-}$ and $S_t(t_e^{k-})$ is the cumulative number of transmitted

bits at time $t_e^{k-}$. Assume the time axis begins from encoding the first frame, that is, $t_e^0 = 0$. From Fig. 1 and Fig. 2, we can observe that

$$S_e(t_e^{k-}) = S_e(t_e^{(k-1)+}) = \sum_{i=0}^{k-1} L^i, \tag{6}$$

and

$$S_t(t_e^{k-}) = S_t(t_e^{k}) = S_t(t_e^{k+}) = \int_{\delta_e^0}^{t_e^k} R(t)\,dt. \tag{7}$$

Therefore, applying the first inequality of (5) at frame $k+1$ and the second at frame $k$, (5) becomes

$$S_t(t_e^{k+1}) < \sum_{i=0}^{k} L^i < S_t(t_e^{k}) + B_e, \tag{8}$$

that is,

$$\int_{\delta_e^0}^{t_e^{k+1}} R(t)\,dt < \sum_{i=0}^{k} L^i < \int_{\delta_e^0}^{t_e^{k}} R(t)\,dt + B_e, \tag{9}$$

for all $k \ge 0$. Following the same arguments, it is easy to prove that a sufficient and necessary condition of (3) is

$$S_r(t_d^{k+1}) - B_d < \sum_{i=0}^{k} L^i < S_r(t_d^{k}). \tag{10}$$

In other words,

$$\int_{t_d^0 - \delta_t^0 - \delta_d^0}^{t_d^{k+1}} R(t - \delta_p)\,dt - B_d < \sum_{i=0}^{k} L^i < \int_{t_d^0 - \delta_t^0 - \delta_d^0}^{t_d^{k}} R(t - \delta_p)\,dt. \tag{11}$$

In some cases, (9) and (11) may not be satisfied simultaneously due to the time-varying $R(t)$. However, if the transmitter has the capability of channel estimation for the encoder, the estimated channel capacity $\hat{R}(t)$ can help design $L^k$ in the encoder to meet (9) and (11). In the following, we will 1) derive the sufficient conditions for $L^k$ to avoid overflow and underflow in the encoder and decoder, and 2) determine the tightest upper bound for $B_e$ and $B_d$.

3.3. Sufficient conditions for $L^k$ and tightest upper bound for $B_e$ and $B_d$

From (4), we can rewrite the left side of (10) as

$$S_r(t_d^{k+1}) - B_d = S_t(t_e^{k+1} + \Delta^{k+1} - \delta_p) - B_d = \int_{\delta_e^0}^{t_e^{k+1}} R(t)\,dt + \int_{t_e^{k+1}}^{t_e^{k+1} + \Delta^{k+1} - \delta_p} R(t)\,dt - B_d. \tag{12}$$

By comparing this expression with the left side of (9), we observe that the first inequality in (11) is satisfied, i.e. decoder overflow is avoided, if both the first inequality in (9) and the following condition are satisfied:

$$\int_{t_e^{k+1}}^{t_e^{k+1} + \Delta^{k+1} - \delta_p} R(t)\,dt < B_d. \tag{13}$$

From (13), it is easy to prove that the tightest upper bound of $B_d$ shall be $(\Delta_{\max} - \delta_p) \cdot R_{\max}$, where $\Delta_{\max} = \max_k \Delta^k$ and $R_{\max} = \max_t R(t)$.

The first inequality in (9) can be rewritten as

$$L^k > \int_{\delta_e^0}^{t_e^{k}} R(t)\,dt - \sum_{i=0}^{k-1} L^i + \int_{t_e^{k}}^{t_e^{k+1}} \hat{R}(t)\,dt. \tag{14}$$

From the definition of buffer fullness, we see that $F_e(t_e^{k-}) = \sum_{i=0}^{k-1} L^i - \int_{\delta_e^0}^{t_e^k} R(t)\,dt$, so (14) is simplified as

$$L^k > \int_{t_e^{k}}^{t_e^{k+1}} \hat{R}(t)\,dt - F_e(t_e^{k-}). \tag{15}$$

Therefore, by using $B_d = (\Delta_{\max} - \delta_p) \cdot R_{\max}$ and (15) for designing the encoding schedule, we can ensure that the resulting bitstreams will not cause overflow in the decoder buffer. Note that (15) also ensures no underflow in the encoder buffer.

Consider now the right side of (10). We note that

$$S_r(t_d^{k}) = S_t(t_e^{k} + \Delta^{k} - \delta_p) = \int_{\delta_e^0}^{t_e^{k}} R(t)\,dt + \int_{t_e^{k}}^{t_e^{k} + \Delta^{k} - \delta_p} R(t)\,dt. \tag{16}$$

By comparing this expression with the right side of (9), we know that $B_e$ should satisfy

$$B_e \ge \int_{t_e^{k}}^{t_e^{k} + \Delta^{k} - \delta_p} R(t)\,dt, \tag{17}$$

which can be ensured by setting $B_e = B_d$. By (16) and channel estimation, the second inequality in (10) can be rewritten as

$$L^k < \int_{t_e^{k}}^{t_e^{k} + \Delta^{k} - \delta_p} \hat{R}(t)\,dt - F_e(t_e^{k-}). \tag{18}$$

Condition (18) defines a sufficient condition for avoiding decoder buffer underflow. Conditions (18) and (17) further ensure that the encoder has no overflow.

Note that (15) and (18) are general forms for any channel. The thresholds for the CBR and piece-wise VBR cases can be easily obtained from them. Different from CBR and piece-wise VBR, we also need to balance $L^k$ and $\Delta^k$.

3.4. Discussion of delay-related constraints

From Fig. 1 and Fig. 2, we know that

$$\Delta^k = \delta_e^k + \delta_p + \delta_t^k + \delta_d^k. \tag{19}$$

In H.264/AVC, $\delta_e^0$ is called the initial cpb removal delay and $\delta_t^0 + \delta_d^0$ is called the initial cpb removal delay offset. Both of them are defined in the buffering period SEI (supplemental enhancement information) message [2].
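The frame-size bounds (15) and (18) from Section 3.3 can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the implementation used in the paper: the function names (`integrate_rate`, `frame_size_bounds`, `midpoint_target`), the midpoint-rule integration, and the representation of the estimated channel rate as a Python callable are all hypothetical choices made for this example. The last helper picks the midpoint of the two bounds, which is the simple rule that Section 3.5 adopts.

```python
from typing import Callable, Tuple

# Hypothetical type: estimated channel rate R_hat(t) in bits/s, t in seconds.
RateFn = Callable[[float], float]


def integrate_rate(rate: RateFn, t0: float, t1: float, steps: int = 1000) -> float:
    """Approximate the integral of rate(t) over [t0, t1] in bits (midpoint rule)."""
    if t1 <= t0:
        return 0.0
    dt = (t1 - t0) / steps
    return sum(rate(t0 + (i + 0.5) * dt) for i in range(steps)) * dt


def frame_size_bounds(
    rate_est: RateFn,
    t_enc: float,         # t_e^k: time the k-th frame enters the encoder buffer
    next_t_enc: float,    # t_e^{k+1}
    delay: float,         # Delta^k: target end-to-end delay
    prop_delay: float,    # delta_p: propagation delay
    enc_fullness: float,  # F_e(t_e^{k-}): encoder buffer fullness in bits
) -> Tuple[float, float]:
    """Return (lower, upper) bounds on L^k in bits, following (15) and (18)."""
    lower = integrate_rate(rate_est, t_enc, next_t_enc) - enc_fullness
    upper = integrate_rate(rate_est, t_enc, t_enc + delay - prop_delay) - enc_fullness
    return lower, upper


def midpoint_target(lower: float, upper: float) -> float:
    """Pick the average of the two bounds as the target frame size."""
    if upper <= lower:
        raise ValueError("no feasible frame size: the delay bound is too tight")
    return 0.5 * (lower + upper)


if __name__ == "__main__":
    # Illustrative numbers only: constant 400 kbit/s estimate, 40 frames/s,
    # 90 ms end-to-end delay, zero propagation delay, empty encoder buffer.
    lo, hi = frame_size_bounds(lambda t: 400_000.0, 0.0, 1 / 40, 0.09, 0.0, 0.0)
    print(f"lower={lo:.1f} bits, upper={hi:.1f} bits, "
          f"target={midpoint_target(lo, hi):.1f} bits")
```

The bounds are strict inequalities, so any target strictly between `lower` and `upper` satisfies both conditions; the midpoint leaves equal margin on each side when the channel estimate is imperfect.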

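As a worked special case, the CBR thresholds mentioned at the end of Section 3.3 follow by setting $R(t) = \hat{R}(t) = R$ in (13), (15), and (18). The block below is a sketch under two added assumptions that are not stated in the derivation itself: a constant frame interval $t_e^{k+1} - t_e^k = 1/f_s$, and illustrative values $R = 400$ kbit/s, $f_s = 40$ frames/s, $\Delta^k = \Delta_{\max} = 90$ ms, $\delta_p = 0$, and $F_e(t_e^{k-}) = 0$.

```latex
% CBR special case of (15), (18), and the buffer-size bound from (13)
\begin{align*}
  L^k &> \frac{R}{f_s} - F_e(t_e^{k-}),
  &
  L^k &< R\,(\Delta^k - \delta_p) - F_e(t_e^{k-}),
  &
  B_d &= (\Delta_{\max} - \delta_p)\,R .
\end{align*}
% Illustrative numbers (assumed, see lead-in):
\begin{align*}
  \frac{R}{f_s} &= \frac{400\,000}{40} = 10\,000 \text{ bits}, \\
  R\,(\Delta^k - \delta_p) &= 400\,000 \times 0.09 = 36\,000 \text{ bits}, \\
  B_d = B_e &= 36\,000 \text{ bits}.
\end{align*}
```

With these numbers the feasible interval for $L^k$ is non-empty because $\Delta^k - \delta_p = 90$ ms exceeds the frame period of 25 ms.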
From (15) and (18), it follows that, to avoid overflow and underflow, the end-to-end delay $\Delta^k$ must satisfy

$$\Delta^k - \delta_p \ge t_e^{k+1} - t_e^{k}. \tag{20}$$

By (19) and (20), $\delta_e^k + \delta_t^k + \delta_d^k \ge t_e^{k+1} - t_e^k$; in other words, the sum of the initial cpb removal delay and the initial cpb removal delay offset in the SEI should be at least as large as the frame period.

Notice that in (19), the decoder buffering delay should be non-negative, i.e. $\delta_d^k \ge 0$, and the transmission delay $\delta_t^k$ depends on the $k$-th frame bits $L^k$. Therefore, the $k$-th frame bits $L^k$ should be allocated such that

$$\delta_t^k \le \Delta^k - \delta_p - \delta_e^k. \tag{21}$$

The H.264/AVC HRD also specifies two delay modes: strict delay (delay jitter is not allowable) and low delay (delay jitter is allowable). In the case of a strict delay bound, i.e. for strict delay mode, $\Delta^k = \Delta$, where $\Delta$ is a given nominal delay, so $\delta_t^k$ should be no larger than $\Delta - \delta_p - \delta_e^k$. In case delay jitter is allowable, i.e. for low delay mode, the bound (21) is applied with the per-frame delay $\Delta^k$. Note that for more general channel models where the propagation delay is time-varying, $\Delta^k$ should include the extra propagation delay of the $k$-th frame, $\delta_p^k - \delta_p^0$. That is, $\delta_d^k$ can be extended to include the difference between $\delta_p^k$ and $\delta_p^0$.

Another delay-related constraint is to balance $L^k$ and $\Delta^k$, which are directly related to the delay/delay jitter and the video quality/quality variation. An emphasis on choosing the best rate points $L^k$ may cause large delay jitter. On the other hand, an emphasis on $\Delta^k$ may cause large variation in the bits allocated to different frames $L^k$ and hence inconsistent video quality. For these cases, it might make more sense to strike a certain balance in the choice of constraints for $\Delta^k$ and $L^k$.

3.5. Target frame bit estimation

Our final task is to define a rate assignment algorithm that meets conditions (15) and (18). Theoretically, there are infinitely many schedules $S_e(t)$ that meet (15) and (18). A simple but robust method is to design $L^k$ to meet the average of the upper bound and the lower bound. Given the target end-to-end delay $\Delta^k$, propagation delay $\delta_p$, and frame rate $f_s$, before encoding the $k$-th frame, the encoder should design $L^k$ based on its buffer fullness $F_e(t_e^{k-})$ and the estimated channel capacity $\hat{R}(t)$ as follows:

$$L^k = \frac{1}{2}\left[\int_{t_e^{k}}^{t_e^{k} + \frac{1}{f_s}} \hat{R}(t)\,dt + \int_{t_e^{k}}^{t_e^{k} + \Delta^{k} - \delta_p} \hat{R}(t)\,dt\right] - F_e(t_e^{k-}). \tag{22}$$

This method allows maximum margins for both the high threshold and the low threshold and is therefore robust to imperfect channel estimation. The major limitation of this method is the frame-by-frame variation of quality that results from the time-varying channel and the strictly low end-to-end delay constraint. Another method is to design $L^k$ to minimize the quality variation given (15) and (18). However, it requires a very accurate rate-distortion model to estimate $L^k$ from the targeted quality, and some rate-distortion optimization is required [5, 6], which is beyond the scope of this paper. We will use the first method in this paper to show the performance of our bandwidth-adaptive rate control algorithm.

3.6. Target residual bit estimation

For a hybrid video coder with a block-based coding scheme, the encoded bits $L^k$ consist of residual bits $L_{resi}^k$, motion information bits $L_{mv}^k$, and other header information bits $L_{header}^k$, i.e.

$$L^k = H_{resi}^k \cdot N_{resolution} + L_{mv}^k + L_{header}^k, \tag{23}$$

where $H_{resi}^k$ is the entropy of the residual (bits per pixel) and $N_{resolution}$ is the normalized video resolution considering color components. Compared to $H_{resi}^k$, $L_{mv}^k$ and $L_{header}^k$ are less affected by the quantization step size $Q^k$. Therefore, $L_{mv}^k$ and $L_{header}^k$ can first be estimated from the statistics of the previous frames, and then $H_{resi}^k$ is used to determine $Q^k$ by any bit rate model.

4. EXPERIMENTAL RESULTS

4.1. Experimental Setup

To test our rate adaptation logic we have employed an H.264/AVC codec coupled with an LTE channel simulator implemented according to the 3GPP TS 36.814 model. We have tested two encoders: the x264 encoder with its original rate control [7], and a modified version of the x264 encoder incorporating our proposed algorithm. Decoding of all bitstreams was done using the standard JM decoder [8]. All tests were performed using standard CIF and 720p-resolution video sequences [9]. The first frame was encoded as an IDR frame and the following frames were encoded as P-frames with the number of reference frames equal to 3. Lookahead was disabled and the maximum size for all NAL packets was set to 1400 bytes. Both VBV and HRD options were enabled.

For x264 VBR rate control, we set the initial delay and provided the mean and maximum channel rate, defined for each sequence and bit rate. For our rate control algorithm we used information from the simulator to obtain channel rate estimates with different channel feedback delays. We have also set our rate control algorithm to meet several end-to-end delay bounds. The results are compared with both CBR and VBR in the original x264 encoder.

4.2. Results

An illustrative subset of our results is shown in Fig. 3. In this case, the delay bound (excluding propagation delay $\delta_p$) is set to be very low, i.e. 90 ms. The LTE channel rate changes significantly every TTI, i.e. every 1 ms, as shown in the first column; the second column shows fluctuations of the per-frame video rate.

Fig. 3. Comparison of the behavior of x264 VBR and our proposed rate control schemes under time-varying channels. Results are produced using the following video sequences: (a) Foreman, CIF, (b) Mobile, CIF, and (c) Parkrun, 720p. For each sequence, the four panels plot, from left to right: channel rate (kbits) versus TTI index (1 ms per TTI); video rate (kbits) versus frame index (40 frames per second) for the proposed method and x264; PSNR (dB) versus frame index for the proposed method and x264, before and after alignment; and delay (ms) versus frame index for the proposed method and x264.

Peak signal-to-noise ratio (PSNR) is shown in the third column, and the last column shows the end-to-end delay. It can be observed that the end-to-end delay with native x264 rate control varies significantly. For example, for the first sequence in Fig. 3 the delay grows up to 845 ms. On the other hand, our proposed algorithm was able to maintain the end-to-end delay close to the 90 ms target.

PSNR is shown both before and after aligning the decoded frames. Using the reconstructed frames in the encoder for calculation, x264 shows smaller PSNR variation than ours. This is because x264 VBR may allocate more bits to scene change frames to meet the HRD with the maximum channel capacity. However, because the instantaneous channel capacity is time-varying and may be much smaller than the maximum channel capacity, those scene change frames will increase the delay/delay jitter. After aligning the decoded frames, that is, at the rendering moment, x264 produces much worse PSNR and larger PSNR variation than our rate control algorithm. This is because of the significant delay and delay jitter in the bitstreams produced by x264: the display has to use previous frames for rendering if frames are not received within the delay bound. In our algorithm, although the reconstructed frames in the encoder have a slightly larger PSNR variation, the resulting PSNR in the decoder is close to that in the encoder, since almost all frames can be received, decoded, and rendered before the delay bound. Note that the PSNR variation in the encoder, which does not consider the channel variation, is the best that can be achieved in the decoder when the channel variation is considered. In other words, the best can be achieved when the delay bound is infinite.

However, we believe that neither way of comparing PSNR is good for low-delay applications. In fact, delay/delay jitter should be a better measure of performance in this case. In our future work, we will analyze how to balance delay and quality by using metrics that integrate both of them.

Due to the space limit, only part of the data set is shown in Tables 1 and 2. Delay is calculated by $\Delta^k$ with $\delta_p = 0$ in (22), and delay jitter is the variance of delay. In Tables 1 and 2, delay-rate-PSNR results, instead of the traditional BD-PSNR/rate, are shown since the comparison is between two three-dimensional surfaces instead of two-dimensional curves. Under the VBR case, bandwidth-adaptive rate control saves up to 84.1% of delay and 94.4% of delay jitter compared with x264 rate control for the foreman-cif sequence. Under the CBR case, the former saves up to 70.5% of delay and 87.5% of delay jitter for the same sequence. However, the gain reduces when the end-to-end delay bound increases. If the end-to-end delay bound is at the level of seconds (as in streaming video), the adaptive method hardly provides any performance gain.

Tables 1 and 2. Performance comparisons with x264 CBR and with x264 VBR (one table each), HRD/VBV enabled. Rows: foreman-cif and mobile-cif at 400 kbps, parkrun-720p at 2000 kbps, and mobcal-720p at 1000 kbps, each with initial delays of 90 ms and 180 ms. Columns: PSNR (dB), delay (ms), and jitter (ms) for x264 and for the proposed method, and the gain in delay and in jitter.

We also compared the performance under imperfect channel estimation, where the channel feedback has a certain delay, ranging from milliseconds to hundreds of milliseconds. Due to the space limit, we do not show such results here. Instead, we give our observations below. Even under imperfect channel estimation, our rate control still shows remarkable gain over both x264 VBR and CBR rate control. Imperfect channel estimation has more impact on the very low end-to-end delay case, e.g. 90 ms, than on other end-to-end delay cases, e.g. 180/270/360 ms. This is because imperfect channel estimation causes deviation between the estimated and true values of the encoder buffer delay and the transmission delay. Since a higher target end-to-end delay has a higher tolerance to these deviations, there is little impact.

5. CONCLUSIONS

We derived the sufficient conditions under which an encoder can produce a bitstream for any time-varying channel without decoder buffer overflow or underflow. We then applied those conditions to design a bandwidth-adaptive rate control with the low delay constraint necessary for real-time applications. The algorithm facilitates the design of an encoding schedule that produces bitstreams conforming to the HRD model even when transmitted over a time-varying channel. The algorithm was implemented within an H.264/AVC video encoder and tested using an LTE simulator. Test results suggest that it achieves very good performance and tight control over end-to-end delay. Our encoding schedule and rate control are applicable to future video coding standards, such as High Efficiency Video Coding (HEVC).

6. REFERENCES

[1] T. Wiegand and G.J. Sullivan, “The picturephone is here, really,” IEEE Spectrum, vol. 48, no. 9, pp. 50–54, September 2011.
[2] “ISO/IEC 14496-10 | ITU-T Rec. H.264, Advanced Video Coding for Generic Audiovisual Services,” Nov. 2007.
[3] “ISO/IEC 13818-2, Information Technology – Generic Coding of Moving Pictures and Associated Audio Information: Video, Annex C, Video Buffering Verifier,” 2000.
[4] J. Ribas-Corbera, P.A. Chou, and S.L. Regunathan, “A generalized hypothetical reference decoder for H.264/AVC,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 13, no. 7, pp. 674–687, 2003.
[5] Zhifeng Chen and Dapeng Wu, “Rate-Distortion Optimized Cross-Layer Rate Control in Wireless Video Communication,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 3, pp. 352–365, 2012.
[6] Z. He and S.K. Mitra, “Optimum bit allocation and accurate rate control for video coding via ρ-domain source modeling,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 10, pp. 840–849, 2002.
[7] “x264 open source video encoder implementation,” http://www.videolan.org/developers/x264.html.
[8] “H.264/AVC reference software [JM 16.0],” http://iphome.hhi.de/suehring/tml/download.
[9] “Collection of test video sequences in YUV format,” http://media.xiph.org/video/derf.

VCIP - HRD analysis for video encoding
