Polytechnic University, Dept. of Electrical and Computer Engineering
EE4414 Multimedia Communication System II
Fall 2005, Yao Wang
___________________________________________________________________________________

Homework 6 Solution (Video Coding Standards)

Reading Assignment:
• Lecture slides
• K. R. Rao, Z. S. Bojkovic, D. A. Milovanovic, Multimedia Communication Systems: Techniques, Standards, and Networks, Prentice Hall PTR, 2002. (Chap. 5)
• Excerpts from EL514 Multimedia Lab Manual on video coding

Written Assignment (Quiz based on selected problems on 11/1)

1. What is the target application of the H.320 standard? What is the video coding standard used in H.320?

H.320 is the standard for audio-visual conferencing/telephony over ISDN channels. It is mainly used for tele-conferencing in business and education. At the time H.320 was developed, the H.261 standard was developed for video coding, but newer systems can also use H.263.

2. What is the target application of the H.323 standard? What is the video coding standard used in H.323?

H.323 is the standard for audio-visual conferencing/telephony over packet-switched networks that do not provide guaranteed quality of service, mainly the Internet. It allows both the H.263 and H.261 standards for video coding, but H.263 is preferred.

3. What is the target application of the H.324 standard? What is the video coding standard used in H.324?

H.324 is the standard for audio-visual conferencing/telephony over circuit-switched telephone networks, using either wired or wireless phone modems. It allows both the H.263 and H.261 standards for video coding, but H.263 is preferred.

4. What are the main differences between H.320, H.323 and H.324 applications in terms of available bandwidth and delay variation?

The H.320 and H.324 standards are targeted for circuit-switched networks, which have dedicated channels allocated to a particular communication session. Therefore the available bandwidth is fixed and the delay variation is small. Because of this fixed bandwidth and delay, the quality of the received audio and video stays fairly constant in time. H.320 uses ISDN channels, with rates much higher than those affordable by either the wired or wireless modems used by an H.324 system. The ISDN channels are also very reliable (very low bit error rates), and the H.320 system is mainly used within large corporations. For wireless channels, a large portion of the bandwidth is used for channel error correction, so that the available bandwidth for sending audio and video signals is much lower. The channel quality for H.324 applications therefore depends on the underlying channel: wired channels are more reliable than wireless channels. The H.323 standard is targeted for packet-switched networks that do not guarantee quality of service, with large variations in available bandwidth and end-to-end delay. Because of this variation, the quality of the received audio and video signals can vary greatly in time.

5. What is the target application of MPEG-1? What are the different parts of the MPEG-1 standard?

MPEG-1 was initially developed to enable storage of a 2-hour movie on a CD; it is the standard used to produce VCDs. But MPEG-1 is also used now to distribute video together with audio over the Internet. The MPEG-1 standard contains several parts, including a video coding part, an audio coding part, and a system part that deals with how to synchronize audio and video.

6. Describe some of the differences between MPEG-1 and the H.261 and H.263 video coding standards.

H.261 and H.263 are targeted for two-way video conferencing/telephony, which has stringent delay requirements. Both the encoder and the decoder have to process video in real time to enable effective communication between people at different locations, so neither the encoder nor the decoder can be overly complex. Also, the bit stream should have a fairly constant bit rate, so as not to cause large delay variation due to transmission. This low-delay requirement rules out coding a frame as a B-frame, which requires coding a future frame first and causes a large delay. It also forbids the video encoder from inserting I-frames periodically; instead, only I-blocks are used when necessary (either for coding efficiency or for error resilience). MPEG-1, on the other hand, is targeted for viewing a video that is either pre-compressed or live-compressed, but does not involve two-way communication. The encoder is located at the originator of the video content and can be fairly complex.
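As a rough illustration of the GoP organization described above (not part of the MPEG-1 specification; the function names gop_frame_types and random_access_point are made up for this example), the following Python sketch assigns I/P/B frame types for a typical pattern and shows how random access works by jumping back to the nearest preceding I-frame:

```python
def gop_frame_types(num_frames, gop_size=9, anchor_distance=3):
    """Assign I/P/B types, e.g. gop_size=9, anchor_distance=3 -> IBBPBBPBB per GoP."""
    types = []
    for n in range(num_frames):
        pos = n % gop_size                  # position inside the current GoP
        if pos == 0:
            types.append('I')               # each GoP starts with an I-frame
        elif pos % anchor_distance == 0:
            types.append('P')               # anchor frames predicted from the past
        else:
            types.append('B')               # bidirectionally predicted frames
    return types

def random_access_point(types, target):
    """Decoding must start at the closest I-frame at or before the target frame."""
    while types[target] != 'I':
        target -= 1
    return target

types = gop_frame_types(30)
print(''.join(types))                       # IBBPBBPBBIBBPBBPBB...
print(random_access_point(types, 16))       # -> 9, the I-frame starting the 2nd GoP
```

Note that in an actual encoder the coding order differs from the display order shown here, since a B-frame can only be coded after the future anchor frame it references; this is exactly the extra delay that makes B-frames unattractive for two-way communication.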
The viewer just needs a decoder to view the compressed video. The bit rate can have large variation as long as the decoder has a large buffer. But the compressed bitstream should allow random access (fast forward, rewind, etc.). With MPEG-1 (and MPEG-2), this is enabled by organizing video frames into groups of pictures (GoPs), with each group starting with an I-frame followed by P-frames. Between P-frames, the encoder can also use B-frames for enhanced coding efficiency and random access capability.

7. Describe some of the techniques adopted in H.263 that helped improve its coding efficiency over H.261.

The H.261 and H.263 video coding standards differ mainly in how motion estimation is performed. The H.261 standard performs motion estimation at integer-pel accuracy only, and it has a relatively small search range (-16, 16). H.263 allows half-pel accuracy motion estimation and a larger maximum search range (-32, 32). It also allows a 16x16 macroblock (MB) to be divided into four 8x8 blocks for more accurate motion estimation, which is helpful when an MB contains multiple objects with different motions. It also allows, as an option, overlapped block motion compensation, which can suppress the blocking artifacts in predicted images. All of these techniques help to improve the prediction accuracy and consequently the coding efficiency.
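To make the motion estimation differences concrete, here is a simplified Python/NumPy sketch of block matching with half-pel candidates, in the spirit of H.263. It is only an illustration under simplifying assumptions, not the procedure mandated by either standard: real encoders use fast search strategies instead of this exhaustive search, and the function names (sad, half_pel_upsample, motion_search) are invented.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equal-size blocks."""
    return float(np.abs(a.astype(np.float64) - b).sum())

def half_pel_upsample(ref):
    """Bilinearly interpolate the reference frame so that half-pel positions
    fall on integer positions of the returned array."""
    h, w = ref.shape
    up = np.zeros((2 * h - 1, 2 * w - 1))
    up[::2, ::2] = ref
    up[1::2, ::2] = (ref[:-1, :] + ref[1:, :]) / 2.0
    up[::2, 1::2] = (ref[:, :-1] + ref[:, 1:]) / 2.0
    up[1::2, 1::2] = (ref[:-1, :-1] + ref[:-1, 1:] + ref[1:, :-1] + ref[1:, 1:]) / 4.0
    return up

def motion_search(cur, ref_up, x, y, block=16, search=8):
    """Exhaustive half-pel search for the block of size `block` at (x, y).
    `search` is the range in integer pels: 16 corresponds to H.261's (-16, 16),
    32 to H.263's (-32, 32); H.261 would additionally restrict the step to full pels."""
    target = cur[y:y + block, x:x + block].astype(np.float64)
    best_cost, best_mv = None, (0.0, 0.0)
    for dy2 in range(-2 * search, 2 * search + 1):      # half-pel steps
        for dx2 in range(-2 * search, 2 * search + 1):
            ry, rx = 2 * y + dy2, 2 * x + dx2
            if ry < 0 or rx < 0 \
               or ry + 2 * (block - 1) >= ref_up.shape[0] \
               or rx + 2 * (block - 1) >= ref_up.shape[1]:
                continue                                 # candidate falls outside the frame
            candidate = ref_up[ry:ry + 2 * block:2, rx:rx + 2 * block:2]
            cost = sad(target, candidate)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx2 / 2.0, dy2 / 2.0)
    return best_mv, best_cost

# Toy example: the "current" frame is the reference shifted by (2, 1) pixels.
ref = np.random.default_rng(0).integers(0, 256, (64, 64)).astype(np.float64)
cur = np.roll(ref, (1, 2), axis=(0, 1))
mv, cost = motion_search(cur, half_pel_upsample(ref), 24, 24)
print(mv, cost)   # expected to recover roughly (-2.0, -1.0) with near-zero SAD
```

Splitting the 16x16 block into four 8x8 blocks, as H.263 permits, would simply repeat the same search for each quarter of the macroblock, letting each part follow its own motion.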

8. Describe some of the differences between the MPEG-1 and MPEG-2 video coding standards.

The main difference is that MPEG-2 must handle interlaced video. For this purpose, different modes of motion estimation and DCT scanning were developed that can handle interlaced sequences more efficiently. In addition, the MPEG-2 standard has options that allow a video to be coded into two layers. It also has a profile dealing with how to code stereo video or, more generally, multiple-view videos.

9. What is the target application of MPEG-2? What are the different parts of the MPEG-2 standard?

When the MPEG-2 standard was first developed, the major target application was to store a 2-hour video at BT.601 resolution (704x480) on a DVD with quality comparable to or better than broadcast TV. Later, the scope of the standard was expanded to consider broadcasting of video (SD and HD) over the air or cable and other types of networks. The MPEG-2 standard includes a system part, an audio coding part, a video coding part, and several parts dealing with the transmission aspect.

10. What are the different ways that MPEG-2 uses to generate two-layer video? Explain each briefly.

MPEG-2 has four ways to generate layered video:
1) Data partitioning: the video is first coded with the conventional method into one bit stream; then the bits for each MB are split between a base layer and an enhancement layer. The base layer includes the header and motion information and the first few low-frequency DCT coefficients; the enhancement layer includes the remaining coefficients, which, when added to the base layer, provide a clearer representation.
2) SNR scalability: each frame is first coded with the conventional method but with a large quantization step size; the resulting bits constitute the base layer. Then the quantization errors of the DCT coefficients are quantized again using a smaller step size; the resulting bits constitute the enhancement layer. The base layer alone yields a coarsely quantized version of the original video; the enhancement layer together with the base layer yields a more accurate version.
3) Spatial scalability: the base layer codes a down-sized version of the original video; the enhancement layer codes the original-size video, with each frame predicted from either the past coded frame in the original size, the interpolated version of the current frame produced by the base layer, or a weighted sum of both. The base layer alone yields a somewhat blurred version of the original video; the enhancement layer carries the detail information, and together with the base layer it yields the original-size video (with coding artifacts).
4) Temporal scalability: the base layer codes the original video (say 30 frames/s) at a lower frame rate (say 10 frames/s); the enhancement layer codes the skipped frames, using either the coded frames in the base layer or the past coded frames in the enhancement layer for motion-compensated temporal prediction.
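As a rough numerical illustration of SNR scalability (item 2 above), the following Python sketch quantizes a block of stand-in DCT coefficients coarsely for the base layer and then requantizes the quantization error with a smaller step for the enhancement layer. The step sizes and function names are arbitrary choices for the example, not values or syntax from the MPEG-2 standard.

```python
import numpy as np

def quantize(coeffs, step):
    """Uniform quantization to integer levels (what would be entropy-coded)."""
    return np.round(coeffs / step).astype(np.int64)

def dequantize(levels, step):
    """Reconstruct coefficient values from the quantized levels."""
    return levels * step

rng = np.random.default_rng(0)
coeffs = rng.normal(0.0, 50.0, (8, 8))           # stand-in for one block of DCT coefficients

base_step, enh_step = 16.0, 4.0                  # arbitrary step sizes for the example

base_levels = quantize(coeffs, base_step)        # base layer: coarse quantization
base_recon = dequantize(base_levels, base_step)

residual = coeffs - base_recon                   # quantization error of the base layer
enh_levels = quantize(residual, enh_step)        # enhancement layer: requantize the error
enh_recon = base_recon + dequantize(enh_levels, enh_step)

print("mean error, base layer only:    ", np.abs(coeffs - base_recon).mean())
print("mean error, base + enhancement: ", np.abs(coeffs - enh_recon).mean())
```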

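Similarly, spatial scalability (item 3 in the answer above) can be sketched as coding a down-sampled picture in the base layer and predicting the full-size picture from its up-sampled reconstruction. This toy Python sketch ignores quantization and the temporal-prediction options mentioned above, and the filters are deliberately crude; the function names are invented.

```python
import numpy as np

def downsample(frame):
    """2x2 averaging; the base layer codes this smaller picture."""
    h, w = frame.shape
    return frame.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(small):
    """Nearest-neighbour interpolation back to full size (real codecs use better filters)."""
    return np.repeat(np.repeat(small, 2, axis=0), 2, axis=1)

frame = np.random.default_rng(1).integers(0, 256, (16, 16)).astype(np.float64)

base = downsample(frame)                   # base layer: a blurred, quarter-size picture
prediction = upsample(base)                # inter-layer prediction for the enhancement layer
residual = frame - prediction              # the "detail information" of the enhancement layer

reconstructed = prediction + residual      # base + enhancement recovers the full-size frame
print(np.allclose(reconstructed, frame))   # True (no quantization in this sketch)
```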
11. The MPEG-4 video coding standard uses so-called "object-based coding". Describe what it is and how a receiving user may make use of it. What are the three types of information contained in each object?

With object-based coding, a video sequence is decomposed into multiple objects, and each object is coded separately. This enables the encoder to code different objects with different accuracy; for example, a foreground moving object can be coded more accurately than the background. The receiver can choose to compose the objects as desired: it can choose not to decode certain objects, change the viewing angle for one or several objects when displaying a sequence, or replace an object with some other pre-stored object. The information transmitted for each object includes its shape, its motion, and its texture (the color intensities in the initial frame and the prediction errors in the following frames).

12. Describe some techniques incorporated in H.264 that helped improve its coding efficiency over H.263/MPEG-4.

1) Intra prediction: in the prior standards, the pixels in an INTRA-mode block are coded directly through transform coding, which does not exploit any spatial correlation that may exist between the pixels in this block and the pixels in adjacent blocks. Intra prediction in H.264 makes use of this correlation, and different intra prediction modes are introduced to consider correlation along different directions.
2) Integer transform: instead of the DCT, H.264 uses an integer version of the DCT, which approximates the DCT but can be computed entirely with integer operations. This helps to eliminate any numerical mismatch between the forward transform at the encoder and the inverse transform at the decoder. Also, the transform block size can be varied from block to block, depending on which size gives the best representation.
3) More accurate motion estimation, with quarter-pel accuracy and variable block sizes from 16x16 down to 4x4. Also, instead of using a single reference frame, the encoder can choose among several reference frames, and bidirectional prediction is generalized to allow prediction from two past reference frames with arbitrary weighting.
4) More efficient arithmetic coding.
5) Deblocking filtering to remove blocking artifacts in reconstructed images.
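To illustrate the intra-prediction idea in item 1, here is a simplified Python sketch of three of the H.264 4x4 intra prediction modes. The real standard defines nine 4x4 luma modes and applies the prediction to already-decoded neighbour samples; the sample values below are made up.

```python
import numpy as np

def intra_predict_4x4(left, top, mode):
    """Predict a 4x4 block from its already-decoded neighbours:
    `top` is the row of 4 pixels above the block, `left` the column of 4 pixels to its left."""
    if mode == 'vertical':        # exploit vertical correlation: copy the row above downwards
        return np.tile(top, (4, 1))
    if mode == 'horizontal':      # exploit horizontal correlation: copy the left column across
        return np.tile(left.reshape(4, 1), (1, 4))
    if mode == 'dc':              # flat prediction from the mean of the neighbours
        return np.full((4, 4), (top.mean() + left.mean()) / 2.0)
    raise ValueError(f"unknown mode: {mode}")

# Made-up sample values: a smooth block whose neighbours resemble it.
block = np.array([[52, 54, 55, 57],
                  [53, 55, 56, 58],
                  [54, 56, 57, 59],
                  [55, 57, 58, 60]], dtype=np.float64)
top = np.array([51, 53, 54, 56], dtype=np.float64)
left = np.array([51, 52, 53, 54], dtype=np.float64)

# The encoder tries the modes, keeps the one with the smallest residual energy,
# and transform-codes only that residual (plus the chosen mode).
for mode in ('vertical', 'horizontal', 'dc'):
    residual = block - intra_predict_4x4(left, top, mode)
    print(mode, float((residual ** 2).sum()))
```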

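The integer transform in item 2 can also be made concrete. The 4x4 matrix below is the well-known H.264 core transform, an integer approximation of the 4-point DCT; this sketch only verifies numerically that the transform is exactly invertible, whereas in the actual codec the inverse transform and the basis-norm scaling are also specified in integer arithmetic and folded into (de)quantization.

```python
import numpy as np

# The 4x4 integer core transform of H.264, an integer approximation of the 4-point DCT.
C = np.array([[1,  1,  1,  1],
              [2,  1, -1, -2],
              [1, -1, -1,  1],
              [1, -2,  2, -1]], dtype=np.int64)

def forward_transform(block):
    """Y = C X C^T: every entry of Y is an exact integer, so encoder and decoder
    compute identical coefficients (no floating-point drift)."""
    return C @ block @ C.T

x = np.array([[ 5, 11,  8, 10],
              [ 9,  8,  4, 12],
              [ 1, 10, 11,  4],
              [19,  6, 15,  7]], dtype=np.int64)

y = forward_transform(x)

# Here we simply check exact invertibility with a floating-point inverse;
# the normative inverse transform uses only integer operations.
x_rec = np.linalg.inv(C) @ y @ np.linalg.inv(C.T)
print(np.allclose(x_rec, x))   # True
```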