
Pham Van Tien, Dr. rer. nat. , Embedded Networking Research Group Faculty of Elec.

and Telecom, Hanoi University of Science and Technology

Email: tienpv-fet@mail.hut.edu.vn C9-411 Dai Co Viet str. 1, Hanoi

Video Coding

Tien Pham Van, Dr. rer. nat. Hanoi University of Technology


Agenda

Video coding process
Video coding standards
Future development


Introduction (1/2)

Why is video compression important?
One movie without compression:
  720 x 480 pixels per frame
  30 frames per second
  90 minutes in total
  Full color (24 bits per pixel)
  Total data quantity = 167.96 GB
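The figure above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes "full color" means 3 bytes per pixel and that a gigabyte is 10^9 bytes; both assumptions reproduce the number on the slide.

```python
# Raw (uncompressed) size of the movie described above.
WIDTH, HEIGHT = 720, 480      # pixels per frame
BYTES_PER_PIXEL = 3           # assumption: "full color" = 24-bit color
FPS = 30                      # frames per second
DURATION_S = 90 * 60          # 90 minutes, in seconds

bytes_per_frame = WIDTH * HEIGHT * BYTES_PER_PIXEL
total_bytes = bytes_per_frame * FPS * DURATION_S
total_gb = total_bytes / 1e9  # decimal gigabytes

print(f"{bytes_per_frame} bytes per frame")
print(f"{total_gb:.2f} GB in total")
```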


Introduction (2/2)

What is the difference between video compression and image compression?
  Temporal redundancy
Coding methods to remove redundancy
  Intraframe coding: removes spatial redundancy
  Interframe coding: removes temporal redundancy


Desired Features

Better compression
Improved quality
Interactivity and manipulation of content
Error resilience
Processing of content in the compressed domain
Identification and selective coding/decoding of the object of interest
Facilitate search / indexing (MPEG-7)


Time table

[Figure: timeline of coding standards over the years 1986 to 2010: JPEG, MPEG-1, MPEG-2/H.262, MPEG-4, H.261, H.263, H.26L, H.264, VC-1/VC-2, H.265]


Evolution of Video Compression Standards


ITU-T
  H.261: video telephony
  H.263: video conferencing
MPEG
  MPEG-1: Video-CD
  MPEG-4 Visual: object-based coding
Joint ITU-T/MPEG
  H.262/MPEG-2: digital TV, DVD
  H.264/MPEG-4 AVC


Where used?

MPEG-1:
  Video-CD
  .mpg or .mpeg files are usually MPEG-1
  DAB digital radio uses MP2 (MPEG-1 Layer 2)
  MP3 files (MPEG-1 Layer 3)

MPEG-2:
  .vob, .m2v, rarely .mpg files
  Anything to do with DVD: camcorders, DVD players, DVD recorders
  Digital TV (DVB)

MPEG-4:
  High-quality AVI files
  Video phones
  DivX
  Some advanced audio players support MPEG-4 Advanced Audio Coding (AAC)


Where used?

H.263/+/++
  NetMeeting and similar video chat
  Network streaming applications, video phones

H.264
  Video conferencing over different networks
  Multimedia streaming: live and on-demand
  Multimedia Messaging Services (MMS)
  Blu-ray, Digital Video Broadcasting, iPod Video, HD DVD

VC-1, VC-2
  Video on the Internet, HDTV broadcast, UHDTV


R-D Performance of MPEG Codecs


[Figure: rate-distortion curves of MPEG-1, MPEG-2, MPEG-4 and H.264. Vertical axis: PSNR (Y), 32 to 50. Horizontal axis: bit rate, 350 to 1050 kbps.]
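The PSNR (Y) on the vertical axis of the R-D plot is conventionally computed from the mean squared error between the original and decoded luma samples. The sketch below assumes 8-bit samples (peak value 255) and flat lists of samples; the function name `psnr` is chosen for illustration.

```python
import math

def psnr(original, decoded, peak=255):
    """PSNR in dB between two equally sized lists of luma samples."""
    if len(original) != len(decoded):
        raise ValueError("size mismatch")
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf            # identical pictures
    return 10 * math.log10(peak ** 2 / mse)

# Every sample off by one gives an MSE of 1.
print(round(psnr([100, 100], [101, 99]), 2))
```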


Questions

What are video/audio codecs? Name some popular codecs that your media players support.
What are the disadvantages of using specific codecs?
What is a container format? Name some examples.
Codecs and formats

Compression...

movie picture 1

movie picture 2


Horse ride

Pixel-wise difference w/o motion compensation

Motion estimation

Residue after motion compensation

Motion Prediction

Motion vector: a two-dimensional pointer that tells the decoder how far left/right and up/down the prediction macroblock is displaced from the position of the current macroblock
Motion estimation: the process, performed by the coder, of finding the motion vector that points to the best prediction macroblock in a reference frame or field
Motion compensation: the prediction obtained by applying the motion vectors to the reference frame
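Motion estimation as defined above can be sketched as an exhaustive (full) search. The matching criterion used here, the sum of absolute differences (SAD), the 4x4 block size and the +/-2 search range are assumptions made for illustration; the slides do not prescribe them.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    return sum(abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
               for y in range(n) for x in range(n))

def estimate_motion(cur, ref, bx, by, n=4, search=2):
    """Full search: return the (dx, dy) within +/-search with minimum SAD."""
    h, w = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # skip candidates that fall outside the reference frame
            if not (0 <= bx + dx and bx + dx + n <= w and
                    0 <= by + dy and by + dy + n <= h):
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (dx, dy)
    return best_mv, best_cost

# Toy frames: the block of `cur` at (2, 2) is the block of `ref` at (3, 1).
ref = [[8 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[(y - 1) % 8][(x + 1) % 8] for x in range(8)] for y in range(8)]
mv, cost = estimate_motion(cur, ref, bx=2, by=2)
print(mv, cost)
```

Full search is the simplest strategy and the most expensive one; practical encoders usually replace it with faster search patterns.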

Motion Estimation

Helps to understand the content of an image sequence
  For surveillance
Helps to reduce the temporal redundancy of video
  For compression
Stabilizes video by detecting and removing small, noisy global motions
  For building the stabilizer in a camcorder


Motion Compensation

It aims to reduce the data transmitted by detecting the motion of objects
Use the previous frame as the reference
In steps:
  Split the current frame into blocks
  For each block, find the best-matching block in the reference frame
  Code and transmit the displacement of the best-matching block (the motion vector) together with the remaining residue
The next frame can be used as a reference too
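The decoder side of motion compensation can be sketched as follows: the prediction is assembled block by block from the reference frame, and the transmitted residue is added back. The dictionary of motion vectors keyed by block position and the function names are assumptions for illustration, and the vectors are assumed to stay inside the frame.

```python
def motion_compensate(ref, vectors, n):
    """Build a prediction frame: each n x n block is copied from `ref`
    at the position displaced by that block's motion vector (dx, dy)."""
    h, w = len(ref), len(ref[0])
    pred = [[0] * w for _ in range(h)]
    for by in range(0, h, n):
        for bx in range(0, w, n):
            dx, dy = vectors[(bx, by)]   # assumed to point inside the frame
            for y in range(n):
                for x in range(n):
                    pred[by + y][bx + x] = ref[by + dy + y][bx + dx + x]
    return pred

def residual(cur, pred):
    """Encoder side: what is left after subtracting the prediction."""
    return [[c - p for c, p in zip(crow, prow)] for crow, prow in zip(cur, pred)]

def reconstruct(pred, res):
    """Decoder side: prediction plus residue gives the current frame."""
    return [[p + r for p, r in zip(prow, rrow)] for prow, rrow in zip(pred, res)]

ref = [[10, 20, 30, 40], [50, 60, 70, 80],
       [90, 100, 110, 120], [130, 140, 150, 160]]
cur = [[21, 30, 30, 40], [60, 70, 69, 80],
       [90, 100, 60, 70], [130, 142, 100, 110]]
vectors = {(0, 0): (1, 0), (2, 0): (0, 0), (0, 2): (0, 0), (2, 2): (-1, -1)}

pred = motion_compensate(ref, vectors, n=2)
res = residual(cur, pred)
print(res)
```

With good motion vectors most residue samples are zero or small, which is what makes them cheap to code.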


Picture type

Slice
One or more "contiguous" macroblocks. The order of the macroblocks within a slice is from left to right and top to bottom.

Macroblock
A 16-pixel by 16-line section of luminance components and the corresponding 8-pixel by 8-line section of the two chrominance components.

Block
A block is an 8-pixel by 8-line set of values of a luminance or a chrominance component.
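The definitions above fix how many macroblocks and blocks a frame contains. The sketch below counts them for the 720 x 480 frame used in the introduction, assuming the frame dimensions are multiples of 16; the function name `frame_layout` is illustrative.

```python
def frame_layout(width, height, mb=16, block=8):
    """Count macroblocks and 8x8 blocks in a frame, following the
    definitions above (16x16 luma plus one 8x8 block per chroma component)."""
    if width % mb or height % mb:
        raise ValueError("frame size must be a multiple of the macroblock size")
    mbs = (width // mb) * (height // mb)
    luma_blocks_per_mb = (mb // block) ** 2   # four 8x8 luma blocks
    chroma_blocks_per_mb = 2                  # one block each for the two chroma components
    return mbs, mbs * (luma_blocks_per_mb + chroma_blocks_per_mb)

macroblocks, blocks = frame_layout(720, 480)
print(macroblocks, blocks)
```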


CODEC Design


Coding functions

Achieve high compression performance while keeping good picture quality
Theorem
  Spatial redundancy: DCT, DFT, subband, wavelet
  Temporal redundancy: MC/ME
  Statistical redundancy: VLC, entropy coding
  Perceptual redundancy: VQ


Tradeoffs in lossy compression


DCT

Use the technique of the JPEG


DCT based coding scheme
DCT transform (2D)

3D DCT transform ?


Discrete cosine transform

Use the technique of the JPEG


Discrete cosine transform


DCT Transformation


Steps

Spatial-to-DCT domain transformation: 8 x 8 DCT
Discard unimportant DCT domain samples: quantization
Lossless coding of DCT domain samples: entropy coding
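The first step, the 8 x 8 spatial-to-DCT transformation, can be written directly from the definition of the 2-D DCT-II. This is a deliberately naive sketch with the orthonormal scaling used by JPEG; real codecs use fast factorizations, and the function name `dct_2d` is illustrative.

```python
import math

N = 8

def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block (orthonormal scaling, as in JPEG)."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

# A flat block has all its energy in the DC coefficient.
flat = [[128] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 3))
```

For smooth image content most of the energy ends up in a few low-frequency coefficients, which is what the quantization step then exploits.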


Quantization

Quantization
  The eye is relatively insensitive to high-frequency components
  A larger quantizer step means greater loss
  Low-frequency components get a smaller quantizer step, high-frequency components a larger one
  The quantization tables in the encoder and decoder are the same
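The effect of a frequency-dependent quantizer can be seen on a small example. The 4x4 table and the coefficient values below are made up for illustration (they are not taken from any standard); the only property that matters is that the step size grows towards the high frequencies at the bottom right.

```python
def quantize(coeffs, qtable):
    """Divide each coefficient by its step size and round to an integer level."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, qtable)]

def dequantize(levels, qtable):
    """Decoder side: multiply the levels by the same step sizes."""
    return [[l * q for l, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, qtable)]

# Illustrative table (an assumption): larger steps for higher frequencies.
QTABLE = [[8, 12, 20, 32],
          [12, 16, 28, 40],
          [20, 28, 40, 56],
          [32, 40, 56, 80]]
coeffs = [[400, -50, 12, 3],
          [60, 22, -9, 2],
          [-14, 8, 4, 1],
          [5, -3, 2, 0]]

levels = quantize(coeffs, QTABLE)
restored = dequantize(levels, QTABLE)
print(levels)
print(restored)
```

Most high-frequency levels become zero, which the entropy coder can represent very compactly; the difference between `coeffs` and `restored` is the loss introduced by quantization.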


Picture type

Video bit stream


Picture type

Intra picture (I-picture)
  Coded using only information present in the picture itself
  I-pictures provide potential random access points into the compressed video data


Picture type

Predicted picture (P-picture)
  Coded with respect to the nearest previous I- or P-picture
  P-pictures use motion compensation
  Unlike I-pictures, P-pictures can propagate coding errors


Picture type

Bidirectional picture (B-picture)
  Coded using both a past and a future picture as references
  B-pictures provide the most compression and do not propagate errors (in MPEG-1/2 no other picture is predicted from them)


Picture type

Typical display order of picture types

Video stream composition


The MPEG encoder reorders pictures in the video stream to present the pictures to the decoder in the most efficient sequence
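The reordering can be illustrated with picture-type labels. The sketch assumes the classic MPEG-1/2 arrangement, in which each run of B-pictures depends on the I- or P-picture that follows it in display order, so that reference has to be sent first; the labels and the function name are illustrative.

```python
def display_to_coded_order(display):
    """Reorder picture labels from display order to coded (bitstream) order:
    each run of B-pictures is sent after the I/P reference that follows it."""
    coded, pending_b = [], []
    for pic in display:
        if pic.startswith("B"):
            pending_b.append(pic)      # needs a future reference first
        else:
            coded.append(pic)          # I or P: send the reference...
            coded.extend(pending_b)    # ...then the B-pictures displayed before it
            pending_b = []
    coded.extend(pending_b)            # trailing B-pictures, if any
    return coded

display = ["I1", "B2", "B3", "P4", "B5", "B6", "P7"]
coded = display_to_coded_order(display)
print(coded)
```

The decoder then holds each reference picture until the B-pictures that depend on it have been decoded and restores the display order on output.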

Hybrid MC-DCT Video Encoder

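Intra-frame: encoded without prediction
Inter-frame: predictively encoded; the encoder uses the quantized (reconstructed) frames as the reference for the residue, so that it predicts from the same pictures the decoder has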


MPEG-1 = JPEG + Motion Prediction + Rate Control

Early motivation: to encode motion video at 1.5 Mbit/s for transport over T1 data circuits and for replay from CD-ROM
Defines the decoder but not the encoder (note A21)
Frames (pictures) (note A22)
  Intra-coded using JPEG
  Inter-coded using (interpolated) ME & MC, and JPEG for the residuals
Macroblocks (MBs)
  16x16-pixel blocks
Rate control
  Buffer at each end
  Test Model 5 (TM5)

Note A22 (Author, 6/17/2004): Intra coding of MBs in MPEG is the same as described for JPEG, except that 1) unless otherwise specified in the sequence header, MPEG defines two quantization tables: one is used for intra coding, the other is used to code any residuals left after prediction by motion estimation; 2) the quantization scale factor, or MQuant, is different.

Note A21 (Author, 6/17/2004): MPEG does not define the encoder. A valid encoder produces a syntactically correct bit stream that results in the desired output when fed to a compliant decoder. An MPEG-1 compliant decoder, however, is required to decode all valid MPEG-1 bit streams.


MPEG-2 = MPEG-1 +

Improvements
  Color space: can support 4:2:2 and 4:4:4 coding
  Quantization: can have 9- or 10-bit precision for DC coefficients
  Concealment motion vectors: used when an intra-MB is lost
  Pan and scan: supports display of different aspect ratios, e.g., 16:9
Profiles and levels
  Profiles: define the tools or syntactical elements
  Levels: define the permissible ranges of parameters


MPEG-2 = MPEG-1 +

Interlace tools
Scalable coding profiles
System layer: defines two bit stream constructs
  Program stream (PS): modeled on MPEG-1 (backward compatibility)
  Transport stream (TS): more robust, does not need a common time base, designed for use in error-prone environments


MPEG-4 = MPEG-2 + Objects + Other Enhancements

Object-oriented
  Video (texture + shape), image, audio, speech, text, etc.
  Encoded using different techniques
  Transmitted independently
  Composited at the decoder using the Binary Format for Scenes (BIFS)
Improvements in MPEG-4 version 2
  Global motion compensation (GMC)
  Quarter-pixel motion compensation
  Shape-adaptive DCT
Why is MPEG-4 not as successful as MPEG-2?
  Not substantially better than MPEG-2
  Suffers from its sheer size and flexibility
  Licensing issues


MPEG-4 Error Resilience Tools

Video packet resynchronization
  Previous coding standards: resynchronization markers are fixed at the beginning of each row of MBs
  MPEG-4: resynchronization markers are inserted every K bits
Data partitioning
  Partitions the data in a video packet into a motion part and a texture part, separated by a motion boundary marker (MBM)
[Figure: layout of a video packet. VP header: resync. marker, MB no., QP, HEC, repeated header info. I-VOP: VP header, DC DCT data, AC DCT data. P-VOP: VP header, motion data, MBM, texture (DCT) data. The labels "use" and "discard" mark which parts the decoder keeps.]


MPEG-4 Error Resilience Tools

Reversible variable length codes (RVLC)
  The decoder finds the next resynchronization marker and decodes backwards
Header extension code (HEC)
  The header information is repeated after the 1-bit HEC
Unequal error protection technique (UEP)
[Figure: video packet layout for I-VOP and P-VOP: VP header with resync. marker, MB no., QP, HEC and repeated header info.; DC and AC DCT data; motion data, MBM and texture data]


New Features of H.264

Multi-mode, multi-reference MC
Motion vectors can point outside the image border
1/4-, 1/8-pixel motion vector precision
B-frame prediction weighting
4x4 integer transform
Multi-mode intra-prediction
In-loop de-blocking filter
UVLC (Universal Variable Length Coding)
NAL (Network Abstraction Layer)
SP-slices

Profiles and Levels

Profiles: Baseline, Main, and X
  Baseline: progressive, videoconferencing and wireless
  Main: esp. broadcast
  X: mobile networks
Baseline profile is the minimum implementation
  Without CABAC, 1/8 MC, B-frames, SP-slices
11 levels
  Resolution, capability, bit rate, buffer, number of reference frames
  Built to match popular international production and emission formats
  From QCIF to D-Cinema

Basic Macroblock Coding Structure

[Figure: block diagram of the macroblock encoder. The input video signal is split into macroblocks of 16x16 pixels. Blocks: coder control; transform/scaling/quantization; an embedded decoder with scaling and inverse transform, de-blocking filter, intra-frame prediction and motion compensation (intra/inter switch); motion estimation; entropy coding of the control data, the quantized transform coefficients and the motion data. The embedded decoder produces the output video signal.]


Variable block size

A fixed block size may not suit all moving objects; variable block sizes
  Improve the flexibility of comparison
  Reduce the error of comparison
7 block types to select from
  16x16, 16x8, 8x16, 8x8, 8x4, 4x8, 4x4


Multiple Reference Frames

In some cases the neighboring frames are not the most similar ones
A B-frame can be used as a reference frame
  The B-frame is close to the target frame in many situations


Spatial Prediction for Intra-Coded MBs

luma
- 4x4: 9 modes
- 16x16: 4 modes
chroma
- 8x8: 4 modes
- The same prediction mode is always applied to both chroma blocks

[Figure: a 4x4 luma block predicted from its neighbouring samples A to M, with example modes including a mean mode; for 16x16 luma and 8x8 chroma blocks the examples include horizontal (H), vertical (V) and mean (H, V) prediction]


Deblocking filter
Picture is filtered using an adaptive deblocking filter. The filter removes visible block structures on the edges of the 4 X 4 blocks caused by block-based transform coding and motion estimation


Deblocking Filters
A boundary-strength (BS) parameter is assigned to every 4x4 block

Block modes and conditions                                    BS
One of the blocks is intra-coded and the edge is a MB edge     4
One of the blocks is intra-coded                               3
One of the blocks has coded residuals                          2
Difference of block motion >= one luma sample distance         1
Motion compensation from different reference frames            1
Else                                                           0

BS = 0: no filtering
BS = 1-3: slight filtering
BS = 4: strong filtering

Filters only when |P0-Q0| < α, |P1-P0| < β and |Q1-Q0| < β, for the samples P3 P2 P1 P0 | Q0 Q1 Q2 Q3 on either side of the block edge
Thresholds α and β depend on the average quantization parameter (QP)
The deblocking filtering accounts for 1/3 of the computational complexity of a decoder.
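The filtering decision for one edge can be sketched as a pair of small functions. The two QP-dependent thresholds are written here as `alpha` and `beta`, and the numeric values in the example are arbitrary assumptions; in the standard they are looked up from QP-indexed tables, which are not reproduced here.

```python
def filter_mode(bs):
    """Map the boundary strength to the filtering mode listed above."""
    if bs == 0:
        return "none"
    return "strong" if bs == 4 else "slight"

def should_filter(p1, p0, q0, q1, bs, alpha, beta):
    """Filter the edge between p0 and q0 only if BS > 0 and all three
    sample differences stay below their thresholds."""
    return (bs > 0
            and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)

# A small step across the edge looks like a blocking artifact: filter it.
print(should_filter(100, 102, 106, 107, bs=2, alpha=10, beta=4))
# A large step is more likely a real image edge: leave it alone.
print(should_filter(58, 60, 150, 152, bs=2, alpha=10, beta=4))
```

The threshold test is what keeps the filter from blurring genuine edges in the picture content.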


SP and SI-Frame Design

SP and SI-frames
  Allow identical reconstruction when coded using different references
  Subtract the reference in the coder and add it back in the decoder
Bitstream switching
  In previous coding standards, perfect (mismatch-free) switching only happens at intra-frames
Other applications
  Bitstream splicing
  Error recovery/resilience
  Video redundancy coding

[Figure: bitstream switching at time n. Stream 1: P1,n-2, P1,n-1, P1,n, P1,n+1, P1,n+2. Stream 2: P2,n-2, P2,n-1, SP2,n, P2,n+1, P2,n+2. SP12,n is the switching frame between the two streams.]


Transformation

H.264 employs a 4x4 integer transform
The transform is an approximation of the DCT
  It has a coding gain similar to that of the DCT
  Since the integer transform has an exact inverse operation, there is no mismatch between the encoder and the decoder, which was a problem in earlier DCT-based codecs
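The exactness claim can be checked numerically. The sketch below uses the 4x4 forward core matrix of H.264 and inverts it with exact rational arithmetic; this mathematical inverse is an illustration only, since the standard folds the scaling factors into quantization and specifies its own integer inverse transform.

```python
from fractions import Fraction

# Forward core transform matrix of H.264 (integer approximation of the 4x4 DCT).
C = [[1, 1, 1, 1],
     [2, 1, -1, -2],
     [1, -1, -1, 1],
     [1, -2, 2, -1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def forward(x):
    """Y = C X C^T, computed with integer arithmetic only."""
    return matmul(matmul(C, x), transpose(C))

def inverse(y):
    """Exact inverse X = C^T D^-1 Y D^-1 C, where D = C C^T = diag(4, 10, 4, 10)."""
    d = [4, 10, 4, 10]
    scaled = [[Fraction(y[i][j], d[i] * d[j]) for j in range(4)] for i in range(4)]
    return matmul(matmul(transpose(C), scaled), C)

x = [[5, 11, 8, 10],
     [9, 8, 4, 12],
     [1, 10, 11, 4],
     [19, 6, 15, 7]]
y = forward(x)
print(y[0][0])            # sum of all samples
print(inverse(y) == x)
```

Because every operation is on integers or exact fractions, encoder and decoder obtain bit-identical results, unlike floating-point DCT implementations that may round differently.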


Network friendliness

H.264 structure
Video coding layer (VCL) Network abstraction layer (NAL)

Scope of H.264 standard


H.264 Over IP
Network Abstraction Layer Unit (NALU)
  A byte stream of variable length
  1-byte header: NALU type (T), NALU importance (R), error indication (F)
Protocols and specifications for H.264
  IP: best-effort service (note A1)
  UDP (User Datagram Protocol)
  RTP (Real-Time Transport Protocol)
    Header size: IP/UDP/RTP = 20 + 8 + 12 = 40 bytes
    Media-unaware RTP payload specifications to reduce the loss rates observed by the decoder: packet duplication, packet-based FEC, audio redundancy coding
  Control protocols: H.245, SIP (Session Initiation Protocol), SDP (Session Description Protocol), RTSP (Real-Time Streaming Protocol)
RTP packetization
  Simple packetization: one NALU in one RTP packet, with the NALU header doubling as the RTP payload header
  NALU fragmentation
  NALU aggregation

[Figure: OSI/RM stack: application, presentation, session, transport and network layers]

Note A1 (Author, 8/24/2011): The IP header is 20 bytes in size and protected by a checksum. No protection of the payload is performed.
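The one-byte NALU header and the 40-byte IP/UDP/RTP overhead can be made concrete with a short sketch. The bit layout used here (1-bit F, 2-bit R, 5-bit T, from the most significant bit down) is that of H.264; the function names and the payload size in the example are assumptions for illustration.

```python
def parse_nalu_header(byte):
    """Split the 1-byte NALU header into its three fields."""
    return {
        "F": (byte >> 7) & 0x1,   # error indication (forbidden_zero_bit), 1 bit
        "R": (byte >> 5) & 0x3,   # importance (nal_ref_idc), 2 bits
        "T": byte & 0x1F,         # NALU type (nal_unit_type), 5 bits
    }

def header_overhead(payload_bytes, headers=20 + 8 + 12):
    """Fraction of each packet taken by the IP/UDP/RTP headers."""
    return headers / (headers + payload_bytes)

fields = parse_nalu_header(0x65)       # 0b0_11_00101
print(fields)
print(header_overhead(160))            # a small 160-byte payload
```

The overhead figure shows why very small NALUs are often aggregated into one RTP packet, while NALUs larger than the path MTU have to be fragmented.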

Comparison

H.265 outlook

Bit rate reduced by about half compared to H.264
Tree-structured prediction and residual difference block segmentation
Extended prediction block sizes (up to 64x64)
Tile and slice picture segmentations for loss resilience and parallelism
Wavefront processing structure for decoder parallelism
Mode-dependent sine/cosine transform type switching
Adaptive motion vector predictor selection
Temporal motion vector prediction


3D video coding

Left and right eye views
Depth sensation
Resolving 2D viewing ambiguity
Additional features:
  Free viewpoints
  Depth-controlled object insertion


Multiview Frame Structure


[Figure: grid of frames with one axis for time and one for view]


Predictions based on H.264/AVC JM95


Homework 1

Download the open-source tool x264 from the VideoLAN website
Capture a video sequence via webcam or from the Internet
Experiment with FFmpeg to encode and transcode the video sequence with different standards (MPEG-2, MPEG-4, H.263, H.264, etc.) and parameters
Play back the encoded video and comment
Wrap the encoded video sequence in the MP4 container format
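One possible starting point for the transcoding step is to generate the FFmpeg command lines from a small table. The encoder names below (`mpeg2video`, `mpeg4`, `libx264`) exist in common FFmpeg builds, but whether your build includes them is an assumption to check with `ffmpeg -encoders`; the input file name `capture.avi`, the output names and the 1M bit rate are placeholders.

```python
import shlex

# standard -> (FFmpeg video encoder, output file extension)
ENCODERS = {
    "mpeg2": ("mpeg2video", "mpg"),
    "mpeg4": ("mpeg4", "mp4"),
    "h264": ("libx264", "mp4"),
}

def ffmpeg_command(src, standard, bitrate="1M"):
    """Build (but do not run) an FFmpeg command line for one standard."""
    codec, ext = ENCODERS[standard]
    out = f"out_{standard}.{ext}"
    return ["ffmpeg", "-i", src, "-c:v", codec, "-b:v", bitrate, out]

for std in ENCODERS:
    print(shlex.join(ffmpeg_command("capture.avi", std)))
```

The printed lines can be pasted into a shell, or the lists can be passed to `subprocess.run` once the input file exists; varying `bitrate` gives the different operating points to compare.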


Homework 2

Draw decoding diagrams for MPEG1, MPEG2, MPEG4, H264 and 3D


Future development

Future coding/presentation standards:
  H.265, VC-1, VC-2
  MPEG-21, MHEG
Computer vision
  Game
  Graphics
Multimedia retrieval
  Segmentation
  Search (Google)
Multi-camera system
  3D cinema
  Realistic broadcasting