MPEG-4 standard
O. Le Meur
olemeur@irisa.fr
Univ. of Rennes 1
http://www.irisa.fr/temics/staff/lemeur/

October 1, 2012

Table of Contents

1 Introduction and overview
2 Hierarchical syntax
3 Motion Compensation and prediction modes
4 Intra prediction modes
5 In loop Deblocking filter
6 Transform
7 Entropy coding
8 Profiles
9 Amendments
10 Conclusion


History

A rapid acceptance of this new standard


First Test Model of H.26L in August 1999;
Formation of the JVT (Joint Video Team) between VCEG and ISO/IEC JTC 1/SC 29/WG 11
(MPEG) to establish a joint standard, H.264 / MPEG-4 AVC;
ITU-T Approval: May 2003, ITU-SG16, Final standard approved;
ISO/IEC Approval: March 2003, Final draft international standard.

The target is to double the coding efficiency.


[Figure: standardization process along a time axis t: CfE, CfP, assessment of proposals, first solution, Verification Model (VM), then VM evolution through iterated core experiments.]


Terminology
H.26L (has become outdated...);
JVT (Joint Video Team) or JVT codec;
JM2.x, JM3.x, JM4.x;
AVC or Advanced Video Coding.

MPEG-4 part 10 (Official MPEG term)


H.264 (Official ITU term)

Video compression standard


A common framework for the different video standards (H.261, MPEG-1, MPEG-2,
H.263, MPEG-4, H.264/AVC):

Motion-compensated hybrid coding (1)

(1) For this part, most of the figures have been extracted from B. Girod's courses (EE398 Image and
Video Compression).

Important differences in many details

A big toolbox

Overview of new features to improve the prediction


Directional spatial prediction for intra coding;
Variable block-size MC with small block size;
Quarter-sample-accurate MC;
Motion vectors over picture boundaries;
Multiple reference picture MC;
Weighted prediction;
Improved skipped and direct motion inference;
In-the-loop deblocking filter.

A big toolbox

Overview of new features to improve the coding efficiency


Small block-size transform;
Exact-match inverse transform;
Short word-length transform;
Hierarchical block transform;
Arithmetic entropy coding;
Context-adaptive entropy coding.

A big toolbox

Overview of new features to improve the robustness to data errors/losses


Flexible slice size;
Flexible MB ordering (FMO);
Arbitrary slice ordering (ASO);
Redundant pictures;
Data partitioning;
SP/SI synchronization/switching pictures.

Hierarchical syntax

Slice and macroblock

Pictures are still divided into slices and MBs.


Slice
Picture has one or more slices;
Slices are self-contained;
Slices are a sequence of MBs.

Macroblock
Basic syntax and processing unit;
One MB is composed of 16×16 luminance samples and two 8×8 chrominance blocks (4:2:0);
MBs within a slice depend on each other;
MBs can be further partitioned.
FMO (Flexible MB Ordering)

Slice group
A slice group is a subset of the MBs in a coded picture and may contain one or more slices.
The goal is to divide the picture into different scan patterns of macroblocks. Different types
can be used:
Type 0: interleaved;
Type 1: dispersed;
Type 2: foreground and background. All but the last slice group are defined as
rectangular regions within the picture. The last slice group contains all MBs not
contained in any other slice group (background);
Type 3: box-out. A box is created starting from the center of the frame;
Type 4: raster scan;
Type 5: wipe (vertical scan order);
Type 6: explicit (the MB map is entirely user-defined).
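
As a minimal sketch of what a slice group map looks like, the Python below builds a dispersed-style map and a foreground/background (type 2) map for a small picture measured in MBs. The function names, the simple `(x + y) % num_groups` dispersal rule and the priority given to earlier rectangles are illustrative assumptions, not the normative derivation from the standard.

```python
def dispersed_map(width_mbs, height_mbs, num_groups):
    """Scatter MBs so that neighbouring MBs fall in different slice groups.

    Illustrative rule only (assumption): group = (x + y) mod num_groups.
    """
    return [[(x + y) % num_groups for x in range(width_mbs)]
            for y in range(height_mbs)]


def foreground_background_map(width_mbs, height_mbs, rectangles):
    """Type-2 style map: one group per rectangle, last group = leftover MBs.

    rectangles: list of (left, top, right, bottom) in MB units, inclusive.
    """
    background = len(rectangles)
    mb_map = [[background] * width_mbs for _ in range(height_mbs)]
    # Assumption: earlier rectangles take priority, so they are painted last.
    for group in reversed(range(len(rectangles))):
        left, top, right, bottom = rectangles[group]
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                mb_map[y][x] = group
    return mb_map


checker = dispersed_map(4, 2, 2)                        # checkerboard of groups 0/1
roi = foreground_background_map(4, 3, [(1, 0, 2, 1)])   # one ROI + background
```

With two groups, the dispersed map is a checkerboard, so the loss of one slice group still leaves every lost MB surrounded by correctly received neighbours.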

FMO (Flexible MB Ordering)

FMO can be used:


to distribute errors more randomly over the picture;
to allocate more bit rate to Regions of Interest;
to reduce the computational load while decoding (parallel processing).
FMO (Flexible MB Ordering)

Example extracted from "Unequal Error Protection Technique for ROI Based H.264
Video Coding", H. K. Arachchi, CCECE 2006.

The H.264 network abstraction layer (NAL) encapsulates each slice in a separate data packet.
FMO (Flexible MB Ordering)

Type 2: RoI coding (foreground and background)

Type 3: Box-out (parameters: direction of rotation, number of slices and
number of MBs per slice)

FMO (Flexible MB Ordering)

Type 4: Raster scan (parameters: direction (normal vs inverse), number of
slices and number of MBs per slice)
Type 5: Wipe (parameters: direction (normal vs inverse), number of slices and
number of MBs per slice)

Introduction of Error resilience/Error concealment concepts


Error resilience
Error resilience refers to mechanisms in the encoder that enhance the ability of the compressed
bitstream to resist channel errors. Error resilience functionality in the encoder produces a
bitstream that supports error recovery at the decoder.
Selective intra coding
Frame segmentation into slices
Constrained texture prediction
Flexible Macroblock Order (FMO)
Error concealment
Error concealment refers to the actions taken by the decoder to analyze losses and conceal
them in the displayed video by minimizing the visual artifacts. The concealment schemes can
be spatial, temporal or combined.

In spatial interpolation, the values of missing pixels are estimated from the surrounding
pixels of the same frame, without using the temporal information.
Temporal interpolation is based on the corresponding regions of the reference frames. If a
motion vector is missing, it can be estimated based on the motion vectors of the
surrounding regions.
Combination schemes use an adaptive mechanism to choose the best concealment

Slice

I-Slice
An I slice contains only intra-coded MBs (predicted from previously coded samples in the same
slice). I = Intra.

P-Slice
A P slice can contain inter-coded MBs (predicted from samples in previously coded pictures),
intra-coded MBs or skipped MBs. P = Predictive.

B-Slice
A B slice can contain inter-coded MBs (each MB partition can be predicted from samples of
one or two reference pictures, before and/or after the current picture). B = Bi-predictive (≠
bi-directional as in MPEG-2!)

SP-Slice and SI-Slice

These slices are specified to ease switching between bitstreams coded at various bit rates.
Macroblock

MB-AFF=MB-adaptive frame/field
The concept of a macroblock-level frame/field coding decision originates from the MPEG-2
standard. Instead of splitting up a 16×16 MB into two 16×8 blocks, a super MB is
defined as the decision unit (16×32).
The frame is scanned as MB pairs. For each MB pair (16×32), the coding type
(frame or field) is decided. A super MB can be coded as:
two frame MBs of 16×16;
one top-field MB of 16×16 and one bottom-field MB of 16×16.

Coding an MB pair in field mode requires modifications to a number of the encoding and
decoding steps.
Macroblock - MB-AFF=MB-adaptive frame/field


The macroblocks in grids are coded in field mode, and the remaining macroblocks are
coded in frame mode.

Extracted from [Yu et al,2006].


Motion Compensation and prediction modes

Motion compensation

Variable block-size and shape for MC


MPEG-4/AVC supports motion compensation block sizes ranging from 16×16 to
4×4 luminance samples.
MB partitions: 16×16, 16×8, 8×16, 8×8
Sub-MB partitions: 8×4, 4×8, 4×4
These partitions and sub-partitions give rise to a large number of possible combinations
within each macroblock. This method of partitioning MBs into motion-compensated
sub-blocks of varying size is known as tree-structured motion compensation.
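
To make the "large number of possible combinations" concrete, the following sketch counts the distinct ways one MB can be split and how many motion vectors a given split needs (one reference list only). It assumes that each 8×8 block may also be left unsplit, so every 8×8 block has four choices; the names are hypothetical.

```python
MB_PARTITIONS = [(16, 16), (16, 8), (8, 16), (8, 8)]
# Assumption: an 8x8 block can stay whole or use one of the sub-MB partitions.
SUB_MB_CHOICES = [(8, 8), (8, 4), (4, 8), (4, 4)]


def count_partitionings():
    """Number of distinct tree-structured splits of one 16x16 macroblock."""
    total = 0
    for shape in MB_PARTITIONS:
        if shape == (8, 8):
            # Each of the four 8x8 blocks chooses its split independently.
            total += len(SUB_MB_CHOICES) ** 4
        else:
            total += 1
    return total


def motion_vectors_needed(mb_shape, sub_shapes=None):
    """Motion vectors for one MB, assuming one vector per (sub-)partition."""
    if mb_shape != (8, 8):
        width, height = mb_shape
        return (16 * 16) // (width * height)
    return sum((8 * 8) // (w * h) for (w, h) in sub_shapes)


total_splits = count_partitionings()
worst_case = motion_vectors_needed((8, 8), [(4, 4)] * 4)
```

Under these assumptions there are 259 splits, and the finest one (sixteen 4×4 blocks) requires 16 motion vectors, which is why the cost of coding vectors matters for small partitions.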

Motion compensation

In areas where there is little change between the frames, a 16×16 partition is
chosen;
In areas of detailed motion (residual appears black or white), smaller partitions
are more efficient.
Motion compensation

Motion Vectors
Each partition or sub-MB partition is predicted from an area of same size and shape in
a reference picture.
The motion vector has a quarter-sample resolution for the luma component and
one-eighth-sample resolution for the chroma components.
Sub-pixel motion compensation can provide significantly better compression
performance than integer-pixel compensation, at the expense of increased complexity.
Quarter-pixel accuracy outperforms half-pixel accuracy.

Motion compensation

Half-sample positions are obtained by applying a 6-tap filter with tap values
$\left(\tfrac{1}{32}, -\tfrac{5}{32}, \tfrac{20}{32}, \tfrac{20}{32}, -\tfrac{5}{32}, \tfrac{1}{32}\right)$:
Sample b: $b = \lfloor (E - 5F + 20G + 20H - 5I + J)/32 \rfloor$
Sample h: $h = \lfloor (A - 5C + 20G + 20M - 5R + T)/32 \rfloor$
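
The 6-tap filter can be sketched in a few lines of Python. The function names are hypothetical; the rounding offset of 16 before the shift and the clipping to the 8-bit range are assumptions added here (the formula above writes a plain floor).

```python
def clip_pixel(value, max_value=255):
    """Clamp a filtered value to the valid sample range."""
    return max(0, min(max_value, value))


def half_sample(e, f, g, h, i, j):
    """Half-pel sample between g and h from six aligned integer samples.

    Taps are (1, -5, 20, 20, -5, 1) / 32.
    """
    acc = e - 5 * f + 20 * g + 20 * h - 5 * i + j
    # Assumption: add 16 before dividing by 32 to round to nearest.
    return clip_pixel((acc + 16) >> 5)


flat = half_sample(100, 100, 100, 100, 100, 100)   # flat area is preserved
edge = half_sample(0, 0, 0, 255, 255, 255)         # midpoint of a step edge
overshoot = half_sample(0, 0, 255, 255, 0, 0)      # clipped to 255
```

The negative taps let the filter overshoot near sharp transitions, which is why the result has to be clipped.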
Motion compensation

Once all the half-pel samples are available, the samples at quarter-pel positions
are produced by linear interpolation (average of the samples at the neighbouring integer and half-pel
sample positions).
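
A minimal sketch of the quarter-pel averaging step, assuming integer samples and rounding halves upwards (the rounding rule and the function name are assumptions):

```python
def quarter_sample(sample_a, sample_b):
    """Quarter-pel sample as the rounded average of its two neighbours.

    sample_a and sample_b are the adjacent integer-pel and/or half-pel samples.
    """
    return (sample_a + sample_b + 1) >> 1


# Integer sample G = 100 and half-pel sample b = 105 give the value in between.
quarter = quarter_sample(100, 105)
```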

Motion compensation

Multiple Reference Frames


More than 2 frames can be used for prediction!
Block prediction is done by a weighted sum of blocks from reference pictures;
Previously encoded pictures are stored in a reference buffer in both the encoder
and the decoder (reference picture lists: list 0 and list 1);
These lists may be composed of short-term and long-term reference pictures (list
0 is mainly composed of the closest past pictures, list 1 of the closest future
pictures);
By default, a recently-coded picture is marked as a short term picture. When the
buffer is full, the oldest short term picture is removed from the buffer (sliding
window);
Long-term pictures remain in the buffer until explicitly removed or replaced.
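
The sliding-window behaviour can be illustrated with a small buffer model. This is a sketch under simplifying assumptions: pictures are identified by plain integers, the capacity counts short-term and long-term pictures together, and the class and method names are hypothetical.

```python
from collections import deque


class ReferenceBuffer:
    """Toy model of a reference picture buffer with a sliding window."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.short_term = deque()   # oldest picture on the left
        self.long_term = {}         # long-term index -> picture id

    def add_short_term(self, picture_id):
        # When the buffer is full, drop the oldest short-term picture.
        while self.short_term and (
                len(self.short_term) + len(self.long_term) >= self.capacity):
            self.short_term.popleft()
        self.short_term.append(picture_id)

    def mark_long_term(self, picture_id, long_term_index):
        # Long-term pictures stay until explicitly removed or replaced.
        self.short_term.remove(picture_id)
        self.long_term[long_term_index] = picture_id

    def remove_long_term(self, long_term_index):
        self.long_term.pop(long_term_index, None)


buf = ReferenceBuffer(capacity=3)
for pic in (0, 1, 2):
    buf.add_short_term(pic)
buf.mark_long_term(0, long_term_index=0)
buf.add_short_term(3)   # evicts picture 1
buf.add_short_term(4)   # evicts picture 2
```

Picture 0 survives both evictions because it was marked as long-term, while the short-term window keeps only the most recent pictures.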

Motion compensation

List 0: the closest past picture, followed by any other past pictures,
followed by any future pictures;
List 1: the closest future picture, followed by any other future
pictures, followed by any past pictures;
From [Richardson,03].
Multiple Reference Frames
Multiple reference frames can be used to predict MBs (maximum number of
reference frames limited to 15 in the buffer (2x6 short term + 3 long term));
Multiple reference frames are most helpful for sequences with chaotic motion;
High computational load on the encoder, small at the decoder;
Increases memory requirements.
Motion compensation

H.264/AVC allows unrestricted motion vectors: motion vectors can point outside the
image area. In this case, the reference frame is extended beyond the image boundaries
by repeating the edge pixels before interpolation.
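
Repeating the edge pixels is equivalent to clamping the sample coordinates to the picture area. The sketch below shows this for integer-pel vectors only; the frame is a plain list of rows and the function names are hypothetical.

```python
def reference_sample(frame, x, y):
    """Read a sample, extending the frame by repeating its edge pixels."""
    height = len(frame)
    width = len(frame[0])
    clamped_x = min(max(x, 0), width - 1)
    clamped_y = min(max(y, 0), height - 1)
    return frame[clamped_y][clamped_x]


def predict_block(frame, x0, y0, mv_x, mv_y, size):
    """Integer-pel motion-compensated prediction of a size x size block."""
    return [[reference_sample(frame, x0 + mv_x + dx, y0 + mv_y + dy)
             for dx in range(size)]
            for dy in range(size)]


frame = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
# The vector (-1, -1) points partly outside the top-left corner.
block = predict_block(frame, 0, 0, -1, -1, 2)
```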

From [Richardson,03].

Motion compensation

Multiple Reference Frames


For MB partitions: 16×16, 16×8, 8×16, 8×8
Each partition can have its own reference picture and prediction direction (List 0,
List 1, BiPred).
For Sub-MB partitions: 8×4, 4×8, 4×4
Each sub-MB partition can have its own motion vector but has to use the same
reference picture and prediction direction as its 8×8 partition.
Motion compensation

Motion compensation in P slices


In addition to the intra MB coding types, various predictive coding types are specified.
Inter-coded MB: inter prediction from previously decoded reference pictures (bi-prediction is
not allowed);
Skipped MB: no data is sent. The decoder calculates a vector for the skipped MB and
reconstructs the MB using motion-compensated prediction from the first reference picture
in list 0.
For P frames, each 8×8 MB partition is only predicted from one frame, but that frame may
be different from the prediction frame used in neighboring MBs.

Motion compensation

Motion compensation in B slices


B frames can also be used as reference frames for P frames. This means that B
pictures can be stored in the reference frame buffer and used as predictors for other
pictures, which was not allowed in MPEG-2.
B-slices utilize a similar MB partitioning to P-Slices;
In addition to the intra modes, four different types of inter-picture prediction are
supported:
List 0: the prediction signal is formed by MC from a picture of the first
reference picture list (list 0);
List 1: the same, but with a picture of list 1;
Bi-predictive: the prediction signal is formed by a weighted average of an MC list 0
and an MC list 1 prediction signal;
Direct: no motion vector is transmitted. Instead, the decoder calculates list 0 and
list 1 vectors based on previously coded vectors and uses these to carry out
bi-predictive MC of the decoded residual samples (spatial or temporal modes).

The main difference between B and P slices: inter-predicted MBs or blocks in B
slices may use a weighted average of two distinct reference frames.
Motion compensation
Direct prediction mode

Spatial: the motion is predicted from neighbouring macroblocks in the same frame. A
possible criterion is to copy the motion vector from a neighbouring block.
This mode is used in uniform zones of the picture where there is not much
movement;
Temporal: motion information is derived by considering the motion parameters of
the co-located MB/block in the subsequent reference picture. These parameters
are scaled according to the temporal distance of the reference (assumption: the
object is moving with constant speed).
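
Under the constant-speed assumption, the temporal direct vectors follow from scaling the co-located vector by a ratio of temporal distances. The sketch below uses plain integer arithmetic with floor division; the standard uses a fixed-point scale factor instead, and the names `tb`, `td` and `temporal_direct_vectors` are chosen here for illustration.

```python
def temporal_direct_vectors(mv_col, tb, td):
    """Derive list 0 and list 1 vectors from the co-located vector.

    mv_col: (x, y) vector of the co-located block, pointing from the
            list 1 reference picture to the list 0 reference picture.
    tb: temporal distance from the list 0 reference to the current picture.
    td: temporal distance from the list 0 reference to the list 1 reference.
    """
    # Simplification (assumption): floor division instead of fixed-point scaling.
    mv_l0 = tuple((tb * component) // td for component in mv_col)
    mv_l1 = tuple(l0 - col for l0, col in zip(mv_l0, mv_col))
    return mv_l0, mv_l1


# Current picture halfway between its two references.
mv_l0, mv_l1 = temporal_direct_vectors((8, -4), tb=1, td=2)
```

When the current picture lies halfway between the references, the two derived vectors are equal in magnitude and opposite in direction.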

Motion compensation

Weighted prediction
Weighted prediction is a method of modifying the samples of motion-compensated
prediction data in a P or B slice MB:
Explicit weighted prediction for P and B slice MB: the weighting factors are
determined by the encoder and transmitted in the slice header;
Implicit weighted prediction for B slice MB: the weighting factors are calculated
based on the relative temporal positions of the list 0 and list 1 reference pictures.
Weighted prediction may be effective in coding of fade transitions.
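
Both variants can be sketched as follows. The fixed-point layout (weights scaled by `2**log_wd`), the 8-bit clipping and the function names are assumptions for illustration; the implicit weights are shown as simple fractions of the temporal distances rather than in the integer form used by the standard.

```python
def explicit_weighted_sample(ref_sample, weight, offset, log_wd=5):
    """Explicit mode: scale and offset one motion-compensated sample.

    weight is an integer scaled by 2**log_wd (16 means 0.5 when log_wd = 5).
    """
    rounding = 1 << (log_wd - 1)
    value = ((ref_sample * weight + rounding) >> log_wd) + offset
    return max(0, min(255, value))


def implicit_weights(poc_cur, poc_ref0, poc_ref1):
    """Implicit mode: weights from the relative temporal positions.

    The reference picture closer in time to the current picture gets
    the larger weight.
    """
    tb = poc_cur - poc_ref0
    td = poc_ref1 - poc_ref0
    w1 = tb / td
    return 1.0 - w1, w1


# Fade to black: the current picture is half as bright as its reference.
faded = explicit_weighted_sample(200, weight=16, offset=0)
w0, w1 = implicit_weights(poc_cur=1, poc_ref0=0, poc_ref1=4)
```

In a fade, plain motion compensation leaves a large residual everywhere; scaling the prediction by the fade factor removes most of it.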

Motion compensation

Motion vector prediction


Encoding a motion vector for each partition can cost a significant number of bits,
especially if small partitions are chosen.
Idea: exploit the correlation that exists between vectors. Usually
there is a high correlation among the MVs of adjacent blocks. Use this to
predict the motion vector, and calculate the difference (MVD);
Goal: motion vector encoding can contribute a significant amount of bits per
picture, especially at low bit rates (overhead). So transmit only the MVD.

Motion compensation

E is the current block, and A, B, C are its neighbors.
Predictor = median(MVA, MVB, MVC)
MVD = MVE - Predictor

If one or more blocks are not available, modify the choice accordingly.
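
A minimal sketch of median prediction, assuming all three neighbours are available and taking the difference as the current vector minus the predictor (the usual convention). The median is computed separately for the horizontal and vertical components; the function names are hypothetical.

```python
def median_predictor(mv_a, mv_b, mv_c):
    """Component-wise median of the three neighbouring motion vectors."""
    return tuple(sorted(components)[1]
                 for components in zip(mv_a, mv_b, mv_c))


def motion_vector_difference(mv_e, predictor):
    """Difference that is actually transmitted for the current block E."""
    return (mv_e[0] - predictor[0], mv_e[1] - predictor[1])


predictor = median_predictor((4, 0), (6, -2), (5, 8))
mvd = motion_vector_difference((7, 1), predictor)
```

Because neighbouring vectors are usually similar, the MVD components are small and cost few bits after entropy coding.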

Motion compensation

For 8×16 partitions, MVp for the left 8×16 partition is predicted from A and
MVp for the right 8×16 partition is predicted from C.
For 16×8 partitions, MVp for the upper 16×8 partition is predicted from B and
MVp for the lower 16×8 partition is predicted from A.

Chroma vectors are derived from luma MV by dividing them by 2.

Intra prediction modes

Intra prediction

This is a novelty compared to previous standards...


Principle
Spatial prediction using surrounding available samples;
Luma intra prediction can be done:
By means of a single prediction for an entire 16×16 macroblock (4 modes);
By means of 16 individual predictions on 4×4 blocks (9 modes).

Chroma intra prediction

Single prediction type for both 8×8 regions.

Intra prediction

16×16 luma prediction modes

Four prediction modes are available:
Mode 0 (vertical): extrapolation from upper samples (H);
Mode 1 (horizontal): extrapolation from left samples (V);
Mode 2 (DC): mean of upper and left-hand samples (H+V);
Mode 3 (Plane): a linear plane function is fitted to the upper and left-hand
samples H and V.

Intra prediction

16×16 luma DC prediction
if TOP and LEFT predictors are available:
mean = (sum(H) + sum(V) + 16) / 32

else if only TOP predictors are available:
mean = (sum(H) + 8) / 16

else if only LEFT predictors are available:
mean = (sum(V) + 8) / 16

else mean = 128
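
The four cases translate directly into code. In this sketch the divisions are written as integer shifts, `None` marks unavailable predictors, and the function name is hypothetical.

```python
def dc_prediction_16x16(top=None, left=None):
    """DC value used to fill a 16x16 luma block.

    top:  the 16 samples above the block (H), or None if unavailable.
    left: the 16 samples to the left of the block (V), or None if unavailable.
    """
    if top is not None and left is not None:
        return (sum(top) + sum(left) + 16) >> 5
    if top is not None:
        return (sum(top) + 8) >> 4
    if left is not None:
        return (sum(left) + 8) >> 4
    return 128  # mid-grey when no neighbour is available


both = dc_prediction_16x16(top=[100] * 16, left=[50] * 16)
only_top = dc_prediction_16x16(top=[10] * 16)
none_available = dc_prediction_16x16()
```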

16×16 Plane mode
Given the top predictors (T0...T15), left predictors (L0...L15) and the top-left corner predictor
(LT) arranged as follows:

Intra prediction

16×16 luma prediction modes

Intra prediction

4×4 luma prediction modes


Nine prediction modes are available.

Intra prediction

4×4 diagonal down/left luma prediction mode

Intra prediction

Deblocking filter

Goal of the deblocking filter

Due to coarse quantization at low bit rates, block-based coding typically results in
visually noticeable discontinuities along the block boundaries.
Idea
To remove such blocking artifacts, a deblocking filter operating within the predictive
coding loop is proposed:
As the coder and the decoder must do the same operation, this filter also
constitutes a required component of the decoding process;
Adaptivity on different levels (slice, edge...);
To improve the appearance of the decoded pictures;
Significantly superior to post filtering (the filter reduces bit rate by typically 5-10
percent);

Algorithm

Filtering is applied to vertical or horizontal edges of 4×4 blocks in a MB in the
following order:

Filtering the vertical boundaries of the luma component;
Filtering the horizontal boundaries of the luma component;
Filtering the vertical boundaries of each chroma component;
Filtering the horizontal boundaries of each chroma component.

Algorithm

The deblocking filter is adaptive:


On slice level: the global filtering strength can be adjusted to the individual
characteristics of the video sequence;
On edge level: the filtering strength depends on inter/intra, motion and coded
residuals;
On sample level: the filtering strength depends on the gradient of image samples
across the boundary.
An especially strong filter for macroblocks with very flat characteristics almost
removes tiling artifacts.
A boundary strength parameter (Bs) is defined for each block edge: Bs ∈ {0, 1, 2, 3, 4}, where
0 means no filtering and 4 the strongest filtering.

Algorithm

Filter decision
The set of eight pixels across a 4×4 block horizontal or vertical boundary is denoted
p3, p2, p1, p0, q0, q1, q2, q3, with the actual boundary between p0 and q0.
Filtering condition: a group of samples is filtered only if:
$Bs \neq 0$ and
$|p_0 - q_0| < \alpha$ and $|p_1 - p_0| < \beta$ and $|q_1 - q_0| < \beta$.
$\alpha$ and $\beta$ are thresholds defined in the standard. They increase with the average quantiser
parameter Qp of the two blocks.

Filter disabled: there is a high gradient across the block boundary in the original image;
Qp is small: $\alpha$ and $\beta$ are small (the probability of blocking effects is very low);
Qp is high: $\alpha$ and $\beta$ are high.
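
The filtering condition can be written as a single predicate. In this sketch the thresholds `alpha` and `beta` are passed in as plain numbers; the tables that map Qp to these thresholds are defined in the standard and are not reproduced here, and the function name and example values are hypothetical.

```python
def should_filter(bs, p1, p0, q0, q1, alpha, beta):
    """Decide whether the samples around one boundary position are filtered.

    p1, p0 lie on one side of the boundary, q0, q1 on the other side.
    """
    return (bs != 0
            and abs(p0 - q0) < alpha
            and abs(p1 - p0) < beta
            and abs(q1 - q0) < beta)


# Small step across the boundary: likely a blocking artifact, so filter it.
artifact = should_filter(2, 100, 102, 106, 107, alpha=10, beta=4)
# Large step: likely a real edge of the image content, so leave it untouched.
real_edge = should_filter(2, 50, 52, 150, 151, alpha=10, beta=4)
```

Since the thresholds grow with Qp, coarse quantization filters larger steps, while finely quantized pictures are left almost untouched.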
From [Richardson,03].


Transform

Introduction

H.264 uses 3 transforms depending on the type of residual data:

A Hadamard transform for the 4×4 array of luma DC coefficients in Intra MBs
predicted in 16×16 mode;
A Hadamard transform for the 2×2 array of chroma DC coefficients;
A DCT-based transform for all other 4×4 blocks in the residual data.


4×4 residual transform
A DCT-based transform

This transform operates on 4×4 blocks of residual data (after MC prediction or
intra prediction). Fundamental differences with previous standards:
It is an integer transform;
It is possible to ensure zero mismatch between encoder and decoder;
It can be implemented using only additions and shifts;
A scaling multiplication is integrated into the quantizer (to reduce the total
number of multiplications).
Separable transform of a block $B_{4\times4}$: $C_{4\times4} = T_v B_{4\times4} T_h$, with

$$
T_v = T_h^{T} =
\begin{pmatrix}
1 & 1 & 1 & 1 \\
2 & 1 & -1 & -2 \\
1 & -1 & -1 & 1 \\
1 & -2 & 2 & -1
\end{pmatrix}
$$
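
As a minimal sketch of the separable integer transform, the Python below computes C = Tv · B · Th with Tv = [[1, 1, 1, 1], [2, 1, -1, -2], [1, -1, -1, 1], [1, -2, 2, -1]] and Th its transpose. The helper names are assumptions made for this example, and the scaling step that is folded into the quantizer is deliberately left out.

```python
# Core 4x4 forward transform matrix Tv; Th is its transpose.
T = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Multiply two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_transform(block):
    """Compute C = Tv * B * Th using integer arithmetic only."""
    return matmul(matmul(T, block), transpose(T))


# A flat residual block has all of its energy in the DC coefficient.
flat = [[1] * 4 for _ in range(4)]
coeffs = forward_transform(flat)
print(coeffs[0][0])  # 16
print(coeffs[1])     # [0, 0, 0, 0]
```

Because every entry of the matrix is ±1 or ±2, the products can be replaced by additions and one-bit shifts, which is the property the slide refers to.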


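4×4 Hadamard transform
Intra 16×16 MB type

A Hadamard transform is applied to the 4×4 array of luma DC coefficients in Intra MBs
predicted in 16×16 mode.
The Hadamard transform $H_m$ is a $2^m \times 2^m$ matrix that transforms $2^m$ real numbers
into $2^m$ real numbers. The Hadamard matrix is commonly defined with a recursive
approach. We define the 1×1 Hadamard transform $H_0$ by the identity $H_0 = 1$, and
then define $H_m$ for $m > 0$ by:

$$
H_m = \frac{1}{\sqrt{2}}
\begin{pmatrix}
H_{m-1} & H_{m-1} \\
H_{m-1} & -H_{m-1}
\end{pmatrix}
$$

$$
H_1 = \frac{1}{\sqrt{2}}
\begin{pmatrix}
1 & 1 \\
1 & -1
\end{pmatrix},
\qquad
H_2 = \frac{1}{2}
\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & -1 & 1 & -1 \\
1 & 1 & -1 & -1 \\
1 & -1 & -1 & 1
\end{pmatrix}
$$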
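
The recursive definition can be checked with a short Python sketch. To stay in integer arithmetic, the function below (a hypothetical helper named `hadamard_unscaled`) builds only the ±1 part of the matrix; the normalised H_m is this matrix multiplied by 2^(-m/2).

```python
def hadamard_unscaled(m):
    """Return the 2^m x 2^m matrix of +1/-1 entries of H_m (scale factor omitted)."""
    if m == 0:
        return [[1]]
    h = hadamard_unscaled(m - 1)
    top = [row + row for row in h]                   # [ H  H ]
    bottom = [row + [-x for x in row] for row in h]  # [ H -H ]
    return top + bottom


for row in hadamard_unscaled(2):
    print(row)
# [1, 1, 1, 1]
# [1, -1, 1, -1]
# [1, 1, -1, -1]
# [1, -1, -1, 1]
```

The rows are mutually orthogonal, so applying the same matrix twice returns the input up to a constant factor, which is why the transform is cheap to invert.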


4×4 Hadamard transform

[Figure: (a) luma 4×4 DC coefficients; (b) chroma 2×2 DC coefficients.]


Zig-zag scan

[Figure: zig-zag scan order for (a) a frame block, (b) a field block, (c) a 4×4 frame block.]
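
For a frame block, the zig-zag scan can be generated rather than tabulated. The sketch below is one possible way to do it (the function names are assumptions): positions are sorted by anti-diagonal, and the traversal direction alternates from one diagonal to the next. It covers only the frame scan, not the field scan.

```python
def zigzag_order(n=4):
    """Return raster indices (row * n + col) of an n x n block in zig-zag order."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Sort by anti-diagonal r + c; odd diagonals run top to bottom,
    # even diagonals run bottom to top.
    positions.sort(key=lambda rc: (rc[0] + rc[1],
                                   rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))
    return [r * n + c for r, c in positions]


def zigzag_scan(block):
    """Flatten a square block into a list following the zig-zag order."""
    n = len(block)
    return [block[i // n][i % n] for i in zigzag_order(n)]


print(zigzag_order(4))
# [0, 1, 4, 8, 5, 2, 3, 6, 9, 12, 13, 10, 7, 11, 14, 15]
```

The scan places low-frequency coefficients first, so the zeros produced by quantisation tend to cluster at the end of the list.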


Entropy coding


Introduction

Definition (Entropy coding)

Entropy coding converts a vector X of integers from a source S into a binary
stream Y. It exploits the redundancies in the statistical distribution of X to reduce the
size of Y as much as possible (Variable Length Codes).
Aim: lossless compression
Idea: represent redundant or repeated data with fewer bits
Techniques:
Run-length coding
Huffman coding
Basic arithmetic coding
CABAC (Context Adaptive Binary Arithmetic Coding)
CAVLC (Context Adaptive VLC)
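
To make "redundancy in the statistical distribution" concrete, the sketch below computes the empirical Shannon entropy of a symbol sequence, which is the average number of bits per symbol that an ideal lossless coder would need for symbols drawn independently from that distribution. The function name is an assumption made for this example.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(symbols):
    """Empirical Shannon entropy of a sequence, in bits per symbol."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


uniform = [0, 1, 2, 3] * 4    # four equally likely values
skewed = [0] * 14 + [1, 2]    # mostly zeros, as in quantised residuals

print(entropy_bits_per_symbol(uniform))  # 2.0
print(entropy_bits_per_symbol(skewed))   # well below 2 bits per symbol
```

A fixed-length code would spend 2 bits per symbol in both cases; a variable-length or arithmetic code can approach the lower figure for the skewed source.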

CA-VLC


Goal: to encode residual, zig-zag ordered 4×4 (and 2×2) blocks of transform
coefficients.
Principle
The encoder switches between different VLC tables for various syntax elements,
depending on the values of the previously transmitted syntax elements in the same
slice.

After prediction, transformation and quantization, blocks are typically sparse
→ Run Length Encoding
The highest-frequency nonzero coefficients after the zig-zag scan are often sequences of ±1
The number of nonzero coefficients in neighbouring blocks is correlated → LUT choice
The magnitude of nonzero coefficients tends to be larger near the DC coefficient
→ adapting the choice of LUT


CAVLC uses run-level coding to represent strings of zeros compactly.

1 Encode the number of coefficients and trailing ones (coeff_token):
The total number of nonzero coefficients (TotalCoeffs) ∈ {0, ..., 16};
TrailingOnes can be anything from 0 to 3.
Four LUTs are available (three VLC tables and one fixed-length code table). The choice
of table depends on the number of nonzero coefficients in previously coded blocks (=
Context-adaptive).
2 Encode the sign of each TrailingOne: for each TrailingOne signalled by
coeff_token, the sign is encoded with a single bit in reverse order, starting with
the highest-frequency TrailingOne;
3 Encode the levels of the remaining nonzero coefficients:
Level of nonzero coefficients = sign + magnitude;
Encoded in reverse order;
Code for each level = level_prefix + level_suffix. This last value is adapted
depending on the magnitude of each successive coded level (difference of
magnitude, threshold and LUT) (= Context-adaptive).
4 Encode the total number of zeros before the last coefficient: the sum of all
zeros preceding the highest nonzero coefficient in the reordered array is coded
with a VLC;
5 Encode each run of zeros: the number of zeros preceding each nonzero coefficient is
encoded in reverse order.
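
The five steps can be illustrated by deriving the values that CAVLC would encode for one block. The Python below is a simplified sketch, not an encoder: it produces no bits because the VLC tables are not reproduced here, the names `cavlc_elements` and `choose_coeff_token_table` are assumptions, and the input block is a made-up example. The table-selection rule assumes the usual 0-1 / 2-3 / 4-7 / 8+ split on the rounded average of the left and upper neighbours, with both neighbours available.

```python
def cavlc_elements(coeffs):
    """Derive the CAVLC syntax element values for one zig-zag ordered block."""
    nonzero_pos = [i for i, c in enumerate(coeffs) if c != 0]
    total_coeffs = len(nonzero_pos)
    # Nonzero levels in reverse order (highest frequency first).
    rev_levels = [coeffs[i] for i in reversed(nonzero_pos)]
    # Trailing ones: at most three consecutive +/-1 at the high-frequency end.
    trailing_ones = 0
    for level in rev_levels:
        if abs(level) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break
    signs = ["-" if lv < 0 else "+" for lv in rev_levels[:trailing_ones]]
    # Zeros located before the last nonzero coefficient.
    total_zeros = nonzero_pos[-1] + 1 - total_coeffs if nonzero_pos else 0
    # Run of zeros preceding each nonzero coefficient, in reverse order.
    # (A real encoder stops once all zeros are accounted for.)
    runs, prev = [], -1
    for pos in nonzero_pos:
        runs.append(pos - prev - 1)
        prev = pos
    return {
        "TotalCoeffs": total_coeffs,
        "TrailingOnes": trailing_ones,
        "trailing_signs": signs,
        "levels": rev_levels[trailing_ones:],
        "total_zeros": total_zeros,
        "run_before": runs[::-1],
    }


def choose_coeff_token_table(n_left, n_up):
    """Pick the coeff_token table from the neighbours' nonzero counts."""
    n_c = (n_left + n_up + 1) >> 1  # rounded average of the two neighbours
    if n_c < 2:
        return "VLC table 0"
    if n_c < 4:
        return "VLC table 1"
    if n_c < 8:
        return "VLC table 2"
    return "fixed-length table"


block = [0, 3, 0, 1, -1, -1, 0, 1] + [0] * 8  # hypothetical zig-zag ordered block
print(cavlc_elements(block))
print(choose_coeff_token_table(4, 5))
```

For this block the sketch reports 5 nonzero coefficients, 3 trailing ones with signs +, -, - (in reverse order), remaining levels 1 and 3, and 3 zeros before the last coefficient.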

CA-VLC: example

[Figure: CAVLC encoding example.] Resulting bitstream: 000000011010001001000010111001100

CABAC


Context-based adaptive binary arithmetic coding

Differences with CAVLC
cabAC, Arithmetic Coding = non-integer number of bits per symbol by using
arithmetic codes;
caBac, Binary;
CAbac, Context Adaptive.
Good compression performance due to:
Selecting probability models for each syntax element according to the element's
context;
Adapting probability estimates based on local statistics;
Using arithmetic coding rather than VLC.


CABAC - Arithmetic coding

The entire word is coded as one number in the range [0, 1)
The range is divided into subranges
One subrange for each symbol
The length of each subrange is proportional to the probability of the symbol
Example: encode SQUEEZE


Update the range each time with the new symbol
Final range [0.64769, 0.64777]
Pick a binary number in this range with the smallest number of bits
0.101001011101 (binary) ≈ 0.647705
12 bits only!!
Longer messages → more repeated symbols → better performance
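
Both steps, narrowing the range and picking the shortest binary number inside it, can be sketched with exact fractions. The function names are assumptions for this example, and so is the probability model for SQUEEZE (letter frequencies of the word itself): the probability table used in the slide's figure is not available here, so the interval computed by `narrow_interval` differs from the one quoted above. The last line instead applies the second step directly to the quoted range.

```python
import math
from fractions import Fraction


def narrow_interval(message, probs):
    """Return the final [low, high) interval after coding every symbol."""
    cumulative, acc = {}, Fraction(0)
    for symbol, p in probs.items():
        cumulative[symbol] = acc  # start of this symbol's subrange
        acc += p
    low, width = Fraction(0), Fraction(1)
    for symbol in message:
        low += width * cumulative[symbol]
        width *= probs[symbol]
    return low, low + width


def shortest_binary_fraction(low, high):
    """Return the fewest bits b1 b2 ... such that low <= 0.b1b2... < high."""
    n = 1
    while True:
        k = math.ceil(low * 2**n)  # smallest multiple of 2^-n that is >= low
        if Fraction(k, 2**n) < high:
            return format(k, "0{}b".format(n))
        n += 1


# Hypothetical model: probabilities taken from the letter counts of SQUEEZE.
model = {"S": Fraction(1, 7), "Q": Fraction(1, 7), "U": Fraction(1, 7),
         "E": Fraction(3, 7), "Z": Fraction(1, 7)}
low, high = narrow_interval("SQUEEZE", model)
print(float(low), float(high))

# Second step applied to the range quoted above.
print(shortest_binary_fraction(Fraction("0.64769"), Fraction("0.64777")))
# 101001011101
```

The width of the final interval equals the product of the symbol probabilities, which is why likely symbols cost fewer bits than unlikely ones.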


CABAC - Binarization and Context Adaptive

Apply a binarization scheme to nonbinary symbols

Advantages:
Using a binary arithmetic coder instead of an n-ary arithmetic coder
Faster and more accurate estimation of conditional probabilities (alphabet reduced
to 2 symbols)

Context Adaptive = adapt the probabilities of symbols to the context

Basic example: SQUEEZE
The probability of U in English is 3%
The probability of U after Q is almost 99.9%

The context here is the previous letter; 4 different context models are employed in CABAC
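
The benefit of conditioning on a context can be shown with a toy adaptive model. This is a sketch under stated assumptions: the class `AdaptiveBinaryModel` and the function `ideal_cost_bits` are hypothetical names, the probability estimate uses simple counts (CABAC itself uses a table-driven state machine), and the cost is the ideal arithmetic-coding cost of -log2(p) bits per binary symbol.

```python
import math
from collections import defaultdict


class AdaptiveBinaryModel:
    """Per-context estimate of P(bit = 1), updated after every coded bit."""

    def __init__(self):
        # Each context starts with one pseudo-observation of 0 and one of 1.
        self.counts = defaultdict(lambda: [1, 1])

    def p_one(self, context):
        c0, c1 = self.counts[context]
        return c1 / (c0 + c1)

    def update(self, context, bit):
        self.counts[context][bit] += 1


def ideal_cost_bits(bits, contexts, model):
    """Sum of -log2(p) over the sequence, adapting the model as it goes."""
    total = 0.0
    for bit, ctx in zip(bits, contexts):
        p1 = model.p_one(ctx)
        total += -math.log2(p1 if bit == 1 else 1.0 - p1)
        model.update(ctx, bit)
    return total


bits = [0, 1] * 50                  # strictly alternating sequence
previous_bit = [0] + bits[:-1]      # context = previous bit
single_context = [0] * len(bits)    # no context information

print(ideal_cost_bits(bits, previous_bit, AdaptiveBinaryModel()))
print(ideal_cost_bits(bits, single_context, AdaptiveBinaryModel()))
```

With the previous bit as context the next bit becomes almost certain, just as U is almost certain after Q, and the cost drops far below one bit per symbol.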


Profiles


Profile


The Baseline profile was targeted at applications in which a minimum of
computational complexity and a maximum of error robustness are required;
The Main profile was aimed at applications that require a maximum of coding
efficiency, with somewhat less emphasis on error robustness.

Baseline Profile
I/P slices
Multiple reference frames
In-loop deblocking
CAVLC entropy coding

Main Profile
Baseline Profile features mentioned above
B slices
CABAC entropy coding
Interlaced coding - PAFF/MBAFF
Weighted prediction

High Profile
Main Profile features mentioned above
8×8 transform option
Custom quantisation matrices
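
The nesting of the three feature lists can be written down as sets. The sketch below only encodes the features listed on this slide (the full profile definitions in the standard contain more tools), and the names `PROFILES` and `simplest_profile` are assumptions made for the example.

```python
BASELINE = {"I/P slices", "multiple reference frames",
            "in-loop deblocking", "CAVLC"}
MAIN = BASELINE | {"B slices", "CABAC",
                   "interlaced coding (PAFF/MBAFF)", "weighted prediction"}
HIGH = MAIN | {"8x8 transform", "custom quantisation matrices"}

# Ordered from the smallest to the largest feature set.
PROFILES = {"Baseline": BASELINE, "Main": MAIN, "High": HIGH}


def simplest_profile(required):
    """Return the first listed profile that offers every required feature."""
    for name, features in PROFILES.items():
        if set(required) <= features:
            return name
    return None


print(simplest_profile({"CAVLC", "I/P slices"}))     # Baseline
print(simplest_profile({"CABAC", "B slices"}))       # Main
print(simplest_profile({"CABAC", "8x8 transform"}))  # High
```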


Scalable Video Coding


Introduction

The Scalable Video Coding (SVC) amendment of the H.264/AVC standard
provides network-friendly scalability at the bit-stream level with a
moderate increase in decoder complexity relative to single-layer H.264/AVC.

Use cases: video telephony and video conferencing, mobile TV, wireless and
Internet video streaming, standard- and high-definition TV broadcasting, storage...
Compress once, decompress many ways!!


Types of scalability

MANE = media-aware network element



Performances


Multiview Video Coding

The multiview video coding standardization activity in MPEG is based on the
definition of video compression algorithms for multiview video, i.e., video sequences
recorded simultaneously from multiple cameras.
The need for multiview video coding is driven by two recent technological
developments:
new 3D display technologies;
the growing use of multi-camera arrays.

[Figure: array of 16 cameras.]


Multiview Video Coding - use cases


Multiview Video Coding

In July 2008, MPEG officially approved an amendment of the ITU-T Rec. H.264 & ISO/IEC
14496-10 Advanced Video Coding (AVC) standard on Multiview Video Coding. This new
standard enables an efficient compressed representation of stereo and multiview video by
exploiting correlation among neighboring camera views to support 3D and free-viewpoint video
applications.

[Figure: (a) configuration; (b) frame coding structure. From [Dufaux,07].]

[Figure: inter-view prediction of key pictures; inter-view prediction for key and non-key pictures.]


The compression algorithm strongly depends on the data representation and on the
targeted applications.

Construction complexity (amount of processing required by the representation
construction);
Compactness (amount of physical data stored in the representation);
Compression compatibility (number of bits needed to describe the scene at a given
quality);
View-synthesis complexity (amount of processing needed to synthesize the views at the
user side);
Navigation range and image quality.
From T. Colleu's PhD thesis, http://www.irisa.fr/temics/staff/colleu/.

Multiview Video Coding - Depth image-based representations


Multi-view video plus depth (MVD) = 2D+Z representation



Layered Depth Video:
Built from the MVD representation;
Reduces the multi-view redundancies while preserving important information
(occluded regions);
Pixels are no longer composed of a single color and a single depth value, but can
contain several colors and associated depth values.
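
A layered pixel of this kind can be modelled as a small data structure. The sketch below is purely illustrative: the class name `LayeredPixel`, the RGB tuple for color, and the convention that a smaller depth value is closer to the camera are all assumptions, not part of any standardized format.

```python
from dataclasses import dataclass, field


@dataclass
class LayeredPixel:
    """One pixel holding several (color, depth) layers, nearest surface first."""
    layers: list = field(default_factory=list)

    def add(self, color, depth):
        self.layers.append((color, depth))
        # Assumed convention: smaller depth means closer to the camera.
        self.layers.sort(key=lambda layer: layer[1])

    def front(self):
        """Return the visible (nearest) layer."""
        return self.layers[0]


pixel = LayeredPixel()
pixel.add((200, 180, 160), depth=5.0)  # background surface, occluded in this view
pixel.add((30, 60, 90), depth=1.2)     # foreground surface

print(pixel.front())       # ((30, 60, 90), 1.2)
print(len(pixel.layers))   # 2
```

Keeping the occluded layer is what allows a renderer to fill the holes that appear when a new viewpoint reveals what was hidden behind the foreground.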


Multiview Video Coding - Summary


Conclusion


Just a hybrid video codec, but with important differences compared to previous
standards...

Enhanced motion compensation (multiple reference frames);
Small blocks for transform coding with different shapes;
In-loop deblocking filter;
Enhanced entropy coding.
Substantial bit-rate savings (up to 50%) at the same perceptual quality.


Complexity of the encoder is 3 to 4 times greater than prior encoders.


Complexity of the decoder is 2 to 3 times greater than prior decoders.


H.265 / High Efficiency Video Coding / HVC: Beyond H.264


[Figure: standardization process over time t: CfE, CfP, assessment of proposals, first solution, then iterations of core experiments on the Verification Model (VM) and VM evolution.]

MPEG's current timetable for developing HVC is as follows:

January 2010: Call for Proposals issued;
Feb-April 2010: Technology proposals submitted and evaluated;
Late 2010: Test Model reference codec developed;
2011-2012: Draft versions of the new standard;
2012/2013?: New video coding standard published.
Goals:
there is likely to be a need for a new compression format, as consumers demand
higher-quality video and as processing capacity improves;
there is potential to deliver better performance than the current state of the art;
the next generation of ultra-HD (UHD) content and devices (4K×2K).

Suggestion for further reading...


[Dufaux,07] F. Dufaux, M. Ouaret and T. Ebrahimi, Recent Advances in Multi-view
Distributed Video Coding, DSS, 2007.
[Richardson,03] I.E.G. Richardson, H.264 and MPEG-4 Video Compression, John Wiley & Sons,
2003.
[Yu et al,2006] L. Yu et al., Fast Frame/Field Coding for H.264/AVC, ICDT, 2006.
