MPEG-4 standard
O. Le Meur
olemeur@irisa.fr
Univ. of Rennes 1
http://www.irisa.fr/temics/staff/lemeur/
October 1, 2012
Table of Contents
MPEG-4 standard
Hierarchical syntax
Transform
Entropy coding
Profiles
Amendments
Conclusion
MPEG-4
History
[Figure: standardization process — CfP, first solutions, assessment of proposals, iterations, Verification Model (VM), VM evolution]
Terminology
H.26L (has become outdated);
JVT (Joint Video Team) or JVT codec;
JM2.x, JM3.x, JM4.x (reference software versions);
AVC or Advanced Video Coding.
For this part, most of the figures have been extracted from B. Girod's course (EE398 Image and Video Compression).
A big toolbox
Hierarchical syntax
Slice and macroblock
Macroblock
Basic syntax and processing unit;
1 MB is composed of 16x16 pixels of luminance and two 8x8 blocks of chrominance (4:2:0);
MBs within a slice depend on each other;
A MB can be further partitioned.
Slice group
A slice group is a subset of the MBs in a coded picture and may contain one or more slices.
The goal is to divide the picture into different scan patterns of macroblocks. Different types can be used:
Type 0: interleaved;
Type 1: dispersed;
Type 2: foreground and background. All but the last slice group are defined as
rectangular regions within the picture. The last slice group contains all MBs not
contained in any other slice group (background);
Type 3: box-out. A box is created starting from the center of the frame;
Type 4: raster scan;
Type 5: wipe (vertical scan order);
Type 6: explicit (a MB map is entirely user-defined).
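As an illustration, the interleaved map (type 0) can be sketched by assigning macroblocks to slice groups in round-robin runs; the picture size and run lengths below are illustrative, not taken from the standard.

```python
# Sketch of an FMO type-0 (interleaved) slice-group map: MBs in raster order
# are assigned to slice groups in cyclic runs. Run lengths are illustrative.
def interleaved_slice_group_map(num_mbs, run_lengths):
    """Return a list mapping each MB (raster order) to a slice-group id."""
    mb_map = []
    group = 0
    while len(mb_map) < num_mbs:
        run = run_lengths[group]
        mb_map.extend([group] * min(run, num_mbs - len(mb_map)))
        group = (group + 1) % len(run_lengths)
    return mb_map

# A QCIF picture (11x9 = 99 MBs) split into two interleaved slice groups:
mb_map = interleaved_slice_group_map(99, [3, 2])
```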
Example extracted from "Unequal Error Protection Technique for ROI Based H.264 Video Coding", H. K. Arachchi, CCECE 2006.
The H.264 network abstraction layer (NAL) encodes each slice into a separate data packet.
In spatial interpolation, the values of missing pixels are estimated from the surrounding
pixels of the same frame, without using the temporal information.
Temporal interpolation is based on the corresponding regions of the reference frames. If a
motion vector is missing, it can be estimated based on the motion vectors of the
surrounding regions.
Combination schemes use an adaptive mechanism to choose the best concealment method.
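The spatial scheme can be sketched as follows; the 4-neighbour averaging and the fallback value are illustrative simplifications, not the exact concealment algorithm.

```python
# Sketch of spatial error concealment: a missing pixel is estimated as the
# average of its available (correctly received) 4-neighbours in the same frame.
def conceal_pixel(frame, y, x, lost):
    """frame: 2-D list of pixel values; lost: set of (y, x) lost positions."""
    h, w = len(frame), len(frame[0])
    neighbours = [frame[ny][nx]
                  for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                  if 0 <= ny < h and 0 <= nx < w and (ny, nx) not in lost]
    # Fallback to mid-grey if no neighbour survived (illustrative choice).
    return sum(neighbours) // len(neighbours) if neighbours else 128

frame = [[10, 20, 30],
         [40,  0, 60],
         [70, 80, 90]]
value = conceal_pixel(frame, 1, 1, lost={(1, 1)})  # (20+80+40+60)//4 = 50
```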
Slice
I-Slice
An I slice contains only intra-coded MBs (predicted from previously coded samples in the same
slice). I=Intra.
P-Slice
A P slice can contain inter-coded MBs (predicted from samples in previously coded pictures), intra-coded MBs or skipped MBs. P=Predictive.
B-Slice
A B slice can contain inter-coded MBs (each MB partition can be predicted from samples of one or two reference pictures before and after the current picture). B=Bi-predictive (≠ bi-directional as in MPEG-2!).
Macroblock
MB-AFF=MB-adaptive frame/field
The concept of macroblock frame/field coding decision originated from the MPEG-2 standard. Instead of splitting a 16x16 MB into two 16x8 blocks, a super MB is defined as the decision unit (16x32).
The frame is scanned as MB pairs. For each MB pair (16x32), the frame/field coding type is decided. A super MB can be coded as:
two frame MBs of 16x16;
one top-field MB of 16x16 and one bottom-field MB of 16x16.
Coding a MB pair in field mode requires modifications to a number of the encoding and decoding steps.
Motion compensation
Multiple Reference Frames
Motion vector prediction
Motion Vectors
Each partition or sub-MB partition is predicted from an area of the same size and shape in a reference picture.
The motion vector has a quarter-sample resolution for the luma component and
one-eighth-sample resolution for the chroma components.
Sub-pixel motion compensation can provide significantly better compression
performance than integer-pixel compensation, at the expense of increased complexity.
Quarter-pixel accuracy outperforms half-pixel accuracy.
Half-sample positions are obtained by applying a 6-tap filter with tap values (1/32, -5/32, 20/32, 20/32, -5/32, 1/32):
Sample b: b = floor((E - 5F + 20G + 20H - 5I + J)/32)
Sample h: h = floor((A - 5C + 20G + 20M - 5R + T)/32)
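A sketch of this filter, following the floor form above (the standard additionally adds a rounding offset before the division and clips the result to the sample range):

```python
# 6-tap half-pel filter with taps (1, -5, 20, 20, -5, 1)/32, applied to six
# consecutive integer-position samples (E..J for sample b, A..T for sample h).
def half_pel(s):
    """s: six integer-position samples; returns the clipped half-pel value."""
    val = (s[0] - 5 * s[1] + 20 * s[2] + 20 * s[3] - 5 * s[4] + s[5]) // 32
    return max(0, min(255, val))  # clip to the 8-bit sample range

half_pel([100, 100, 100, 100, 100, 100])  # a flat area is preserved: 100
```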
Once all the half-pel samples are available, the samples at quarter-pel positions are produced by linear interpolation (average of samples at integer and half-pel positions).
H.264/AVC allows unrestricted motion vectors (motion vectors can point outside the image area; in this case, the reference frame is extended beyond the image boundaries by repeating the edge pixels before interpolation).
From [Richardson,03].
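A minimal sketch of this edge extension: instead of physically padding the reference frame, out-of-picture coordinates can simply be clamped to the nearest edge pixel.

```python
# Edge-pixel repetition for unrestricted motion vectors: a reference sample
# outside the picture is replaced by the nearest edge pixel (coordinate clamp).
def sample_padded(ref, y, x):
    h, w = len(ref), len(ref[0])
    y = max(0, min(h - 1, y))  # clamp row to the picture
    x = max(0, min(w - 1, x))  # clamp column to the picture
    return ref[y][x]

ref = [[1, 2],
       [3, 4]]
sample_padded(ref, -5, 1)  # row clamped to 0 -> 2
```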
Direct prediction mode
Weighted prediction
Weighted prediction is a method of modifying the samples of motion-compensated
prediction data in a P or B slice MB:
Explicit weighted prediction for P and B slice MB: the weighting factors are
determined by the encoder and transmitted in the slice header;
Implicit weighted prediction for B slice MB: the weighting factors are calculated
based on the relative temporal positions of the list 0 and list 1 reference pictures.
Weighted prediction may be effective in coding of fade transitions.
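A sketch of the explicit case: the motion-compensated sample is scaled and offset as below. The rounding form and the name logWD (weight-denominator exponent) follow common descriptions of H.264 weighted prediction; treat the exact details as an assumption here.

```python
# Sketch of explicit weighted prediction: the motion-compensated sample is
# scaled by a weight w and shifted by an offset signalled in the slice header.
def weighted_pred(sample, w, offset, logWD):
    # Rounded scaling: (sample*w + 2^(logWD-1)) >> logWD, then add the offset.
    val = ((sample * w + (1 << (logWD - 1))) >> logWD) + offset
    return max(0, min(255, val))  # clip to the 8-bit sample range

# A fade-to-black modelled by halving the prediction (w=16 with logWD=5):
weighted_pred(200, 16, 0, 5)  # -> 100
```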
Motion vector prediction
For 8x16 partitions, MVp for the left 8x16 partition is predicted from A and MVp for the right 8x16 partition is predicted from C.
For 16x8 partitions, MVp for the upper 16x8 partition is predicted from B and MVp for the lower 16x8 partition is predicted from A.
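These partition-dependent choices can be sketched as below; the component-wise median of A, B and C used as the general fallback is the standard's 16x16 rule, included here for completeness.

```python
# Sketch of the MV predictor (MVp) choices for the partitions stated above.
# mv_A, mv_B, mv_C are the motion vectors of the left, top and top-right
# neighbouring blocks, as (x, y) tuples.
def mv_predictor(partition, half, mv_A, mv_B, mv_C):
    """partition: '8x16', '16x8' or '16x16'; half: 'left'/'right', 'top'/'bottom'."""
    if partition == '8x16':
        return mv_A if half == 'left' else mv_C
    if partition == '16x8':
        return mv_B if half == 'top' else mv_A
    # General case (e.g. 16x16): component-wise median of the neighbours.
    return tuple(sorted(c)[1] for c in zip(mv_A, mv_B, mv_C))

mv_predictor('16x8', 'top', (1, 1), (4, 0), (2, 5))  # -> (4, 0), taken from B
```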
Principle
16x16 Intra prediction modes
4x4 Intra prediction modes
Intra prediction
16x16 luma DC prediction
If the TOP and LEFT predictors are available:
mean = (sum(H) + sum(V) + 16) / 32
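A one-line sketch of this DC mode (H = the 16 top neighbours, V = the 16 left neighbours):

```python
# 16x16 luma DC intra prediction: every pixel of the block is predicted by the
# rounded mean of the 32 reconstructed neighbours above and to the left.
def dc_prediction(top, left):
    """top, left: the 16 reconstructed neighbours above / to the left."""
    return (sum(top) + sum(left) + 16) >> 5  # (sum + 16) / 32, rounded

dc_prediction([100] * 16, [120] * 16)  # (1600 + 1920 + 16) >> 5 = 110
```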
16x16 Plane mode
Given the top predictors (T0...T15), the left predictors (L0...L15) and the top-left corner predictor (LT) arranged as follows:
Deblocking filter
Due to coarse quantization at low bit rates, block-based coding typically results in
visually noticeable discontinuities along the block boundaries.
Idea
To remove such blocking artifacts, a deblocking filter operating within the predictive
coding loop is proposed:
As the coder and the decoder must do the same operation, this filter also
constitutes a required component of the decoding process;
Adaptivity on different levels (slice, edge...);
To improve the appearance of the decoded pictures;
Significantly superior to post-filtering (the filter typically reduces bit rate by 5-10 percent).
Algorithm
Filter decision
The set of eight pixels across a 4x4 block horizontal or vertical boundary is denoted as shown below, with the actual boundary between p0 and q0.
Filtering condition: a group of samples is filtered only if:
Bs ≠ 0 and
|p0 - q0| < α and |p1 - p0| < β and |q1 - q0| < β.
α and β are defined in the standard; they increase with the average quantiser parameter Qp of the two blocks:
filtering is disabled when there is a high gradient across the block boundary, since it is then likely a real edge in the original image;
Qp is small: α and β are small (the probability of blocking artefacts is very low);
Qp is high: α and β are high.
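The filtering condition can be sketched directly; the sample values and thresholds below are illustrative.

```python
# Deblocking filter decision across a block edge: filter only if the boundary
# strength Bs is nonzero and all local gradients are below the QP-dependent
# thresholds alpha and beta.
def should_filter(bs, p1, p0, q0, q1, alpha, beta):
    return (bs != 0
            and abs(p0 - q0) < alpha   # step across the edge
            and abs(p1 - p0) < beta    # gradient inside block P
            and abs(q1 - q0) < beta)   # gradient inside block Q

# A small blocking step (|p0-q0|=4) is filtered; a true edge (|p0-q0|=40,
# exceeding alpha) is left untouched:
should_filter(2, 80, 82, 86, 88, alpha=10, beta=4)
```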
Examples
From [Richardson,03].
Transform
Introduction
4x4 residual transform
A DCT-based integer transform:

Tv = Th = [ 1  1  1  1 ]
          [ 2  1 -1 -2 ]
          [ 1 -1 -1  1 ]
          [ 1 -2  2 -1 ]
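A sketch applying this forward core transform as Y = T X Tᵗ with plain integer arithmetic (the real codec also folds scaling factors into quantisation, which is omitted here):

```python
# The 4x4 forward core transform above, Y = T X T^t. In practice this is
# implemented with only additions, subtractions and shifts.
T = [[1,  1,  1,  1],
     [2,  1, -1, -2],
     [1, -1, -1,  1],
     [1, -2,  2, -1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def forward_transform(X):
    Tt = [list(row) for row in zip(*T)]  # transpose of T
    return matmul(matmul(T, X), Tt)

# A constant residual block transforms to a single DC coefficient:
Y = forward_transform([[1] * 4 for _ in range(4)])  # Y[0][0] = 16, rest 0
```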
4x4 Hadamard transform
Intra 16x16 MB type: the 16 luma DC coefficients are further transformed with a 4x4 Hadamard transform H1, and the chroma DC coefficients with a 2x2 Hadamard transform H2:

H1 = [ 1  1  1  1 ]        H2 = 1/2 [ 1  1 ]
     [ 1  1 -1 -1 ]                 [ 1 -1 ]
     [ 1 -1 -1  1 ]
     [ 1 -1  1 -1 ]
[Figure: (a) Luma 4x4 DC blocks, (b) Chroma 2x2 DC blocks]
Zig-zag scan
[Figure: (a) zig-zag scan order (frame block), (b) zig-zag scan order (field block)]
Entropy coding
Introduction
Run-length coding
Huffman coding
Basic arithmetic coding
CABAC (Context Adaptive Binary Arithmetic Coding)
CAVLC (Context Adaptive VLC)
CA-VLC
The highest-frequency nonzero coefficients after the zig-zag scan are often sequences of ±1 ("trailing ones").
Four LUTs are available (three VLC tables and one fixed-length code table). The choice of table depends on the number of nonzero coefficients in previously coded blocks (= context-adaptive).
Encode the sign of each TrailingOne: for each TrailingOne signalled by
coeff_token, the sign is encoded with a single bit in reverse order, starting with
the highest-frequency TrailingOne;
Encode the levels of the remaining nonzero coefficients:
Level of non zero coeffs = sign + magnitude;
Encoded in reverse order;
Code for each level = level_prefix + level_suffix. This last value is adapted
depending on the magnitude of each successive coded level (difference of
magnitude, threshold and LUT) (= Context-adaptive).
Encode the total number of zeros before the last coefficient: the sum of all zeros preceding the highest nonzero coefficient in the reordered array is coded with a VLC;
Encode each run of zeros: the number of zeros preceding each nonzero coefficient is encoded in reverse order.
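The quantities signalled by these steps can be sketched for a zig-zag-scanned block; the coefficient values below are illustrative.

```python
# Sketch of the quantities CAVLC signals for a zig-zag-scanned 4x4 block:
# total nonzero coefficients, trailing +/-1s (at most 3), and total_zeros
# (zeros preceding the highest-frequency nonzero coefficient).
def cavlc_stats(coeffs):
    nz = [i for i, c in enumerate(coeffs) if c != 0]
    total_coeffs = len(nz)
    trailing_ones = 0
    for i in reversed(nz):                 # scan from the highest frequency
        if abs(coeffs[i]) == 1 and trailing_ones < 3:
            trailing_ones += 1
        else:
            break
    total_zeros = nz[-1] + 1 - total_coeffs if nz else 0
    return total_coeffs, trailing_ones, total_zeros

# Example block after the zig-zag scan:
cavlc_stats([0, 3, 0, 1, -1, -1, 0, 1] + [0] * 8)  # -> (5, 3, 3)
```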
Example bitstream: 000000011010001001000010111001100
CABAC
The context here is the previous letter; 4 different context models are employed in CABAC.
Profiles
Profile
Baseline Profile:
I/P slices;
Multiple reference frames;
In-loop deblocking;
CAVLC entropy coding.
Main Profile
High Profile:
Main Profile features mentioned above;
8x8 transform option;
Custom quantisation matrices.
Amendments
Scalable Video Coding
Introduction
Use-cases: video telephony and video conferencing over mobile TV, wireless and
Internet video streaming, standard- and high-definition TV broadcasting, storage...
Compress once, decompress many ways!
Types of scalability
Performances
Multiview Video Coding
[Figure: array of 16 cameras]
In July 2008, MPEG officially approved an amendment of the ITU-T Rec. H.264 & ISO/IEC 14496-10 Advanced Video Coding (AVC) standard on Multiview Video Coding. This new standard enables an efficient compressed representation of stereo and multiview video by exploiting the correlation among neighboring camera views, to support 3D and free-viewpoint video applications.
[Figure: coding configurations, from [Dufaux,07]]
The compression algorithm strongly depends on the data representation and on the
targeted applications.
Conclusion