SELECTED TOPICS IN
COMPUTER ENGINEERING
Frame Ordering
Frames 1 to 16: I B B P B B P B B P B B P B B I
Frames -1 to 16: B B I B B P B B P B B P B B P B B I
Figure: a macroblock in the current frame predicted from the previous frame by a motion vector.
Figure: Coding Mode II (Intra-Coding), showing a macroblock in the previous and current frames.
I-Picture Coding
P-Picture Coding: many coding modes
– Motion compensated coding: Motion Vector (MV) only
– Motion compensated coding: MV plus difference macroblock
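The two motion compensated modes can be illustrated with a minimal block-matching sketch. It assumes a full search that minimises the sum of absolute differences (SAD), a toy 4x4 block instead of a 16x16 macroblock, and frames stored as lists of rows; the function names are illustrative and not taken from any standard.

```python
def sad(cur, ref, cx, cy, rx, ry, n):
    """Sum of absolute differences between two n-by-n blocks."""
    return sum(
        abs(cur[cy + j][cx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def best_motion_vector(cur, ref, cx, cy, n=4, search=2):
    """Full search around (cx, cy); returns ((dx, dy), cost)."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference frame.
            if 0 <= rx <= width - n and 0 <= ry <= height - n:
                cost = sad(cur, ref, cx, cy, rx, ry, n)
                if best is None or cost < best[1]:
                    best = ((dx, dy), cost)
    return best


def residual(cur, ref, cx, cy, mv, n=4):
    """Difference macroblock: current block minus its motion compensated prediction."""
    dx, dy = mv
    return [
        [cur[cy + j][cx + i] - ref[cy + dy + j][cx + dx + i] for i in range(n)]
        for j in range(n)
    ]


def make_frame(x0, y0, size=8, n=4):
    """Toy frame: a bright n-by-n square at (x0, y0) on a dark background."""
    return [
        [200 if x0 <= x < x0 + n and y0 <= y < y0 + n else 10 for x in range(size)]
        for y in range(size)
    ]


previous = make_frame(2, 2)
current = make_frame(3, 3)   # the square moved one pixel right and one pixel down
mv, cost = best_motion_vector(current, previous, cx=3, cy=3)
diff = residual(current, previous, 3, 3, mv)
```

Here the match is exact, so the difference macroblock is all zeros and the motion vector alone describes the block. When the match is imperfect, the non-zero difference is what would be passed on for transform coding.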
Figure: MPEG video encoder block diagram. Labels: input, pre-processing, motion estimation, motion compensation, frame memory, Q-1, IDCT, buffer, output, motion vectors, predictive frame.
MPEG: Video Encoding
– Interframe predictive coding (P-pictures)
• For each macroblock the motion estimator produces the best
matching macroblock
• The two macroblocks are subtracted and the difference is DCT
coded
– Interframe interpolative coding (B-pictures)
• The motion vector estimation is performed twice
• The encoder forms a prediction error macroblock from either of
them or from their average
• The prediction error is encoded using a block-based DCT
– The encoder needs to reorder pictures because a B-frame can be coded only after the later reference picture it depends on
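The reordering step can be sketched as follows. This is a simplified illustration, assuming frames are labelled by type and display index (for example "B2") and that every run of B-frames is moved behind the I- or P-frame that follows it in display order; the function name is hypothetical.

```python
def coding_order(display):
    """Move each run of B-frames after the anchor (I or P) that follows it."""
    out, pending_b = [], []
    for frame in display:
        if frame[0] == "B":
            pending_b.append(frame)      # hold back until the next anchor is emitted
        else:
            out.append(frame)            # anchor frame goes first
            out.extend(pending_b)        # then the B-frames that precede it in display order
            pending_b = []
    # Simplification: trailing B-frames are appended as-is; in a real stream
    # they would wait for the first anchor of the next group.
    return out + pending_b


display = ["I1", "B2", "B3", "P4", "B5", "B6", "P7"]
coded = coding_order(display)
# coded == ["I1", "P4", "B2", "B3", "P7", "B5", "B6"]
```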
MPEG-1 Video Layer
• A coded representation for compressing video sequences, both 625-line and 525-line, to bit rates around 1.5 Mbit/s.
• Group of Pictures
• one I frame in every group
• 10-15 frames per group
• P depends on the preceding I or P frame; B depends on the surrounding I and P frames
• The arrangement of B and P frames within a GoP is not fixed
MPEG Video Filtering
I B B P B B P B B P B B P B B I
I B P B P B P B P B I
I P P P P I
I P P P I
I P P I
I I
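One reading of the sequences above is that a filter lowers the frame rate by discarding frames according to their type. The sketch below assumes that reading and reproduces two of the rows: dropping every B-frame, then dropping every P-frame as well. B-frames are the safe ones to drop first because no other frame is predicted from them. The function name is illustrative.

```python
def drop_frame_types(gop, drop):
    """Keep only the frames whose type is not in `drop`."""
    return [f for f in gop if f not in drop]


gop = "I B B P B B P B B P B B P B B I".split()
no_b = drop_frame_types(gop, {"B"})          # I P P P P I
only_i = drop_frame_types(gop, {"B", "P"})   # I I
```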
MPEG-2
– Digital Television (4 - 9 Mb/s)
– Satellite dishes, digital cable video
– Larger data size
– Includes closed captions
– More complex encoding (“long time”)
– Supports higher bit rates for HDTV, beyond the 1.5 Mbit/s of MPEG-1
– Supports a larger number of applications
– Different color subsampling modes, e.g., 4:2:2, 4:2:0, 4:4:4
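The effect of the subsampling modes on raw data size can be worked out with a short calculation. The sketch assumes one luma plane and two chroma planes, 8 bits per sample, and a 720 x 576 frame chosen only as an example; the helper name and table are illustrative.

```python
# (horizontal divisor, vertical divisor) applied to each of the two chroma planes
CHROMA_DIVISORS = {"4:4:4": (1, 1), "4:2:2": (2, 1), "4:2:0": (2, 2)}


def raw_frame_bytes(width, height, mode, bytes_per_sample=1):
    """Uncompressed size of one frame: full-resolution luma plus two chroma planes."""
    h_div, v_div = CHROMA_DIVISORS[mode]
    luma = width * height
    chroma = 2 * (width // h_div) * (height // v_div)
    return (luma + chroma) * bytes_per_sample


sizes = {mode: raw_frame_bytes(720, 576, mode) for mode in CHROMA_DIVISORS}
# 4:4:4 -> 1244160 bytes, 4:2:2 -> 829440 bytes, 4:2:0 -> 622080 bytes
```

So 4:2:0 carries half the raw data of 4:4:4 before any compression is applied.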
MPEG-2: Profiles and Levels
Profiles
Levels SNR Spatial High Multiview
4:2:0 4:2:0 4:2:0;4:2:2 4:2:0
Enhancement 1920 X 1152/60 1920 X 1152/60
Lower 960 X 576/30 1920 X 1152/60
High Bitrate 100, 80,25 130, 50, 80
Enhancement 1440 X 1152/60 1440 X 1152/60 1920 X 1152/60
A/V object
– A video object within a scene
– The background
– An instrument or voice
– Coded independently
A/V scene
– Mixture of natural or synthetic objects
– Individual bitstreams multiplexed and transmitted
– One or more channels
– Each channel may have its own quality of service
MPEG-4: Video Object Plane (VOP)
• Video frame = sum of segmented regions with
arbitrary shape (VOP)
• Shape, motion, and texture information of VOPs
belonging to the same video object is encoded
into a video object layer (VOL)
• Encode
– VOL identifiers
– Composition information
• Overlapping configuration of VOPs
MPEG-4: Coding
Shape coding
– Shape information in alpha planes
– Transparency of shape encoded
– Inter and intra shape coding functions
– After shape coding each VOP in a VO is
partitioned into non-overlapping macroblocks
Motion coding
– Shift parameter with respect to reference window
– Standard macroblock
– Contour macroblock
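The partition into macroblocks can be illustrated on a binary alpha plane. The sketch assumes that a "standard" macroblock lies entirely inside the shape, a "contour" macroblock straddles its boundary, and a block with no object pixels is transparent; it uses toy 4x4 blocks and plane dimensions that are multiples of the block size. The labels and function name are illustrative.

```python
def classify_macroblocks(alpha, n=4):
    """Label each n-by-n block of a binary alpha plane (1 = inside the object)."""
    labels = []
    for by in range(0, len(alpha), n):
        row = []
        for bx in range(0, len(alpha[0]), n):
            inside = sum(alpha[by + j][bx + i] for j in range(n) for i in range(n))
            if inside == 0:
                row.append("transparent")   # no object pixels in this block
            elif inside == n * n:
                row.append("standard")      # block lies entirely inside the shape
            else:
                row.append("contour")       # block straddles the shape boundary
        labels.append(row)
    return labels


# Toy 8x8 alpha plane: the object fills the right half and part of the top-left block.
alpha = [
    [1 if x >= 4 or (x >= 2 and y < 4) else 0 for x in range(8)]
    for y in range(8)
]
labels = classify_macroblocks(alpha)
# labels == [["contour", "standard"], ["transparent", "standard"]]
```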
MPEG-4: Coding
Texture coding
– Intra-VOPs and the residual errors from motion compensation are
DCT coded as in MPEG-1
– P-VOPs (prediction error blocks) may not conform to the VOP
boundary
• Pixels outside the active area are set to a constant value
• Standard compression
• Efficient prediction of DC and AC components from intra and
inter coded blocks
– Multiplexing
• Shape → motion → texture coded data
• Motion and DCT coefficients can be jointly or individually
coded
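The constant-value padding of boundary blocks can be sketched in a few lines. The fill value of 128 and the function name are assumptions made for illustration; the mask marks which pixels belong to the VOP.

```python
def pad_block(block, mask, fill=128):
    """Replace pixels outside the VOP (mask == 0) with a constant value."""
    return [
        [p if m else fill for p, m in zip(pixel_row, mask_row)]
        for pixel_row, mask_row in zip(block, mask)
    ]


block = [[50, 60], [70, 80]]
mask = [[1, 0], [0, 1]]          # 1 = pixel belongs to the VOP
padded = pad_block(block, mask)
# padded == [[50, 128], [128, 80]]
```

After padding, the block is rectangular again and can go through the same transform coding as any other block.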
Composition of Audiovisual Objects
(AVOs)
• MPEG-4 provides a standardized way to describe a scene, allowing
the user to:
– place AVOs anywhere in a given coordinate system;
– apply transforms to change the geometrical or acoustical appearance
of an AVO;
– group primitive AVOs in order to form compound media objects;
– apply streamed data to AVOs, in order to modify their attributes;
– change interactively the user’s viewing and listening points anywhere
in the scene.
• In the figure, for example, one can replace the person with a
different person or change her dress or hairstyle; group the desk
and the globe into a compound AVO, since they are static; or
change the background.
An MPEG-4
audiovisual scene
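Placing and grouping AVOs can be pictured as a small scene tree. The sketch below is a hypothetical data structure, not the MPEG-4 scene description syntax: each node stores a position relative to its parent, and a compound AVO is a node with children. The coordinates are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class AVO:
    """Hypothetical scene node; position is relative to the parent node."""
    name: str
    x: float = 0.0
    y: float = 0.0
    children: list = field(default_factory=list)

    def flatten(self, ox=0.0, oy=0.0):
        """Yield (name, absolute_x, absolute_y) for every primitive AVO."""
        ax, ay = ox + self.x, oy + self.y
        if not self.children:
            yield (self.name, ax, ay)
        for child in self.children:
            yield from child.flatten(ax, ay)


desk = AVO("desk", x=0, y=0)
globe = AVO("globe", x=1, y=2)
furniture = AVO("furniture", x=10, y=5, children=[desk, globe])  # compound AVO
scene = AVO("scene", children=[AVO("background"), furniture, AVO("person", x=3, y=0)])

placed = list(scene.flatten())
furniture.x = 20                  # moving the group moves both of its members
moved = list(scene.flatten())
```

Grouping the desk and the globe means a single change to the compound node repositions both objects.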
Video Objects
• MPEG-4 treats a video sequence as a collection of
video objects.
• A video object (VO) is an area of video scene that may
occupy an arbitrary-shaped region and may exist for an
arbitrary length of time.
• An instance of a VO at a particular point in time is a
video object plane (VOP).
• In the traditional video coding sense, a rectangular
video frame is a VOP and a video sequence is a VO.
MPEG-4 Encoder
Figure: MPEG-4 encoder. Labels: DCT, Q, Q-1, IDCT, motion estimation, frame store, shape coding, a switch selecting among predictions 1 to 3, and a video multiplex fed by motion, texture coding, and shape coding.
VOP Prediction
Figure: VOP prediction (forward, backward, and bidirectional).
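The choice among forward, backward, and bidirectional prediction can be sketched as picking the candidate with the smallest error. The sketch assumes blocks stored as flat lists, SAD as the error measure, and a rounded average for the bidirectional candidate; all names are illustrative.

```python
def sad(a, b):
    """Sum of absolute differences between two flat blocks."""
    return sum(abs(x - y) for x, y in zip(a, b))


def choose_prediction(current, forward, backward):
    """Return (mode, prediction_error) for the candidate with the smallest SAD."""
    average = [(f + b + 1) // 2 for f, b in zip(forward, backward)]  # rounded mean
    candidates = {"forward": forward, "backward": backward, "bidirectional": average}
    mode = min(candidates, key=lambda m: sad(current, candidates[m]))
    error = [c - p for c, p in zip(current, candidates[mode])]
    return mode, error


current = [100, 102, 104, 106]
forward = [90, 92, 94, 96]        # block fetched from the earlier reference
backward = [110, 112, 114, 116]   # block fetched from the later reference
mode, error = choose_prediction(current, forward, backward)
# mode == "bidirectional", error == [0, 0, 0, 0]
```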
MPEG-4 Profiles
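Figure: codec stages
Encoder: Source frame → DCT → Q → Reorder → RLC → VLC
Decoder: VLD → RLD → Reorder → Q−1 → IDCT → Decoded frame
Figure: codec stages with motion estimation and compensation
Encoder: Source frame → ME → MCP → DCT → Q → Reorder → RLC → VLC
Decoder: VLD → RLD → Reorder → Q−1 → IDCT → MCR → Decoded frame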
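The Q, Reorder, and RLC stages can be illustrated on a small block of transform coefficients. The sketch assumes a uniform quantiser step (real codecs use a quantisation matrix), a zigzag scan for the Reorder stage, and (zero run, value) pairs ending in an end-of-block marker. The DCT and the final VLC mapping are left out, and a 4x4 block stands in for the usual 8x8.

```python
def quantise(block, step):
    """Uniform quantiser: divide each coefficient by the step and round."""
    return [[round(c / step) for c in row] for row in block]


def zigzag_order(n):
    """(row, col) pairs of an n-by-n block in zigzag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Sort by anti-diagonal, alternating the direction of travel on each one.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def run_length_encode(seq):
    """Encode as (zero_run, value) pairs followed by an end-of-block marker."""
    pairs, run = [], 0
    for v in seq:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append("EOB")           # trailing zeros are implied by the marker
    return pairs


coeffs = [
    [80, 24, 8, 0],
    [-16, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
q = quantise(coeffs, step=8)
scan = [q[r][c] for r, c in zigzag_order(4)]
pairs = run_length_encode(scan)
# pairs == [(0, 10), (0, 3), (0, -2), (2, 1), "EOB"]
```

The zigzag scan groups the low-frequency coefficients at the front, so the long tail of zeros collapses into the single end-of-block marker.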
Figure: bidirectional prediction across VOPs I4, B5, B6, P7, with forward (MVF) and backward (MVB) motion vectors.
Figure: scalable coding. The encoder turns the video sequence into a base layer and enhancement layers 1 to N. Decoder A produces a basic-quality sequence; Decoder B produces a high-quality sequence.
Figure: enhancement layer VOPs, panels (i) to (iii).
Fine Granular Scalability
• Fine Granular Scalability (FGS) is a method of encoding a
sequence as a base layer and enhancement layer such that the
enhancement layer can be truncated during or after encoding to
give a highly flexible control over the transmitted bitrate.
• FGS is very useful in video streaming applications where the
channel bandwidth may change. When that happens, the
streaming server transmits the base layer and a truncated version
of the enhancement layer to match the available bandwidth,
hence maximizing the decoded video quality without the need to
re-encode the video sequence.
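The server-side decision can be sketched as a simple budget calculation. The sketch assumes sizes are given in bytes per frame and that the base layer must always be sent whole; the function name and the numbers are illustrative.

```python
def fgs_payload(base_bytes, enhancement_bytes, budget_bytes):
    """Return (base_sent, enhancement_sent) for one frame under a byte budget."""
    if budget_bytes < base_bytes:
        raise ValueError("budget cannot carry the base layer")
    # The enhancement layer may be cut at any point, so send whatever still fits.
    return base_bytes, min(enhancement_bytes, budget_bytes - base_bytes)


narrow = fgs_payload(base_bytes=1000, enhancement_bytes=4000, budget_bytes=2500)
wide = fgs_payload(base_bytes=1000, enhancement_bytes=4000, budget_bytes=10000)
# narrow == (1000, 1500), wide == (1000, 4000)
```

No re-encoding is involved: the same stored enhancement layer is simply cut shorter when the channel narrows.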
MPEG-7
• Data + Multimedia Content Description Scheme
• Description Definition Language (e.g., XML-based)
• Deals not with the data itself but with the transmission of metadata
• Description Scheme + Content Description, e.g.:
• Table of contents
• Still images
• Summaries
• Links
• etc.
• Focuses mainly on how descriptions of data are generated and how
they are used
MPEG-7