
Analysis of MPEG

MPEG: the Organization

• Moving Picture Experts Group
• Established in 1988
• Standards published under the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC)
• Official designation: ISO/IEC JTC1/SC29/WG11


MPEG vs. Competitors
• Generally produces better quality than other formats such as:
• Video for Windows
• Indeo and QuickTime
• MPEG audio/video compression can be used in many applications:
• DVD players
• HDTV recorders
• Internet video
• Video conferencing
• Others
MPEG Overview
• MPEG-1: a standard for storage and retrieval of moving pictures and audio on storage media
• MPEG-2: a standard for digital television
• MPEG-4: a standard for multimedia applications
• MPEG-7: a content representation standard for information search
• MPEG-21: a multimedia framework standard, including metadata for audio and video files
MPEG-1

• First standard to be published by the MPEG organization (in 1992)
• A standard for storage and retrieval of moving pictures and audio on storage media
• Example formats: Video CD (VCD), MP3, MP2


5 Parts of MPEG-1
• Part 1: Combining video and audio inputs into a single/multiple data stream
• Part 2: Video compression
• Part 3: Audio compression
• Part 4: Requirements verification
• Part 5: Technical report on the software implementation of Parts 1-3
Basic Structure of Audio Encoder

Note: A decoder basically works in just the opposite manner


Processes of an Audio Encoder
• Mapping Block – divides the audio input into 32 equal-width frequency subbands (samples)
• Psychoacoustic Block – calculates the masking threshold for each subband
• Bit-Allocation Block – allocates bits using the outputs of the Mapping and Psychoacoustic blocks (a toy allocation loop is sketched below)
• Quantizer & Coding Block – scales and quantizes (reduces) the samples
• Frame Packing Block – formats the samples with headers into an encoded stream
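To make the interplay of the Psychoacoustic and Bit-Allocation blocks concrete, here is a minimal Python sketch. It is not the normative Layer I/II algorithm (which uses standardized allocation tables); the 6 dB-per-bit rule, the random inputs, and the greedy loop are simplifying assumptions.

import numpy as np

def allocate_bits(signal_energy_db, masking_threshold_db, bit_pool=256):
    """Greedy bit-allocation sketch for 32 subbands (hypothetical,
    simplified): repeatedly give one bit to the subband whose
    noise-to-mask ratio is currently worst."""
    n_subbands = 32
    bits = np.zeros(n_subbands, dtype=int)
    # Signal-to-mask ratio: how far the signal sits above the mask.
    smr = signal_energy_db - masking_threshold_db
    while bit_pool > 0:
        # Each allocated bit buys roughly 6 dB of quantization SNR.
        nmr = smr - 6.0 * bits          # noise-to-mask ratio per subband
        worst = int(np.argmax(nmr))
        if nmr[worst] <= 0:             # all quantization noise is masked
            break
        bits[worst] += 1
        bit_pool -= 1
    return bits

# Example: random subband energies and masking thresholds.
rng = np.random.default_rng(0)
energy = rng.uniform(20, 80, 32)        # dB
mask = energy - rng.uniform(5, 30, 32)  # dB
print(allocate_bits(energy, mask))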
MPEG-1 Layers I, II, III

• The layers differ in processing complexity and resulting audio quality
• Layer I (MP1) – little processing needed, lowest quality
• Layer II (MP2) – moderate processing, "okay" quality
• Layer III (MP3) – heavy processing, near-"CD" quality
MPEG-2 Overview

• Extends the video & audio compression of MPEG-1
- Substantially reduces the bandwidth required for high-quality transmissions
- Optimizes the balance between resolution (quality) and bandwidth (speed)
10 Parts of MPEG-2
• Part 1: Combines video and audio data into single/multiple streams
• Part 2: Offers more advanced video compression tools
• Part 3: A multi-channel extension of the MPEG-1 Audio standard
• Parts 4/5: Correspond to and build on Parts 4/5 of MPEG-1
• Part 6: Specifies protocols for managing MPEG-1 & MPEG-2 bitstreams
• Part 7: Specifies a multi-channel audio coding algorithm
• Part 8: (discontinued because of obsolescence)
• Part 9: Specifies the Real-time Interface (RTI) to Transport Stream decoders
• Part 10: The conformance part of Digital Storage Media Command and Control (currently under development)
MPEG-2 Video Compression Overview
VIDEO STREAM DATA HIERARCHY
MPEG-2 Video Compression Overview
• Video stream
• Group of Pictures (GOP)
• I-frames: can be reconstructed without any reference to other frames
• P-frames: forward predicted from the last I-frame or P-frame
• B-frames: predicted both forward and backward, from the surrounding I/P-frames (see the reordering sketch below)
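Because B-frames reference frames that come later in display order, the coded (transmission) order differs from display order. A small sketch, assuming a typical IBBP structure; real encoders may choose other GOP patterns:

def coded_order(display_gop):
    """Reorder a GOP from display order to a typical coded order:
    each anchor (I or P) is sent before the B-frames that are
    bi-directionally predicted from it."""
    out, pending_b = [], []
    for frame in display_gop:
        if frame[0] in "IP":          # anchor frame
            out.append(frame)         # send the anchor first...
            out.extend(pending_b)     # ...then the Bs that needed it
            pending_b = []
        else:                         # B-frame
            pending_b.append(frame)
    return out + pending_b

print(coded_order(["I0", "B1", "B2", "P3", "B4", "B5", "P6"]))
# ['I0', 'P3', 'B1', 'B2', 'P6', 'B4', 'B5']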
MPEG-2 Video Compression Overview
Compression: Eliminating Redundancies
• Spatial redundancy
• Pixels are replicated within a single frame of video
• Temporal redundancy
• Consecutive frames of video display images of the same scene
MPEG-2 Video Compression
Overview
Four Video Compression Techniques:
1. Pre-processing
2. Temporal Prediction
3. Motion Compensation
4. Quantization
MPEG-2 Video Compression Overview
• Pre-processing
• Filters out unnecessary information: detail that is difficult to encode and is not an important component of human visual perception
MPEG-2 Video Compression Overview
• Temporal prediction:
• Uses the Discrete Cosine Transform (DCT) to:
• Divide each frame into 8x8 blocks of pixels
• Reorganize the residual differences between frames
• Encode each block separately (a DCT sketch follows below)
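The following sketch computes the 2-D DCT of an 8x8 block with SciPy; it only illustrates the energy-compaction property the standard exploits, not MPEG-2's normative transform stages:

import numpy as np
from scipy.fftpack import dct

def dct2(block):
    """2-D DCT-II of an 8x8 block (applied along rows, then columns)."""
    return dct(dct(block.T, norm='ortho').T, norm='ortho')

# A smooth 8x8 gradient: most of its energy ends up in a few
# low-frequency coefficients (the top-left corner).
block = np.add.outer(np.arange(8), np.arange(8)).astype(float)
coeffs = dct2(block)
print(np.round(coeffs, 1))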
MPEG-2 Video Compression Overview

• Quantization:
• Applies to the DCT coefficients
• Removes subjective redundancy
• Controls the compression factor
• Maps coefficients onto smaller-magnitude values, many of which become zero (sketched below)
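A toy quantizer sketch; the matrix below is hypothetical, not an MPEG-2 quantization table, but it shows how coarser step sizes at high frequencies drive many coefficients to zero:

import numpy as np

# A hypothetical quantization matrix: coarser steps at higher
# frequencies, where the eye is less sensitive.
quant = 8 + 4 * np.add.outer(np.arange(8), np.arange(8))

def quantize(coeffs, scale=1.0):
    """Divide DCT coefficients by the (scaled) step sizes and round.
    The scale factor is the encoder's compression-control knob."""
    return np.round(coeffs / (quant * scale)).astype(int)

def dequantize(levels, scale=1.0):
    return levels * quant * scale

# High-frequency levels mostly become 0, which run-length
# coding then exploits.
coeffs = np.full((8, 8), 50.0)
print(quantize(coeffs, scale=2.0))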
MPEG-2 Video Compression Overview
Where It Is Used:
• Multimedia Communications
• Webcasting
• Broadcasting
• Video on Demand
• Interactive Digital Media
• Telecommunications
• Mobile communications
MPEG-2 Transmission Overview

• Building the MPEG Bit Stream:

Elementary Stream (ES)
- Digital control data
- Digital audio
- Digital video
- Digital data

Packetized Elementary Stream (PES)
- Each ES is packetized into a stream of PES packets.
- A PES packet may be a fixed- or variable-sized block.
- Each block carries up to 65,536 bytes plus a 6-byte protocol header (a sketch follows below).
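A minimal sketch of the 6-byte PES header described above (3-byte start-code prefix, 1-byte stream id, 2-byte length). Real PES packets usually add optional header fields such as PTS/DTS, which this sketch omits:

import struct

def make_pes_packet(stream_id, payload):
    """Build a minimal PES packet: start-code prefix 0x000001,
    stream id, 16-bit length, then the payload."""
    if len(payload) > 0xFFFF:
        raise ValueError("payload too large for the 16-bit length field")
    header = b"\x00\x00\x01" + struct.pack(">BH", stream_id, len(payload))
    return header + payload

pkt = make_pes_packet(0xE0, b"video data...")  # 0xE0: a video stream id
print(pkt[:6].hex(), len(pkt))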
MPEG-2 Transmission Cont.

• MPEG-2 Multiplexing

MPEG Program Stream
- Tightly coupled PES packets
- Used for video playback and network applications

MPEG Transport Stream
- Each PES packet is broken into fixed-size, 188-byte transport packets (sketched below)
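A sketch of this packetization step, assuming the fixed 188-byte packet size with a 4-byte header. Real Transport Stream headers also signal payload-unit starts and use adaptation fields (rather than 0xFF padding) for stuffing:

def packetize(pes, pid):
    """Split a PES packet into 188-byte transport packets with a
    minimal 4-byte header: sync byte 0x47, 13-bit PID, and a
    4-bit continuity counter."""
    packets, cc = [], 0
    for off in range(0, len(pes), 184):
        chunk = pes[off:off + 184]
        header = bytes([
            0x47,
            (pid >> 8) & 0x1F,        # PID high bits (flags left at 0)
            pid & 0xFF,               # PID low byte
            0x10 | (cc & 0x0F),       # payload only + continuity counter
        ])
        packets.append(header + chunk.ljust(184, b"\xff"))
        cc = (cc + 1) % 16
    return packets

ts = packetize(b"\x00" * 500, pid=0x100)
print(len(ts), all(len(p) == 188 for p in ts))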
MPEG Transport Streams
Combining ES from Encoders into a Transport Stream
Single & Multiple Program Transport Streams
Format of a Transport Stream Packet
MPEG-2 Encoders
Types of MPEG-2 Decoders

1. MPEG-2 Software Decoder & PC-Based Accelerator

2. MPEG-2 Computer Decoder

3. MPEG-2 Network Computers/Thin Clients

4. MPEG-2 Set-Top Box

5. MPEG-2 Consumer Equipment


MPEG-4 Overview

• Emergence
• Handles specific requirements from rapidly developing multimedia applications

• Advantages over MPEG-1 and MPEG-2
• Object-oriented coding
MPEG-4 Standard: 6 Parts Overview
• Part 1: Systems - specifies scene description, multiplexing, synchronization, buffer management, and management and protection of intellectual property.
• Part 2: Visual - specifies the coded representation of natural and synthetic visual objects.
• Part 3: Audio - specifies the coded representation of natural and synthetic audio objects.
• Part 4: Conformance Testing - defines conformance conditions for bit streams and devices; this part is used to test MPEG-4 implementations.
• Part 5: Reference Software - includes software corresponding to most parts of MPEG-4; it can be used for implementing compliant products, as ISO waives the copyright of the code.
• Part 6: Delivery Multimedia Integration Framework (DMIF) - defines a session protocol for the management of multimedia streaming over generic delivery technologies.
Features & Functionalities
• Object-oriented
• Primitive audiovisual objects are coded

• Low data rate
• Allows for high-quality video at lower data rates and smaller file sizes

• Interoperability
• Standardized ways of composing and interacting with audiovisual scenes
MPEG-4 Object-Based Coding Architecture
MPEG-4 Scene
Targeted Applications
• Digital TV
• TV logos, customized advertising, multi-window screens
• Mobile multimedia
• Cell phones and palm computers
• TV production
• Targeted viewers
• Games
• Personalized games
• Streaming video
• News updates and live music shows over the Internet
MPEG-4
• MPEG-4, or ISO/IEC 14496, is an international standard describing the coding of audio-visual objects
• The 1st version of MPEG-4 became an international standard in 1999 and the 2nd version in 2000 (6 parts); since then many parts have been added and some are under development today
• MPEG-4 includes object-based audio-video coding for Internet streaming and television broadcasting, but also digital storage
• MPEG-4 includes interactivity and VRML support for 3D rendering
• Has profiles and levels like MPEG-2
• Has 27 parts
MPEG-4 parts
• Part 1, Systems – synchronizing and multiplexing audio and video
• Part 2, Visual – coding visual data
• Part 3, Audio – coding audio data, enhancements to Advanced Audio Coding and new techniques
• Part 4, Conformance testing
• Part 5, Reference software
• Part 6, DMIF (Delivery Multimedia Integration Framework)
• Part 7, optimized reference software for coding audio-video objects
• Part 8, carriage of MPEG-4 content on IP networks
MPEG-4 parts (2)
• Part 9, reference hardware implementation
• Part 10, Advanced Video Coding (AVC)
• Part 11, Scene description and application engine; BIFS (Binary Format for Scenes) and XMT (Extensible MPEG-4 Textual format)
• Part 12, ISO base media file format
• Part 13, IPMP extensions
• Part 14, MP4 file format, version 2
• Part 15, AVC (Advanced Video Coding) file format
• Part 16, Animation Framework eXtension (AFX)
• Part 17, timed text subtitle format
• Part 18, font compression and streaming
• Part 19, synthesized texture stream
MPEG-4 parts (3)
• Part 20, Lightweight Application Scene Representation (LASeR) and Simple Aggregation Format (SAF)
• Part 21, MPEG-J Graphics Framework eXtension (GFX)
• Part 22, Open Font Format
• Part 23, Symbolic Music Representation
• Part 24, audio and systems interaction
• Part 25, 3D Graphics Compression Model
• Part 26, audio conformance
• Part 27, 3D graphics conformance
Motivations for MPEG-4
• Broad support for multimedia facilities is available
• 2D and 3D graphics, audio and video – but:
• Incompatible content formats
• 3D graphics formats such as VRML are poorly integrated with 2D formats such as Flash or HTML
• Broadcast formats (MHEG) are not well suited for the Internet
• Some formats have a binary representation – not all
• SMIL, HTML+, etc. solve only a part of the problems
• Both authoring and delivery are cumbersome
• Poor support for multiple formats
MPEG-4: Audio/Visual (A/V) Objects
• Simple video coding (MPEG-1 and -2)
• A/V information is represented as a sequence of rectangular frames: the television paradigm
• Future: Web paradigm, game paradigm ... ?
• Object-based video coding (MPEG-4)
• A/V information: a set of related stream objects
• Individual objects are encoded as needed
• Temporal and spatial composition into complex scenes
• Integration of text, "natural" and synthetic A/V
• A step towards semantic representation of A/V
• Communication + Computing + Film (TV ...)
Main parts of MPEG-4
1. Systems
– Scene description, multiplexing, synchronization, buffer management,
intellectual property and protection management
2. Visual
– Coded representation of natural and synthetic visual objects
3. Audio
– Coded representation of natural and synthetic audio objects
4. Conformance Testing
– Conformance conditions for bit streams and devices
5. Reference Software
– Normative and non-normative tools to validate the standard
6. Delivery Multimedia Integration Framework (DMIF)
– Generic session protocol for multimedia streaming
Main objectives – rich data
• Efficient representation for many data types
• Video from very low bit rates to very high quality
• 24 kbit/s to several Mbit/s (HDTV)
• Music and speech data over a very wide bit rate range
• Very low bit rate speech (1.2-2 kbit/s)
• Music (6-64 kbit/s)
• Stereo broadcast quality (128 kbit/s)
• Synthetic objects
• Generic dynamic 2D and 3D objects
• Specific 2D and 3D objects, e.g. human faces and bodies
• Speech and music can be synthesized by the decoder
• Text
• Graphics
Main objectives – robust + pervasive
• Resilience to residual errors
• Provided by the encoding layer
• Even under difficult channel conditions – e.g. mobile
• Platform independence
• Transport independence
• MPEG-2 Transport Stream for digital TV
• RTP for Internet applications
• DAB (Digital Audio Broadcast) ...
• However, tight synchronization of media
• Intellectual property management + protection
• For both A/V contents and algorithms
Main objectives – scalability
• Scalability
• Enables partial decoding
• Audio – scalable sound rendering quality
• Video – progressive transmission of different quality levels (spatial and temporal resolution)
• Profiling
• Enables partial implementation
• Solutions for different settings
• Applications may use a small portion of the standard
• "Specify minimum for maximum usability"
Main objectives – genericity
• Independent representation of objects in a scene
• Independent access for their manipulation and re-use
• Composition of natural and synthetic A/V objects into one audiovisual scene
• Description of the objects and the events in a scene
• Capabilities for interaction and hyperlinking
• Delivery-media-independent representation format
• Transparent communication between different delivery environments
Object-based architecture
MPEG-4 as a tool box
• MPEG-4 is a tool box (not a monolithic standard)
• The main issue is not better compression
• No "killer" application (as DTV was for MPEG-2)
• Many new, different applications are possible
• Enriched broadcasting, remote surveillance, games, mobile multimedia, virtual environments etc.
• Profiles
• Binary Format for Scenes (BIFS)
• Based on VRML 2.0 for 3D objects
• "Programmable" scenes
• Efficient communication format
MPEG-4 Systems part
MPEG-4 scene, VRML-like model
Logical scene structure
MPEG-4 Terminal Components
Digital Terminal Architecture
BIFS tools – scene features
• 3D, 2D scene graph (hierarchical structure)
• 3D, 2D objects (meshes, spheres, cones etc.)
• 3D and 2D composition, mixing 2D and 3D
• Sound composition – e.g. mixing, "new instruments", special effects
• Scalability and scene control
• Terminal capabilities (TermCap)
• MPEG-J for terminal control
• Face and body animation
• XMT – textual format; a bridge to the Web world
BIFS tools – command protocol
• Replace a scene with a new scene
• A replace command is an entry point, like an I-frame
• The whole context is set to the new value
• Insert a node in a grouping node
• Instead of replacing a whole scene, just adds a node
• Enables progressive download of a scene
• Delete a node – deletion of an element costs a few bytes
• Change a field value, e.g. color, position, switching an object on/off (a toy model of these commands is sketched below)
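A hypothetical in-memory model of these four command types; the real protocol is a compact binary encoding operating on nodes of the BIFS scene graph, not Python dictionaries:

# Toy scene graph: node id -> {children, fields}
scene = {"root": {"children": [], "fields": {}}}

def replace_scene(new_scene):
    """Entry point, like an I-frame: the whole context is reset."""
    global scene
    scene = new_scene

def insert_node(parent_id, node_id, node):
    scene[parent_id]["children"].append(node_id)
    scene[node_id] = node

def delete_node(node_id):
    scene.pop(node_id)
    for n in scene.values():
        if node_id in n["children"]:
            n["children"].remove(node_id)

def change_field(node_id, field, value):
    scene[node_id]["fields"][field] = value

insert_node("root", "box1", {"children": [], "fields": {"color": "red"}})
change_field("box1", "color", "blue")   # costs only a few bytes on the wire
print(scene)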
BIFS tools – animation protocol
• The BIFS command protocol is synchronized, but not a streaming medium
• BIFS-Anim is for continuous animation of scenes
• Modification of any value in the scene
– Viewpoints, transforms, colors, lights
• The animation stream contains only the animation values
• Differential coding – extremely efficient
Elementary stream management
• Object description
• Relations between streams and to the scene
• Auxiliary streams:
• IPMP – Intellectual Property Management and Protection
• OCI – Object Content Information
• Synchronization + packetization
– Time stamps, access unit identification, …
• System Decoder Model
• File format - a way to exchange MPEG-4 presentations
An example MPEG-4 scene
Object-based compression and
delivery
Linking streams into the scene (1)
Linking streams into the scene (2)
Linking streams into the scene (3)
Linking streams into the scene (4)
Linking streams into the scene (5)
Linking streams into the scene (6)
• An object descriptor contains ES descriptors pointing to:
• Scalable coded content streams
• Alternate-quality content streams
(so the terminal may select suitable streams)
• Object content information
• IPMP information
• ES descriptors have subdescriptors for:
• Decoder configuration (stream type, header)
• Sync layer configuration (for flexible SL syntax)
• Quality of service information (for heterogeneous nets)
• Future / private extensions
Describing scalable content
Describing alternate content versions
Decoder configuration info in older standards

cfg = configuration information ("stream headers")

Decoder configuration information in MPEG-4

• The OD (ESD) must be retrieved first
• For broadcast, ODs must be repeated periodically
The Initial Object Descriptor
• Derived from the generic object descriptor
– Contains additional elements to signal profile and level (P&L)
• P&L indications are the default way of content selection
– The terminal reads the P&L indications and knows whether it
has the capability to process the presentation
• Profiles are signaled in multiple separate dimensions
• Scene description
• Graphics
• Object descriptors
• Audio
• Visual
• The “first” object descriptor for an MPEG-4 presentation is
always an initial object descriptor
Transport of object descriptors
• Object descriptors are encapsulated in OD commands
– ObjectDescriptorUpdate / ObjectDescriptorRemove
– ES_DescriptorUpdate / ES_DescriptorRemove
• OD commands are conveyed in their own object descriptor stream
in a synchronized manner with time stamps
– Objects / streams may be announced during a presentation
• There may be multiple OD & scene description streams
– A partitioning of a large scene becomes possible
• Name scopes for identifiers (OD_ID, ES_ID) are defined
– Resource management for sub scenes can be distributed
• Resource management aspect
- If the location of streams is changed, only the ODs need modification, not the scene description
Initial OD pointing to scene and OD stream
Initial OD pointing to a scalable scene
Auxiliary streams
• IPMP streams
• Information for Intellectual Property Management and Protection
• Structured in (time stamped) messages
• Content is defined by proprietary IPMP systems
• Complemented by IPMP descriptors
• OCI (Object Content Information) streams
• Metadata for an object ("poor man's MPEG-7")
• Structured descriptors conveyed in (time stamped) messages
• Content author, date, keywords, description, language, ...
• Some OCI descriptors may be directly in ODs or ESDs
• ES_Descriptors pointing to such streams may be attached to any object
descriptor – scopes the IPMP or OCI stream
• An IPMP stream attached to the object descriptor stream is valid for all streams
Adding an OCI stream to an audio stream
Adding OCI descriptors to audio streams
Linking streams to a scene – including "upstreams"
MPEG-4 streams
Synchronization of multiple elementary streams
• Based on two well-known concepts
• Clock references
– Convey the speed of the encoder clock

• Time stamps
– Convey the time at which an event should happen

• Time stamps and clock references are:
• defined in the system decoder model
• conveyed on the sync layer (a scheduling sketch follows below)
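A sketch of how the two concepts combine: a clock reference anchors the sender's time base to the local clock, and time stamps then map to local deadlines. The 90 kHz resolution is just an example value; drift correction between successive references is omitted:

import time

class ObjectClock:
    """Recover the sender's time base from a clock reference (OCR),
    then turn composition time stamps (CTS) into local wall-clock
    deadlines. A simplified sketch."""
    def __init__(self, resolution_hz):
        self.res = resolution_hz
        self.ocr = None
        self.local_at_ocr = None

    def on_clock_reference(self, ocr_ticks):
        self.ocr = ocr_ticks
        self.local_at_ocr = time.monotonic()

    def deadline(self, cts_ticks):
        """Local time at which a composition unit should appear."""
        return self.local_at_ocr + (cts_ticks - self.ocr) / self.res

clk = ObjectClock(resolution_hz=90_000)   # an example tick rate
clk.on_clock_reference(0)
print(clk.deadline(45_000))               # 0.5 s after the OCR arrived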
System Decoder Model (1)
System Decoder Model (2)
• An ideal model of decoder behavior
– Instantaneous decoding – delay is the implementation's problem
• Incorporates the timing model
– Decoding & composition time
• Manages decoder buffer resources (a toy buffer check is sketched below)
• Useful for the encoder
• Ignores delivery jitter
• Designed for a rate-controlled "push" scenario
– Also applicable to a flow-controlled "pull" scenario
• Defines composition memory (CM) behavior
• Random-access memory holding the current composition unit
• CM resource management is not implemented
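A toy check of the buffer bookkeeping in such an idealized model, assuming bits arrive as scheduled and whole access units leave instantaneously at their decoding times:

def simulate_buffer(arrivals, removals, capacity):
    """Return True if the decoding buffer never under- or overflows.
    arrivals/removals: lists of (time, size-in-bits) events."""
    fullness = 0
    events = sorted([(t, +size) for t, size in arrivals] +
                    [(t, -size) for t, size in removals])
    for _, delta in events:
        fullness += delta
        if fullness < 0 or fullness > capacity:
            return False
    return True

# Two access units of 4000 bits, each arriving before its decode time:
ok = simulate_buffer(arrivals=[(0.0, 4000), (0.1, 4000)],
                     removals=[(0.2, 4000), (0.3, 4000)],
                     capacity=10_000)
print(ok)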
Synchronization of elementary streams with time events in the scene description

• How are time events handled in the scene description?
• How is this related to time in the elementary streams?
• Which time base is valid for the scene description?
Cooperating entities in synchronization
• Time line (“object time base”) for the scene
• Scene description stream with time stamped BIFS access
units
• Object descriptor stream with pointers to all other streams
• Video stream with (decoding & composition) time stamps
• Audio stream with (decoding & composition) time stamps
• Alternate time line for audio and video
A/V scene with time bases and stamps
Hide the video at time T1
Hide the video on frame boundary
The Synchronization Layer (SL)
• Synchronization layer (short: sync layer or SL)
• SL packet = one packet of data
• Consists of header and payload
• Defines a "wrapper syntax" for the atomic data unit: the access unit
• Indicates boundaries of access units
• AccessUnitStartFlag, AccessUnitEndFlag, AULength
• Provides consistency checking for lost packets
• Carries object clock reference (OCR) stamps
• Carries decoding and composition time stamps (DTS, CTS)
Elementary Stream Interface (1)
Elementary Stream Interface (2)
Elementary Stream Interface (3)
Elementary Stream Interface (4)
The sync layer design
• Access units are conveyed in SL packets
• Access units may use more than one SL packet
• SL packets have a header to encode the information conveyed through the ESI
• SL packets that don't start an AU have a smaller header
How is the sync layer designed?
• As flexible as possible, to be suitable for
• a wide range of data rates
• a wide range of different media streams
• Time stamps have
• variable length
• variable resolution
• The same holds for clock reference (OCR) values
• The OCR may come via another stream
• An alternative to time stamps exists for lower bit rates
• Indication of the start time and duration of units (accessUnitDuration, compositionUnitDuration)
SLConfigDescriptor syntax example (SDL – Syntax Description Language)

class SLConfigDescriptor {
  uint(8) predefined;
  if (predefined==0) {
    bit(1) useAccessUnitStartFlag;
    bit(1) useAccessUnitEndFlag;
    bit(1) useRandomAccessPointFlag;
    bit(1) usePaddingFlag;
    bit(1) useTimeStampsFlag;
    uint(32) timeStampResolution;
    uint(32) OCRResolution;
    uint(6) timeStampLength;
    uint(6) OCRLength;
    if (!useTimeStampsFlag) {
      ................
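To illustrate how this configuration drives parsing, here is a hypothetical Python sketch: every SL header field is optional and its length configurable, so the header can shrink to almost nothing for low-bit-rate streams. The field order is simplified relative to the actual specification:

class BitReader:
    def __init__(self, data):
        self.bits = "".join(f"{b:08b}" for b in data)
        self.pos = 0
    def read(self, n):
        v = int(self.bits[self.pos:self.pos + n], 2)
        self.pos += n
        return v

def parse_sl_header(reader, cfg):
    """Parse a (simplified) SL packet header under a given config."""
    hdr = {}
    if cfg["useAccessUnitStartFlag"]:
        hdr["au_start"] = reader.read(1)
    if cfg["useAccessUnitEndFlag"]:
        hdr["au_end"] = reader.read(1)
    if cfg["useTimeStampsFlag"] and hdr.get("au_start"):
        hdr["dts"] = reader.read(cfg["timeStampLength"])
        hdr["cts"] = reader.read(cfg["timeStampLength"])
    return hdr

cfg = dict(useAccessUnitStartFlag=1, useAccessUnitEndFlag=1,
           useTimeStampsFlag=1, timeStampLength=32)
print(parse_sl_header(BitReader(b"\xc0" + b"\x00" * 9), cfg))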
Wrapping SL packets in a suitable layer
MPEG-4 Delivery Framework (DMIF)
The MPEG-4 Layers and DMIF
• DMIF hides the delivery technology
• Adopts QoS metrics
• Compression Layer
• Media aware
• Delivery unaware
• Sync Layer
• Media unaware
• Delivery unaware
• Delivery Layer
• Media unaware
• Delivery aware
DMIF communication architecture
Multiplex of elementary streams
• Not a core MPEG task
• Just responds to specific needs of MPEG-4 content transmission
• Low delay
• Low overhead
• Low complexity
• This prompted the design of the "FlexMux" tool (its simple mode is sketched below)
• A single file format is desirable
• This led to the design of the MPEG-4 file format
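A sketch of FlexMux's simple mode, assuming the 2-byte header layout (8-bit channel index, 8-bit length) in front of each SL packet; MuxCode mode, which interleaves several channels per packet, is not shown:

def flexmux_simple(channel_index, sl_packet):
    """Prefix one SL packet with a 2-byte FlexMux header so that
    several low-rate streams can share a channel with minimal
    overhead."""
    if len(sl_packet) > 255:
        raise ValueError("simple mode carries at most 255 bytes")
    return bytes([channel_index, len(sl_packet)]) + sl_packet

mux = flexmux_simple(3, b"audio SL packet") + flexmux_simple(7, b"bifs")
print(mux.hex())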
Modes of FlexMux
How to configure MuxCode mode?
A multiplex example
Multiplexing audio channels in FlexMux
Multiplexing all channels to MPEG-2 TS
MPEG-2 Transport Stream
MPEG-4 content access procedure
• Locate an MPEG-4 content item (e.g. by URL) and connect to it
– via the DMIF Application Interface (DAI)
• Retrieve the Initial Object Descriptor
• This Object Descriptor points to a BIFS + OD stream
– Open these streams via the DAI
• The scene description points to other streams through Object Descriptors
– Open the required streams via the DAI
• Start playing! (a toy walk-through is sketched below)
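A runnable toy walk-through of these steps; the Session class and its methods are invented for illustration and do not reflect the actual DAI primitive names:

class Session:
    def __init__(self, url):
        self.url = url                      # 1. locate and connect
    def initial_object_descriptor(self):
        # 2. the IOD points at the scene (BIFS) and OD streams
        return {"scene_es": 1, "od_es": 2}
    def open_stream(self, es_id):
        print(f"opening elementary stream {es_id} via DAI")
        return es_id

def play(url):
    s = Session(url)
    iod = s.initial_object_descriptor()
    s.open_stream(iod["scene_es"])          # 3. open BIFS + OD streams
    s.open_stream(iod["od_es"])
    for es_id in (3, 4):                    # 4. streams named by the ODs
        s.open_stream(es_id)
    print("start playing!")                 # 5.

play("mpeg4://example/content")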
MPEG-4 content access example
