Introduction To MPEG Standards

Saturn Software Mills
Visit for more technical articles: http://www.saturnsoftmills.com/FreeDownloads.htm
Introduction to MPEG Video Stream

Digital techniques have made rapid progress in audio and
video. Digital information is more robust and error
resilient than analog information. This means that generation
losses during recording and losses in transmission can be
eliminated. The compact disc (CD) was the first consumer
product to demonstrate this. Digital recording and
transmission techniques also allow content manipulation that
is not possible in analog. Once audio or video is digitized,
the contents are in the form of data, and such data can be
handled in the same way as any other kind of data. However,
production-standard digital video generates over 200 megabits
per second of data, and this bit rate requires extensive
capacity for storage and wide bandwidth for transmission.
These storage and bandwidth requirements can be reduced by
compression. Compression is a way of expressing digital
audio and video using less data.
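As a rough check of the figure of over 200 megabits per
second, the sketch below computes the bit rate of the active
picture of uncompressed video. The parameters (720x576
pixels, 25 pictures per second, 4:2:2 sampling, 10 bits per
sample) are assumptions chosen to resemble a
standard-definition studio format; they are not taken from
the text above.

```python
def raw_bit_rate(width, height, fps, bits_per_sample, samples_per_pixel):
    """Return the uncompressed bit rate in bits per second."""
    return width * height * fps * bits_per_sample * samples_per_pixel

# Assumed parameters. With 4:2:2 sampling each pixel carries one Y sample,
# and each pair of pixels shares one Cb and one Cr sample, which averages
# to 2 samples per pixel.
rate = raw_bit_rate(width=720, height=576, fps=25,
                    bits_per_sample=10, samples_per_pixel=2)
print(f"{rate / 1e6:.2f} Mbit/s")  # 207.36 Mbit/s
```

With 8 bits per sample the same function gives about 166
Mbit/s, so the exact figure depends on the format assumed.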
[Figure: hierarchy of an MPEG video stream. Video Sequence >
Group of Pictures > Picture > Slice > Macroblock > Block
(8 pixels x 8 pixels). Structure of macroblock: Y blocks 1
to 4, Cb block 5, Cr block 6.]
Figure 1. Video Sequence in MPEG stream
MPEG is one of the most popular audio/video compression
techniques because it is not just a single standard.
Instead, it is a range of standards suitable for different
applications but based on similar principles. MPEG is an
acronym for the Moving Picture Experts Group, established by
ISO (International Organization for Standardization) and IEC
(International Electrotechnical Commission). A video is a
sequence of pictures, and each picture is an array of
pixels. This video data is organized in a hierarchical
fashion in an MPEG video stream. An MPEG video sequence
consists of several layers: group of pictures (GOP),
picture, slice, macroblock, and block. The complete
hierarchy is shown in figure 1.
Video Sequence
A video sequence begins with a sequence header, includes one
or more groups of pictures, and ends with an end-of-sequence
code.
Group of Pictures (GOP)
A GOP consists of a header and a series of one or more
pictures, and is intended to allow random access into the
sequence.
Picture
This is the primary coding unit of a video sequence. A
picture consists of three rectangular matrices representing
luminance (Y) and two chrominance (Cb and Cr) values. The Y
matrix has an even number of rows and columns. The Cb and Cr
matrices are one half the size of the Y matrix in both the
horizontal and vertical directions.
Slice
A slice contains one or more contiguous macroblocks. The
order of the macroblocks within a slice is from left to
right and top to bottom. Slices are important in the
handling of errors. If the bitstream contains an error, the
decoder can skip to the start of the next slice.
Macroblock
This is the basic coding unit in the MPEG algorithm. It is a
16x16 pixel segment in a frame. If each chrominance
component has one-half the vertical and horizontal
resolution of the luminance component, a macroblock
consists of four Y blocks, one Cr block, and one Cb block.
Block
This is the smallest coding unit in the MPEG algorithm. It
consists of 8x8 pixels and can be one of three types:
luminance (Y), red chrominance (Cr), or blue chrominance (Cb).
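The relationship between picture, macroblock, and block can
be illustrated with a small calculation. This is a minimal
sketch that assumes the chrominance layout described above
(Cb and Cr at half resolution in both directions) and a
picture whose width and height are multiples of 16. The
function name and the 352x288 example size are illustrative
assumptions, not part of the original text.

```python
def picture_layout(width, height):
    """Count macroblocks and 8x8 blocks for a picture whose Cb and Cr
    matrices are half the size of Y in both directions.

    Assumes width and height are multiples of 16, so the picture
    divides into whole 16x16 macroblocks.
    """
    if width % 16 or height % 16:
        raise ValueError("width and height must be multiples of 16")
    macroblocks = (width // 16) * (height // 16)
    return {
        "chroma_size": (width // 2, height // 2),  # size of Cb and of Cr
        "macroblocks": macroblocks,
        "y_blocks": macroblocks * 4,   # four 8x8 Y blocks per macroblock
        "cb_blocks": macroblocks,      # one 8x8 Cb block per macroblock
        "cr_blocks": macroblocks,      # one 8x8 Cr block per macroblock
    }

layout = picture_layout(352, 288)  # assumed example picture size
print(layout["macroblocks"])       # 396
print(layout["y_blocks"])          # 1584
```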
Picture Types
The MPEG standard specifically defines three types of
pictures:
• Intra Pictures (I-Pictures)
• Predicted Pictures (P-Pictures)
• Bidirectional Pictures (B-Pictures)
These three types of pictures are combined to form a group
of pictures (GOP). Typical GOP structures are as follows:
IBBPBBPBBPBBPI……
IPPIPPIPPIPPIP……
IIIIIIIIIIIIII……
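The three patterns above can be generated from two numbers:
the GOP length and the spacing between I or P pictures. The
sketch below is only an illustration; the function and its
parameter names n and m are assumptions introduced here and
do not come from the original text.

```python
def gop_pattern(n, m):
    """Build the picture-type string for one GOP.

    n: number of pictures in the GOP (distance between I-pictures)
    m: distance between consecutive I or P pictures; m - 1 B-pictures
       sit between them.
    """
    if n < 1 or m < 1:
        raise ValueError("n and m must be positive")
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")      # every GOP starts with an I-picture
        elif i % m == 0:
            types.append("P")
        else:
            types.append("B")
    return "".join(types)

print(gop_pattern(13, 3))  # IBBPBBPBBPBBP
print(gop_pattern(3, 1))   # IPP
print(gop_pattern(1, 1))   # I
```

Repeating each returned string reproduces the three example
sequences listed above.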
Intra Pictures
Intra pictures, or I-pictures, are coded using only
information present in the picture itself, and provide
potential random access points into the compressed video
data. They use only transform coding and provide moderate
compression.
Predicted Pictures
Predicted pictures, or P-pictures, are coded with respect
to the nearest previous I- or P-picture. This technique is
called forward prediction. P-pictures use motion
compensation to provide more compression than is possible
with I-pictures.
Bidirectional Pictures
Bidirectional pictures, or B-pictures, are pictures that
use both a past and a future picture as references. This
technique is called bidirectional prediction. B-pictures
provide the most compression because they can draw on both
a past and a future picture, but they also require the most
computation time.
Encoding Intra Picture

The MPEG transform coding algorithm for an intra picture
includes the following steps:
• Discrete cosine transform (DCT)
• Quantization
• Run-length encoding
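The first two steps listed above can be sketched for a
single 8x8 block as follows. This is a simplified
illustration, not the normative MPEG procedure: it uses one
uniform quantization step for all coefficients, which is an
assumption made here for brevity, and the function names are
hypothetical.

```python
import math

N = 8  # block size

def dct_2d(block):
    """Forward 8x8 DCT-II of a block given as a list of 8 rows."""
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * total
    return out

def quantize(coeffs, step):
    """Divide every coefficient by one step size (an assumed
    simplification) and round to the nearest integer."""
    return [[math.floor(value / step + 0.5) for value in row]
            for row in coeffs]

# A smooth test block: a gentle horizontal ramp, the same in every row.
block = [[100 + 4 * col for col in range(N)] for _ in range(N)]
quantized = quantize(dct_2d(block), step=16)
nonzero = sum(1 for row in quantized for value in row if value != 0)
print("DC coefficient:", quantized[0][0])
print("nonzero coefficients out of 64:", nonzero)
```

For a smooth block like this one, almost all of the 64
quantized coefficients are zero, which is what the later
run-length step exploits.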
[Figure: for every 8x8 block of a macroblock: DCT ->
Quantization -> Zig-Zag scan -> RLE -> Huffman -> output
bits (e.g. 01100010).]
Figure 2. Encoding of Intra Picture
An 8x8 block in a picture generally contains high spatial
redundancy. To reduce this redundancy, the MPEG algorithm
transforms 8x8 blocks of pixels from the spatial domain to
the frequency domain with the discrete cosine transform
(DCT). The combination of DCT and quantization results in
many of the high-frequency coefficients being zero. To take
maximum advantage of this, the coefficients are read out in
zigzag order to produce long runs of zeros. This zigzag
sequence is run-length encoded as pairs of (zero run,
coefficient value), and the pairs are then coded with a
variable-length code (Huffman encoding), which uses shorter
codes for commonly occurring pairs and longer codes for less
common pairs. The intra picture coding steps are shown in
figure 2.
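The zigzag scan and run-length step described above can be
sketched as follows. This is a minimal illustration under
stated assumptions: every coefficient, including the first
one, is treated the same way, the end of the block is marked
with a hypothetical "EOB" token, and the final Huffman stage
is omitted because it depends on predefined code tables.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):  # s = row + col identifies one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right,
        # odd diagonals from top-right to bottom-left.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order

def run_length_encode(block):
    """Encode a quantized block as (zero_run, level) pairs plus "EOB"."""
    scanned = [block[r][c] for r, c in zigzag_order(len(block))]
    pairs, run = [], 0
    for level in scanned:
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    pairs.append("EOB")  # the trailing zeros are implied by this marker
    return pairs

# Assumed example: a quantized block with only three nonzero coefficients.
quantized = [[0] * 8 for _ in range(8)]
quantized[0][0], quantized[0][1], quantized[2][0] = 57, -5, 3
print(run_length_encode(quantized))  # [(0, 57), (0, -5), (1, 3), 'EOB']
```

Sixty-four coefficients are reduced to three pairs and an
end marker, because the zigzag order groups the zeros into
one long final run.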
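[Figure: a macroblock of the target frame is compared with
its best match in the reference frame. The difference
passes through DCT, quantization, RLE, and Huffman coding,
and is output together with the motion vector as coded bits
(e.g. 10010011).]
Figure 3. Encoding of Predicted Picture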

Encoding of Predicted Picture

A P-picture is coded with reference to a previous image
(the reference image), which is an I- or P-picture, as shown
in figure 3. Motion-compensated prediction is used to
exploit the temporal redundancy. Since the frames are
closely related, it is possible to accurately represent or
predict the data of one frame based on the data of a
reference image, provided the translation is estimated.
This translation is known as the motion vector of the
macroblock. In P-pictures, each 16x16 macroblock is
predicted from a macroblock of the previously encoded
reference picture. A search is conducted in the reference
frame to find the macroblock that most closely matches the
macroblock under consideration in the P frame. The
difference between the two macroblocks is the prediction
error. This error is transformed with the DCT and quantized.
Finally, run-length encoding and Huffman encoding are used
to encode the data.
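The search for the best match can be sketched as an
exhaustive block-matching search. The text above does not
say how closeness is measured or how large the search area
is, so the sum of absolute differences (SAD) and the +/-7
pixel window used here are assumptions, as are the function
names.

```python
def sad(target, reference, top, left, size):
    """Sum of absolute differences between the target block and the
    reference block whose top-left corner is at (top, left)."""
    return sum(abs(target[r][c] - reference[top + r][left + c])
               for r in range(size) for c in range(size))

def find_motion_vector(target, reference, top, left, size=16, search=7):
    """Try every offset in a +/-search window around (top, left).

    target: size x size macroblock from the picture being coded
    reference: the whole reference picture as a list of rows
    Returns ((dy, dx), cost) for the candidate with the lowest SAD.
    """
    height, width = len(reference), len(reference[0])
    best_vector, best_cost = None, None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            r0, c0 = top + dy, left + dx
            if r0 < 0 or c0 < 0 or r0 + size > height or c0 + size > width:
                continue  # candidate lies partly outside the reference
            cost = sad(target, reference, r0, c0, size)
            if best_cost is None or cost < best_cost:
                best_vector, best_cost = (dy, dx), cost
    return best_vector, best_cost

def prediction_error(target, reference, top, left, vector, size=16):
    """Difference between the target macroblock and its best match."""
    dy, dx = vector
    return [[target[r][c] - reference[top + dy + r][left + dx + c]
             for c in range(size)] for r in range(size)]

# Synthetic test data: the target macroblock at position (8, 8) is a copy
# of the reference content found 2 rows down and 3 columns to the right.
reference = [[(7 * r + 13 * c) % 256 for c in range(32)] for r in range(32)]
target = [row[11:27] for row in reference[10:26]]
vector, cost = find_motion_vector(target, reference, top=8, left=8)
print(vector, cost)  # (2, 3) 0
```

The error block returned by prediction_error is what would
then go through the DCT, quantization, and entropy coding
stages.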
Encoding of Bi-directional Pictures

A B-picture is a bidirectionally predicted picture. Two
frames are used to predict the current B-picture: the
previous reference frame and the next reference frame. Hence
B-pictures are coded like P-pictures, except that the motion
vectors can reference the previous reference picture, the
next reference picture, or both. Consider a B-picture B. B
will be predicted from two reference frames R1 and R2, where
R1 is the previous I/P picture and R2 is the next I/P
picture. For each macroblock MB of B, find the closest match
MB1 in R1 and MB2 in R2. The predicted macroblock PM is
calculated as given below.
PM = NINT (α1 MB1 + α2 MB2)

where NINT is the nearest-integer operator and α1 and α2 are
chosen as follows:
α1 = 0.5 and α2 = 0.5 if both matches are satisfactory.
α1 = 1 and α2 = 0 if only the first match is satisfactory.
α1 = 0 and α2 = 1 if only the second match is satisfactory.
α1 = 0 and α2 = 0 if neither match is satisfactory.
Finally, the error block E is computed by taking the
difference between MB and PM. This error block E is then
coded using the same steps as intra coding.
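The prediction formula and the error block can be sketched
as follows. The text does not say how a match is judged
satisfactory, so the two flags are hypothetical inputs, and
rounding halves upward in NINT is an assumption made here.

```python
import math

def nint(x):
    """Nearest integer, with halves rounded up (an assumed convention;
    Python's built-in round() rounds halves to the nearest even value)."""
    return math.floor(x + 0.5)

def weights(match1_ok, match2_ok):
    """Select (alpha1, alpha2) according to the four cases above."""
    if match1_ok and match2_ok:
        return 0.5, 0.5
    if match1_ok:
        return 1.0, 0.0
    if match2_ok:
        return 0.0, 1.0
    return 0.0, 0.0

def predict_b_macroblock(mb, mb1, mb2, match1_ok, match2_ok):
    """Return the predicted macroblock PM and the error block E = MB - PM."""
    a1, a2 = weights(match1_ok, match2_ok)
    pm = [[nint(a1 * p1 + a2 * p2) for p1, p2 in zip(row1, row2)]
          for row1, row2 in zip(mb1, mb2)]
    e = [[m - p for m, p in zip(mb_row, pm_row)]
         for mb_row, pm_row in zip(mb, pm)]
    return pm, e

# Tiny 2x2 stand-ins for 16x16 macroblocks, to keep the example readable.
mb = [[10, 12], [14, 16]]
mb1 = [[9, 12], [13, 18]]
mb2 = [[12, 13], [14, 15]]
pm, e = predict_b_macroblock(mb, mb1, mb2, match1_ok=True, match2_ok=True)
print(pm)  # [[11, 13], [14, 17]]
print(e)   # [[-1, -1], [0, -1]]
```

When neither match is satisfactory, PM is all zeros and E
equals MB, so the macroblock is in effect coded on its own.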
