
UNIT III

VIDEO COMPRESSION

Video compression can be defined as reducing the file size of a video by discarding some of its information and, in lossy schemes, some visual quality.

ADVANTAGES:

1. Optimal use of storage space
2. Reduced storage and distribution cost
3. Faster file transfer
4. Compressed videos are easier to transmit
5. Less network bandwidth consumed
6. On bandwidth-limited connections, compression can improve playback quality by avoiding stalls
TYPES OF VIDEO COMPRESSION:
Lossy video compression:
The compressed file contains less data than the original file. Image and sound information that occurs repeatedly may be removed, cutting out the repeated parts at the cost of some quality.
Lossless video compression:
The original and the compressed videos look identical, and none of the data is lost.
FRAME:
A frame is a single still image in a sequence of pictures. In general, one second of video is comprised of 24 to 30 frames.
Picture types in video compression
In video compression, a video frame is compressed using different algorithms with different advantages and disadvantages, centered mainly on the amount of data compression. These different algorithms for video frames are called picture types or frame types.
The three major picture types used in the different video algorithms are I, P, and B. They differ in the following characteristics:

 I (intra-coded) frames are compressed independently of other frames and are the least compressible.
 P (predictively coded) frames are coded based on previous frames.
 B (bidirectionally predicted) frames are coded based on both a previous frame and a future frame.

-----------------------------------------------------------------------------------------------------------------

MOTION COMPENSATION:

Motion compensation describes a picture in terms of the change from a reference picture. The reference picture may be previous in time (past) or from the future.

In most video sequences there is little change in the contents of the image from one frame to the next. Even in sequences that depict a great deal of activity, there are significant portions of the image that do not change from one frame to the next. Most video compression schemes therefore use the previous frame to generate a prediction for the current frame.

Example:

Consider the two frames of a motion video sequence shown in the figure.

The only differences between the two frames are that the image has moved slightly down and to the right of the frame, and the triangular object has moved to the left. The differences between the two frames are so slight that, if the first frame were available to both the transmitter and the receiver, not much information would need to be transmitted in order to reconstruct the second frame: only the change between the previous frame and the current frame is sent, as sketched below. To use a previous frame to predict the pixel values in the frame being encoded, we have to take the motion of objects in the image into account. This is called motion compensation.
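To make this concrete, here is a minimal sketch (an illustration, not part of any standard; NumPy arrays stand in for frames) of transmitting only the change between frames:

```python
import numpy as np

def encode_difference(previous, current):
    """Send only the change between the previous and the current frame."""
    return current.astype(np.int16) - previous.astype(np.int16)

def decode_difference(previous, difference):
    """Reconstruct the current frame from the previous frame plus the change."""
    return (previous.astype(np.int16) + difference).astype(np.uint8)

# Two tiny "frames" that differ in a single pixel.
prev_frame = np.zeros((4, 4), dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[1, 2] = 200

diff = encode_difference(prev_frame, curr_frame)    # almost all zeros, cheap to code
restored = decode_difference(prev_frame, diff)
assert np.array_equal(restored, curr_frame)
```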

Block based motion compensation:

In this approach, the frame being encoded is divided into blocks of size M × M. For each block, we search the previous reconstructed frame for the block of size M × M that most closely matches the block being encoded.

We can measure the closeness of a match, or distance, between two blocks by the sum of absolute differences (SAD) between corresponding pixels in the two blocks.

If the distance from the block being encoded to the closest block in the previous reconstructed frame is greater than a prespecified threshold, the block is declared uncompensable and is encoded separately.

If the distance is below the threshold, a motion vector is transmitted to the receiver and used for prediction.

For example, suppose the block being encoded occupies the region with corners (24, 40) and (31, 47), and the best-matching block in the previous reconstructed frame occupies the region with corners (21, 43) and (28, 50).

Then the motion vector is (21 − 24, 43 − 40) = (−3, +3).

If the x component is positive, the best-matching block lies to the right of the block being encoded; if it is negative, to the left. If the y component is positive, the best-matching block lies below the block being encoded; if it is negative, above.

Example

Let us again try to predict the second frame of the example using motion compensation. We divide the image into blocks and then predict the second frame from the first.

The motion vector is the displacement between the block being encoded and the matching block, an integer number of pels in the x and y directions. If the displacement is instead measured in half-pel units, the prediction can follow the motion more closely; to allow this, the coded reference frame is doubled in each direction by interpolation.

In the figure, A, B, C, and D are pels of the original frame. The half-pel values h1 and h2 are obtained from their two horizontal neighbors (e.g., h1 = (A + B)/2), v1 and v2 from their two vertical neighbors (e.g., v1 = (A + C)/2), and c is the average of the four neighboring pels from the original, c = (A + B + C + D)/4.
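A sketch of the doubling by interpolation; bilinear averaging is assumed, matching the h, v, and c definitions above:

```python
import numpy as np

def half_pel_upsample(frame):
    """Double the frame in both directions: original pels land on even
    coordinates; half-pel positions are averages of their neighbors."""
    h, w = frame.shape
    f = frame.astype(np.float32)
    up = np.zeros((2 * h - 1, 2 * w - 1), dtype=np.float32)
    up[::2, ::2] = f                                      # original pels A, B, C, D
    up[::2, 1::2] = (f[:, :-1] + f[:, 1:]) / 2            # h positions (horizontal)
    up[1::2, ::2] = (f[:-1, :] + f[1:, :]) / 2            # v positions (vertical)
    up[1::2, 1::2] = (f[:-1, :-1] + f[:-1, 1:] +
                      f[1:, :-1] + f[1:, 1:]) / 4         # c positions (centers)
    return up
```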

---------------------------------------------------------------------------------------------------------------------
VIDEO SIGNAL REPRESENTATION:
The different representations of video signals in use today have been shaped by the history of television.
BLACK-AND-WHITE
A black-and-white television screen is coated with phosphor. A B&W picture is generated by exciting the phosphor on the screen with an electron beam whose intensity is modulated to produce the image we see. The path that the modulated electron beam traces is shown in the figure.

The horizontal sweep from left to right is called a trace; it draws one line of the image.
To trace the second line, the electron beam has to be deflected back to the left. This is called a retrace, and during the retrace the electron gun is turned off.
Tracing and retracing fast enough to be invisible to the eye would require high bandwidth and high cost.
To reduce the cost, the system sends 525 lines 30 times a second. These 525 lines are said to constitute a frame.
However, 1/30 of a second between frames is long enough for the image to appear to flicker. To avoid the flicker, the image is divided into two interlaced fields, and a field is sent every 1/60 of a second. First, one field consisting of 262.5 lines is traced by the electron beam. Then, the second field, consisting of the remaining 262.5 lines, is traced between the lines of the first field. The situation is shown in the figure.
Because some lines are lost while the gun is repositioned, only about 486 lines are visible.
Color Television:
Three electron guns are used. These guns excite red, green, and blue phosphor dots embedded in the screen. The beam from each gun strikes only one kind of phosphor: the red gun strikes only the red phosphor, the blue gun only the blue phosphor, and the green gun only the green phosphor.
In order to control the three guns we need three signals: a red signal, a blue signal, and a green signal. If we transmitted each of these separately, we would need three times the bandwidth. With the advent of color television there was also the problem of backward compatibility. Hence composite color signals are used.
The composite color signal consists of a luminance component, corresponding to the black-and-white
television signal, and two chrominance components.
The luminance component is denoted by Y and is obtained as

Y = 0.299R + 0.587G + 0.114B

where R is the red component, G is the green component, and B is the blue component.
The two chrominance signals are obtained as

Cb = B − Y
Cr = R − Y
These three signals can be used by the color television set to generate the red, blue, and green signals
needed to control the electron guns. The luminance signal can be used directly by the black-and-white
televisions.
However, there is a catch: the eye is much less sensitive to changes in chrominance than in luminance, so the chrominance components can be sampled at a lower rate, which affects the visual quality only to a small extent.

Sampling format is shown in the figure.


Y – luminance
U and V – chrominance
Ys – sampled Y
Cbs and Crs – sampled Cb and Cr
The Y component takes values between 16 and 235, while the U and V components take values between 16 and 240.
-------------------------------------------------------------------------------------------------------------------------------

H.261

 H.261 is an algorithm that determines how to encode and compress video data electronically.

 It is a video coding standard published by the ITU (International Telecommunication Union) in 1990.

 It has been one of the most widely used international compression techniques for encoding videos.

 The H.261 encoding technique can encode only the video part of an audiovisual service.

 H.261 is designed for two-way communication over ISDN lines (video conferencing and video calling) and supports data rates in multiples of 64 kbps.

DCT (discrete cosine transform)

In H.261 an 8×8 DCT is used. This transform converts an 8×8 pel block into another 8×8 block of frequency coefficients.

Inverse DCT (IDCT)

The IDCT converts the 8×8 coefficient block back into the original 8×8 pel block.
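A quick check of the DCT/IDCT round trip, sketched with SciPy's 1-D DCT applied along both axes (the orthonormal variant is an implementation choice, not mandated by H.261):

```python
import numpy as np
from scipy.fftpack import dct, idct

def dct2(block):
    """Forward 2-D DCT of an 8x8 pel block (rows, then columns)."""
    return dct(dct(block, axis=0, norm="ortho"), axis=1, norm="ortho")

def idct2(coeffs):
    """Inverse 2-D DCT; recovers the original 8x8 pel block."""
    return idct(idct(coeffs, axis=0, norm="ortho"), axis=1, norm="ortho")

block = np.random.randint(0, 256, (8, 8)).astype(np.float32)
assert np.allclose(idct2(dct2(block)), block, atol=1e-3)   # round trip is lossless
```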

Quantization

Quantization reduces a range of numbers to a small set of values, so fewer bits are needed to represent a large number. For example, rounding a real number to an integer is a kind of quantization.

A matrix called the quantizer, Q[i,j], defines the quantization step. Every time a pixel matrix X[i,j] of the same size as Q[i,j] arrives, X[i,j] is divided element by element by Q[i,j] to obtain the quantized matrix Xq[i,j].

Quantization equation: Xq[i,j] = Round( X[i,j] / Q[i,j] )

Inverse quantization equation: X'[i,j] = Xq[i,j] * Q[i,j]

Inverse quantization (dequantization) reconstructs an approximation of the original value. Because the quantization equation uses the Round() function to get the nearest integer, the reconstructed value will in general not be the same as the original value.

The difference between the actual value and the value reconstructed from the quantized value is called the quantization error.

In general, if Q[i,j] is carefully designed, visual quality will not be noticeably affected.

An example of the quantization process with Q[i,j] = 2: a coefficient X = 5 becomes Xq = Round(5/2) = 3 (rounding half up), and is reconstructed as X' = 3 × 2 = 6, giving a quantization error of 1.
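The two equations in code form, using a uniform step of 2 everywhere (NumPy rounds halves to even, which differs slightly from rounding half up but keeps the error within Q/2):

```python
import numpy as np

Q = np.full((8, 8), 2)                    # quantizer matrix: every step is 2
X = np.arange(64).reshape(8, 8)           # a sample 8x8 coefficient block

Xq = np.round(X / Q).astype(int)          # Xq[i,j] = Round( X[i,j] / Q[i,j] )
X_rec = Xq * Q                            # X'[i,j] = Xq[i,j] * Q[i,j]

print(np.abs(X - X_rec).max())            # quantization error: at most Q/2 = 1
```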
Zig-Zag Scan and Run-Length Encoding (RLE)

After DCT and quantization most AC values will be zero. By using a zig-zag scan we can gather even more consecutive zeros, and then use RLE to gain a better compression ratio. A zig-zag scan example:

After the zig-zag scan many zeros are adjacent, so we encode the bitstream as (skip, value) pairs, where skip is the number of zeros and value is the next non-zero coefficient.
The zig-zag scan and RLE are applied only to the AC coefficients. For the DC coefficient we apply the DPCM coding method.
Example of RLE:
The value 168 is the DC coefficient, so it is not coded here.
EOB is the end-of-block code defined in the MPEG-1 standard.
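A sketch of both steps; the scan order below follows the conventional 8×8 zig-zag pattern, and (skip, value) pairs are built exactly as described:

```python
import numpy as np

def zigzag(block):
    """Read an 8x8 block along its anti-diagonals, alternating direction."""
    n = block.shape[0]
    order = []
    for s in range(2 * n - 1):
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        if s % 2 == 0:
            diag.reverse()                # even diagonals run bottom-left to top-right
        order.extend(diag)
    return [block[i, j] for i, j in order]

def rle(ac):
    """Encode AC coefficients as (skip, value) pairs plus a final EOB marker."""
    pairs, skip = [], 0
    for v in ac:
        if v == 0:
            skip += 1
        else:
            pairs.append((skip, int(v)))
            skip = 0
    pairs.append("EOB")                   # all remaining zeros collapse into EOB
    return pairs

block = np.zeros((8, 8), dtype=int)
block[0, 0], block[0, 1], block[2, 0] = 168, 5, -3
scan = zigzag(block)
print(rle(scan[1:]))                      # DC value 168 skipped -> [(0, 5), (1, -3), 'EOB']
```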

Motion Compensation:

Motion compensation in H.261 works as described earlier: the picture is described in terms of the change from a reference picture, and the previous frame is used to generate a prediction for the current frame (see the Motion Compensation section above).
Motion Estimation:

Motion estimation is a technique used in various disciplines, such as video encoding, computer vision, and robotics, to assess and predict the movement of objects within a sequence of images or frames. In video coding, it is the search step that produces the motion vectors used by motion compensation.

Linear Filter:
Sometimes sharp edges in the block used for prediction result in sharp changes in the prediction error. To reduce these sharp changes, a linear (smoothing) filter is applied to the prediction block, as sketched below.
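H.261 specifies a separable two-dimensional loop filter for this; the sketch below assumes [1, 2, 1]/4 taps in each direction, with border pels passed through unchanged:

```python
import numpy as np

def smooth(block):
    """Separable [1, 2, 1]/4 low-pass filter over the interior of a block."""
    b = block.astype(np.float32)
    out = b.copy()
    out[:, 1:-1] = (b[:, :-2] + 2 * b[:, 1:-1] + b[:, 2:]) / 4   # horizontal pass
    b = out.copy()
    out[1:-1, :] = (b[:-2, :] + 2 * b[1:-1, :] + b[2:, :]) / 4   # vertical pass
    return out
```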

DIFFERENCE BETWEEN H.261 AND H.263

1. H.261 (1990) was designed for ISDN at p × 64 kbps; H.263 (1996) also targets lower bit rates, including PSTN modem links.
2. H.261 supports only the QCIF and CIF picture formats; H.263 adds SQCIF, 4CIF, and 16CIF.
3. H.261 uses integer-pel motion compensation with a loop filter; H.263 uses half-pel motion compensation.
4. H.263 adds optional modes (e.g., unrestricted motion vectors and PB-frames) and achieves better compression than H.261 at the same quality.
------------------------------------------------------------------------------------------------------------------------

MPEG COMPRESSION:
The Full Form of MPEG is the Moving Picture Experts Group.
It is a working group of experts formed by the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) to set standards for audio and video compression and transmission.
The standards developed by the group are known as the MPEG standards. These standards
are used in a wide range of applications, including DVD, digital television, and streaming
media. The group is made up of experts in the fields of audio and video engineering,
computer science, and telecommunications. It was formed in 1988 and is headquartered in
Geneva, Switzerland.

Some of the key MPEG standards include:

1. MPEG-1: This standard is used for the coding and compression of audio and video
signals for CD-ROMs and other storage media.
2. MPEG-2: This standard is used for the coding and compression of audio and video
signals for digital television and DVD.
3. MPEG-4: This standard is used for the coding and compression of audio and video
signals for a variety of applications, including streaming video, mobile devices, and
interactive media.
4. H.264/MPEG-4 AVC: This standard is used for the coding and compression of high-
definition video for a variety of applications, including Blu-ray, HDTV, and online
video.

Features of the Moving Picture Experts Group standards

1. Efficient compression: large reductions in file size with acceptable quality loss.
2. Wide range of applications: DVD, digital television, streaming, and mobile media.
3. Support for a variety of content types: audio, video, and associated metadata.
4. Compatibility with other standards: MPEG codecs are widely supported across devices and container formats.
5. Regular updates: the standards are revised and extended as technology evolves.

Benefits of the Moving Picture Experts Group standards

1. Improved quality: better picture and sound at a given bit rate.
2. Increased efficiency: lower storage and bandwidth requirements.
3. Compatibility: content plays across a broad range of hardware and software.
4. Innovation: common standards give vendors a stable base to build on.
5. Interoperability: equipment from different manufacturers can exchange content.

Limitations of the Moving Picture Experts Group standards

1. Complexity: encoders and decoders are computationally demanding.
2. License fees: some MPEG technologies require patent royalties.
3. Limited compatibility: older devices may not support newer MPEG standards.
4. Limited quality: lossy compression discards information, capping the achievable quality.
5. Changing technology: newer codecs can supersede existing MPEG standards.
MPEG-1

In MPEG-1, video is represented as a sequence of pictures, and each picture is treated as a two-dimensional array of pixels (pels). The color of each pel consists of three components: Y (luminance) and Cb and Cr (two chrominance components).

Color space conversion:

Each pixel in a picture consists of three components: R (red), G (green), and B (blue). In MPEG-1, (R, G, B) must be converted to (Y, Cb, Cr) before the components are processed.

The conversion equations are

Y = 0.299R + 0.587G + 0.114B
Cb = 0.564(B − Y)
Cr = 0.713(R − Y)

In MPEG-1, Y's resolution is 4 times that of Cb and of Cr (a factor of 2 horizontally and 2 vertically), i.e., 4:2:0 subsampling.
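A sketch of the conversion and the 2×2 chroma subsampling; the BT.601 scale factors match the equations above, and averaging is one common way to subsample:

```python
import numpy as np

def rgb_to_ycbcr(r, g, b):
    """Convert (R, G, B) planes to (Y, Cb, Cr) using the equations above."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    return y, 0.564 * (b - y), 0.713 * (r - y)

def subsample_420(c):
    """4:2:0 subsampling: average each 2x2 block of chroma into one sample."""
    return (c[0::2, 0::2] + c[0::2, 1::2] + c[1::2, 0::2] + c[1::2, 1::2]) / 4

r, g, b = (np.random.rand(16, 16) for _ in range(3))
y, cb, cr = rgb_to_ycbcr(r, g, b)
print(y.shape, subsample_420(cb).shape)   # (16, 16) (8, 8): Y has 4x the samples
```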
DCT, quantization, zig-zag scan, and RLE

MPEG-1 uses the same 8×8 DCT, quantization (Xq[i,j] = Round( X[i,j] / Q[i,j] )), zig-zag scan, and run-length encoding already described for H.261 above. After the DCT, most of the energy (the large coefficient values) is concentrated in the top-left corner of the block (the DC component). The zig-zag scan and RLE are applied only to the AC coefficients; the DC coefficient is coded with DPCM, described in the next section.

Predictive Coding:
Predictive coding is a technique for reducing statistical redundancy: the current value is used to predict the next value, and only their difference (called the prediction error) is coded.
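A minimal DPCM sketch of this idea, as used for the DC coefficients (plain Python, illustrative names):

```python
def dpcm_encode(values):
    """Code each value as its difference from the previous value."""
    errors, prediction = [], 0
    for v in values:
        errors.append(v - prediction)   # the prediction error is transmitted
        prediction = v                  # the next prediction is the current value
    return errors

def dpcm_decode(errors):
    values, prediction = [], 0
    for e in errors:
        prediction += e                 # add the error back to the prediction
        values.append(prediction)
    return values

dc = [168, 170, 171, 169]               # DC coefficients of successive blocks
assert dpcm_decode(dpcm_encode(dc)) == dc
print(dpcm_encode(dc))                  # [168, 2, 1, -2]: small numbers, cheap to code
```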

Example: (figure)

Intra frame coding: the frame is coded on its own, without reference to any other frame (figure).

Inter frame coding: the frame is coded as a prediction from other frames plus the prediction error (figure).

Difference between MPEG-1 and MPEG-2:

1. MPEG-1 targets bit rates around 1.5 Mbps (CD-ROM, Video CD); MPEG-2 targets roughly 4 to 15 Mbps and above (digital television, DVD).
2. MPEG-1 handles only progressive (non-interlaced) video at SIF resolutions; MPEG-2 supports interlaced video and higher resolutions.
3. MPEG-2 adds profiles, levels, and scalability options that MPEG-1 lacks.

Difference between MPEG and H.261 standards:

1. H.261 was designed for low-delay, two-way communication (video conferencing) over ISDN at p × 64 kbps; MPEG was designed for storage and broadcast playback at higher bit rates.
2. H.261 uses only I and P frames; MPEG also uses bidirectionally predicted B frames, which add delay but improve compression.
