
A/V COMPRESSION TECHNIQUES

& COMPRESSION STANDARDS


Journée TELECOMS & MULTIMEDIA. 14 Mai 2010 – ENSA TANGER

Pr. Zouhair GUENNOUN


Ecole Mohammadia d’ingénieurs – EMI
Laboratoire d’Electronique et Communications – LEC
zouhair@emi.ac.ma
Communication Model

[Diagram] Information Source → (data reduction) → Channel → Receiver.
A noise source perturbs the channel, so the received information may differ from the transmitted information.
2
Audio/Video Coding Applications

3
Detailed Communication Model

Transmitter: Information Source → Data Reduction → Source Coding → Encrypt → Channel Coding → Channel
Receiver: Channel → Channel Decoding → Decrypt → Source Decoding → Data Reconstruction → Information Destination
A noise source acts on the channel.
4
Agenda
• Introduction

• Audio & Video compression principles

• A/V Compression standards

• Conclusion

5
Agenda
• Introduction
– Why compressing?
– Audio & Video basics
– MPEGx, & H.26x Compression Standards Overview

• Audio & Video compression principles

• A/V Compression standards

• Conclusion

6
Why compressing?

7
The need for compression
• Audio: Compression needed in spectral domain

• Bit rate of a stereo audio source (CD-DA encoding)


– Sampling frequency: 44.1kHz
– Stereo, 16 bits per sample

[Figure: audio waveform vs. time]

– Bit rate = 44100 * 2 * 16 = 1.41Mbits/sec
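The CD-DA figure follows directly from the sampling parameters; a minimal Python sketch (the function name is illustrative):

```python
def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit rate in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

# CD-DA: 44.1 kHz, 16-bit, stereo
print(pcm_bit_rate(44_100, 16, 2))  # 1411200 bit/s, i.e. ~1.41 Mbit/s
```

The same formula gives the 64 kbps G.711 telephone rate (8,000 samples/s, 8 bits, mono).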

8
Digital Audio

Type                                   Sampling Frequency (kHz)  Bits per Sample  # Channels  Bit Rate (Mbps)
Telephone signal (G.711)               8                         8                1           0.064 (ISDN)
CD-DA (Compact Disc – Digital Audio)   44.1                      16               2           1.411 (CD-ROM 1x)
DAT (Digital Audio Tape)               48                        16               2           1.536

9
The need for compression
• Video: Compression needed in spatial domain

• Bit rate of a video source (CCIR 601 - 50Hz countries)

[Figure: video image, 720 samples per line, 576 lines]

– 25 images per second
– YUV colour coding (Y: luminance – U,V: chrominance)
  • Y: 8 bits per pixel
  • U,V: 1 pixel out of 2 coded, 8 bits per pixel

Bit rate = (576*720)*25*16 = 166Mbits/sec

10
Bit Rate versus Spatial Resolution

Format          Resolution    Bit Rate (Mbps)
SQCIF           128*96        3.69
QCIF            176*144       7.52
CIF             352*288       30.41
4CIF            704*576       162.20
16CIF (4:3)     1408*1152     648.81
16CIF (16:9)    1920*1152     1061.68
12
The need for compression
• Channels available for A/V transmission
– Analog television channel (compatibility)
• Cable (bandwidth = 8MHz)
• Satellite (Bandwidth = 30-40MHz)
Capacity around 40Mbits/sec

– Compact Disc (CD – 650MB)


For 74 min. play time : 1.41Mbits/sec

– Digital Versatile Disc (DVD – 4.7GB)


For 135 min. play time : 4.6Mbits/sec

13
Illustrative example

• PSTN modem - maximum bit rate: 56kbps

• Video frame sequence:


– Resolution: 288x352 (CIF format)
– RGB colors: 8x3 bits per pixel
– Frame rate transmission: 30 frames per second

• Required bit rate: 288x352x8x3x30 = 72.99Mbps

• Ratio between the required bit rate and the maximum available bit
rate: 72.99Mbps/56kbps ≈ 1300
– To accomplish the transmission over PSTN, the data must be compressed by
a factor of at least about 1300.
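The arithmetic above can be reproduced in a few lines (a sketch; the variable names are illustrative):

```python
# Raw CIF RGB video versus a 56 kbps PSTN modem
width, height = 352, 288     # CIF resolution
bits_per_pixel = 8 * 3       # RGB, 8 bits per component
fps = 30
modem_bps = 56_000

required_bps = width * height * bits_per_pixel * fps
ratio = required_bps / modem_bps
print(required_bps)          # 72990720 bit/s, ~72.99 Mbit/s
print(round(ratio))          # 1303, i.e. ~1300:1 compression needed
```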

15
The need for compression

• MPEG-1 target (Video-CD : 74 min. constraints)

  Video: 166 Mbit/s + Audio: 1.4 Mbit/s → Compression → 1.4 Mbit/s

But quality was judged too poor (about VHS quality)

16
The need for compression
• MPEG-2 target
– Program stream (DVD)

  1 program (video, multichannel audio, ...) → Compression → 3-9 Mbit/s (variable bit rate,
  but higher quality than MPEG-1)
  = motivation for the capacity increase of the CD (--> DVD)

– Transport stream (DVB)

  n programs (video, multichannel audio, ...) → Compression → about 40 Mbit/s (constant bit rate)
  (DVB-Satellite & DVB-Cable)
17
The need for compression
• Compression extends the playing time of a given storage
device.

• Compression allows a reduction in bandwidth

• For the same bandwidth, compression allows faster


transmission, and better quality.

• Compression removes redundancy from signals.


– Redundancy is however essential to making data more resistant to
errors.
– Compressed data are more sensitive to errors than uncompressed
data.

18
Principles of Compression
• Compression (or Source Coding) is achieved by
suppressing information:
– redundant information
– irrelevant information

• Suppression of redundant information
  → lossless compression

  Fc(x,y,t) at Rc (bps) → Compression → Ri < Rc → Decompression → Fp(x,y,t) = Fc(x,y,t), Rp = Rc

The original signal and the one obtained after


encoding and decoding are identical

19
Principles of Compression
• Suppression of irrelevant information
  → lossy compression (Perceptive Coding)
Example: bandwidth limitation, masking in audio

  Fc(x,y,t) at Rc (bps) → Compression → Ri < Rc → Decompression → Fp(x,y,t) ≈ Fc(x,y,t), Rp = Rc

The original signal and the one obtained after encoding and decoding
are different but are perceived as identical

20
Principles of Compression
• Lossless vs. lossy data compression
– Lossless methods are bounded by the source entropy H(S)
– Lossy methods are characterized by the rate-distortion function R(D), or D(R)

[Figure: rate vs. distortion - lossless methods operate at zero distortion above H(S); lossy methods trade rate against distortion up to Dmax]

• Probabilistic modeling is at the heart of data compression
– What is P(X) for video source X?
– Is video coding more difficult than image coding?

22
Principles of Compression
• Reversible (lossless): data files (i.e.: V.42bis standard in
modems, zip files)

• Non-reversible (lossy): audio & video signals

• Usually, more compression means lower quality and higher CPU
consumption.

– Compression algorithms also differ in their computational
complexity; at the same bit rate, more complex techniques generally achieve
better quality at the expense of using more CPU.

– Compression algorithms designed for telephony should introduce very little
delay; otherwise interactivity is lost, and echoes and poor sound
quality become problems.

23
Principles of Compression

For a Gaussian source N(0, σ²): D(R) = σ² · 2^(-2R)

[Figure: bit rate vs. distortion curves for a simple and a complex scene - constant bit rate gives varying quality; constant quality requires a varying bit rate]

• Scene more complex → higher bit rate for the same quality
• CBR → variable quality (example: Video CD artefacts)
• Constant quality → VBR necessary (e.g.: DVD-Video)
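The Gaussian rate-distortion bound D(R) = σ²·2^(−2R) implies that each extra bit divides the distortion by four, a 6.02 dB SNR gain; a small sketch to check this numerically:

```python
import math

def gaussian_distortion(sigma2, rate_bits):
    """Rate-distortion bound D(R) = sigma^2 * 2^(-2R) for an N(0, sigma^2) source."""
    return sigma2 * 2 ** (-2 * rate_bits)

d1 = gaussian_distortion(1.0, 1)
d2 = gaussian_distortion(1.0, 2)
print(d1, d2)                        # 0.25 0.0625
print(10 * math.log10(d1 / d2))      # ~6.02 dB per additional bit
```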

24
Principles of Compression
• Constant Bit Rate systems –
CBR (G.711, G.722, G.729) are better suited for
connection-oriented services.

• Variable Bit Rate systems –


VBR (MPEG, G.723.1) are best suited to networks
without constant bit rate reserve.

– MPEG compression is the most efficient and gives the best
quality, but it consumes much CPU and introduces so much
delay that it cannot be used in interactive applications (video
conferencing or telephony).

26
Principles of Compression

• Video codec key issues:


– Compression efficiency and image quality
– Computational complexity
– Frame rate

Encoder Channel Decoder

28
Principles of Compression

• General-purpose compression: Entropy encoding


– Remove statistical redundancy from data
– E.g. encode common values with short codes, uncommon
values with longer codes

– Good for text files, poor for images/video

Source Data → Entropy Encoder → Channel → Entropy Decoder → Decoded Data
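A classic entropy coder of this kind is a Huffman code; a compact sketch using Python's heapq, purely to illustrate "short codes for common values" (not part of the presentation):

```python
import heapq
from collections import Counter

def huffman_codes(data):
    """Entropy coding sketch: frequent symbols receive shorter codewords."""
    freq = Counter(data)
    if len(freq) == 1:                    # degenerate single-symbol case
        return {next(iter(freq)): "0"}
    heap = [[n, [sym, ""]] for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)          # two lightest subtrees
        hi = heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]       # extend codewords by one bit
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return {sym: code for sym, code in heap[0][1:]}

codes = huffman_codes("aaaabbc")
# The most common symbol 'a' gets a 1-bit code; 'b' and 'c' get 2 bits
print(sorted((s, len(c)) for s, c in codes.items()))  # [('a', 1), ('b', 2), ('c', 2)]
```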

29
Principles of Compression

• Add a model that attempts to represent the image/video


signal in a form that can be easily compressed by the entropy
encoder
• Model exploits the subjective redundancy of images and
video (Spatial, Temporal, Chromatic redundancies)
• Decoded image may not be identical to original image
• Image properties that are useful for compression:
– Many of the pixels of a typical photographic image contain little or no
« useful » detail (e.g. flat area)
– The eye is insensitive to « high frequency » image information

Image Model → Entropy Encoder → Channel → Entropy Decoder → Image Model

30
Principles of Compression
• Trade-off Complexity/Quality/Bit Rate

• A new technique may result in a new trade-off

[Figure: quality vs. bit rate, with complexity as a third dimension - speech coding, MPEG Layer 1, Layer 2, Layer 3, MPEG AAC and other techniques each occupy a different point of the trade-off]
32
Principles of Compression

Redundancies

• Statistical Redundancy
  – Interpixel Redundancy
    • Spatial (intraframe) Redundancy
    • Temporal (interframe) Redundancy
  – Coding Redundancy
    • Variable-Length Coding: Huffman, Arithmetic, Run Length Coding, …

• Psychological Redundancy (HVS)
  – Luminance (Contrast) Masking
  – Texture Masking
  – Color Masking
  – Frequency Masking
  – Temporal Masking
33
Quality Measurements
• Objective
– Mean Square Error (MSE)
– Peak Signal-to-Noise-Ratio (PSNR)
– Measure the fidelity to original video

• Subjective
– Human Vision System (HVS) based
– Emphasize audiovisual quality rather than fidelity

34
Quality Measurements

• Signal distortion is not a good measure of the performance of
a lossy compression method
→ another method is necessary: the MOS scale (Mean Opinion
Score)

• The five-grade CCIR impairment scale (Rec.562)


– 1 – unsatisfactory (Very annoying),
– 2 – poor (Annoying),
– 3 – satisfactory (Slightly annoying),
– 4 – good (Perceptible but not annoying),
– 5 – Excellent (Imperceptible)

• Example: Double blind test

36
Quality Measurements
Speech Coding - Compression vs quality

[Figure: bit rate (Kb/s) vs. MOS for standard speech codecs - 64 PCM (G.711); 32 ADPCM (G.726); 24 ADPCM (G.725); 16 ADPCM (G.726) and LDCELP (G.728); 8 CS-ACELP (G.729/G.729a); 6.4 MP-MLQ (G.723.1); 4.8 LPC. The low-rate codecs require special hardware (DSP).]

Standard               MOS
G.711   (64 Kb/s)      4.10
G.729   ( 8 Kb/s)      3.92
G.726   (32 Kb/s)      3.85
G.729a  ( 8 Kb/s)      3.70
G.723.1 (5.3 Kb/s)     3.65
G.728   (16 Kb/s)      3.61

38
Audio & Video Basics

39
Audio Basics
• Analog signal sampled at constant rate
  – telephone: 8,000 samples/sec
  – CD music: 44,100 samples/sec
• Each sample quantized, i.e., rounded
  – e.g., 2^8 = 256 possible quantized values
• Each quantized value represented by bits
  – 8 bits for 256 values
  – 16 bits for 65536 values
• Mono, stereo, or surround?
  – 1, 2 or more channels
• Example: 8,000 mono samples/sec, 256 quantized values --> 64kbps
• Receiver converts it back to an analog signal:
  – some quality reduction

Example rates
• CD: 1.411Mbps
• MP3: 96, 128, 160kbps
• Internet telephony: 5.3 - 13kbps (G.723.1, G.729, and GSM – Global System for Mobile communication)

40
Audio Basics:
Speech Coding and compression
• 5 quality ranges (human ear sensitivity: 20Hz to 20kHz):

Range               Frequency Bandwidth   Quality and Applications
Telephone channel   300Hz – 3.4kHz        intelligible but noisy speech, limited naturalness
Expanded bandwidth  50Hz – 7kHz           natural-sounding speech
Hi.Fi. bandwidth    20Hz – 15kHz          excellent speech and music
Stereo bandwidth    20Hz – 20kHz          CD quality
Stereo bandwidth    20Hz – 48kHz          perfect quality: studio, cinema, DVD

43
Video Basics
• Operation of analogue television: The image captured by the camera lens
is converted into three monochrome images obtained by applying filters of
the three fundamental (primary) colors –
R (Red), G (Green), B (Blue).

– All colors are produced by using different proportions of these primary
colors
• Additive Color Mixing on a black surface
• Subtractive Color Mixing on a white surface

– The correct combination of the three monochrome images can reconstruct


the original image.

– RGB signals thus obtained are available in some cameras, though it is unusual
to work with them

44
Video Basics: Digital Video & Pixels
• Digital video is a sequence of frames, each consisting
of a rectangular grid of picture elements or pixels.

– For purely black-and-white video, each pixel is


represented as a single bit, 0 for black or 1 for white.

– For grey-scale video, 8 bits per pixel can be used to


represent 256 levels of grey … good enough for most
cases.

– For good colour video, 8 bits are used per pixel for each of
the RGB colours, resulting in 24 bits per pixel.

46
Video Basics : Digital Video & Pixels

[Figure: image acquisition by a digital camera, by film, and by the eye]

Source: Digital Image Processing – Gonzalez, Woods. Prentice Hall

47
Video Basics: Sampling & Quantization

Sampling & Quantization

Source: Digital Image Processing – Gonzalez, Woods. Prentice Hall

48
Video Basics: Scanning
• When an image (frame) appears on the retina of the human
eye, the image is retained for several milliseconds before
decaying.

• Consequently, if a sequence of images is displayed at the


appropriate rate, the eye does not notice that it is looking at
discrete images.
– This is how you get smooth motion in videos!

• What that rate is depends on the eye in question and how


the images are displayed.

49
Video Basics: Scanning

51
Spatial and Temporal Sampling of a Video Sequence

Source: H.264 and MPEG-4 Video Compression. Video Coding for next generation multimedia. I.E.G. Richardson. John Wiley & Sons, Ltd. 2003. Chapter 2.

53
Video Basics: Color Format
• RGB is not efficient since it uses equal bandwidth for each
color component.

• R,G,B components are correlated


– Transmitting R,G,B components separately is redundant
– More efficient use of bandwidth is desired

• To store or transmit video signals (a sequence of images –

frames at a constant rate), RGB signals are transformed into
three linear combinations of these signals.

55
Video Basics: Color Format
• The combination is performed such that:
– One of the new signals, Y, collects all the brightness information of the
image; this signal is called luminance.
– The other two signals, called U and V, correspond to different combinations of
the three original signals, chosen so that they capture all the colour information;
these two signals are generically referred to as chrominance.

• Various formulae have been devised to convert RGB values to


chrominance and luminance values, depending on the format: YUV, YIQ,
YCbCr, …

• Consider switching from RGB to YUV as a change of a coordinate system to


one that maintains the same number of degrees of freedom but can solve
the problem more easily.

• For backward compatibility, colour signals had to be receivable and


watchable on a black-and-white set.

56
Color Formats Conversion
Cr R Y
Y kr R k g G kb B Cg G Y Cr Cg Cb cste
Cb B Y
• kr, kg, kb are weighting factors

1 kr
Y kr R 1 kr kb G kb B R Y Cr
0.5
0.5 2k r 1 k r 2k b 1 k b
Cr R Y G Y Cr Cb
1 kr 1 k r kb 1 k r kb
0.5 1 kb
Cb B Y B Y Cb
1 kb 0.5
58
Color Formats Conversion

• ITU-R recommendation BT.601 defines
kr = 0.299 and kb = 0.114, giving:

Y  = 0.299·R + 0.587·G + 0.114·B
Cr = 0.713·(R – Y)
Cb = 0.564·(B – Y)

R = Y + 1.402·Cr
G = Y – 0.714·Cr – 0.344·Cb
B = Y + 1.772·Cb
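The BT.601 relations round-trip exactly; a small sketch in analog (0..1) form, deriving the constants from kr and kb rather than hard-coding them:

```python
KR, KB = 0.299, 0.114   # ITU-R BT.601 weighting factors

def rgb_to_ycbcr(r, g, b):
    """Forward conversion per BT.601 (analog form, components in 0..1)."""
    y = KR * r + (1 - KR - KB) * g + KB * b
    cb = 0.5 / (1 - KB) * (b - y)   # 0.5/(1-0.114) = 0.564
    cr = 0.5 / (1 - KR) * (r - y)   # 0.5/(1-0.299) = 0.713
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    r = y + (1 - KR) / 0.5 * cr     # (1-0.299)/0.5 = 1.402
    b = y + (1 - KB) / 0.5 * cb     # (1-0.114)/0.5 = 1.772
    g = (y - KR * r - KB * b) / (1 - KR - KB)
    return r, g, b

y, cb, cr = rgb_to_ycbcr(1.0, 0.5, 0.25)
r, g, b = ycbcr_to_rgb(y, cb, cr)
print(round(r, 6), round(g, 6), round(b, 6))  # 1.0 0.5 0.25
```

Note that a grey input (R = G = B) gives Cb = Cr = 0, which is what makes the signal backward compatible with black-and-white receivers.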

59
Video Basics: Color Format

http://www.yorku.ca/eye/photopik.htm
61
Video Basics: Color Format

• Human eye is more sensitive to the luminance


(brightness) component than the color component: the
latter need not be transmitted as accurately.

– The luminance is broadcast at the same frequency as a black-


and-white signal, and the chrominance is ignored on black-
and-white sets.

– The two chrominance signals are broadcast in narrow bands


at higher frequencies.
• Called hue and saturation or tint and colour
62
Video Basics:
Chrominance Downsampling
• The reduced resolution in the chroma components is called
downsampling (subsampling).

• The subsampling is based on the human eye being less sensitive to

chrominance.

• (Y, Cr, Cb) may use different resolutions 4:n:m: The numbers
indicate the relative sampling rate of each component in the
horizontal direction.

63
Video Basics:
Chrominance Downsampling
• 4:4:4 sampling: the three components
have the same resolution (3n bits per
pixel)
– a sample of each component exists at
every pixel position.
– Preservation of the full fidelity of the
chrominance components.

• 4:2:2 sampling: Cb and Cr have the


same vertical resolution as Y, but half
the horizontal resolution
(2n bits per pixel).
– 4:2:2 video is used for high-quality
color reproduction.

64
Video Basics:
Chrominance Downsampling
• 4:1:1 sampling: Cb and Cr have the
same vertical resolution as Y, but a
quarter of the horizontal resolution (1.5n
bits per pixel).

• 4:2:0 sampling: Cb and Cr each have


half the horizontal and vertical
resolution of Y (1.5n bits per pixel).
– 4:2:0 video requires exactly half as
many samples as 4:4:4 video
– 4:2:0 is widely used for consumer
applications such as video
conferencing.
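The bits-per-pixel figures quoted for these patterns can be derived from the sampling region; a sketch (the helper name is illustrative, and the J-wide, two-row region is the usual reading of J:a:b notation, with a chroma samples in the first row and b in the second):

```python
def bits_per_pixel(j, a, b, bit_depth=8):
    """Average bits per pixel for a J:a:b chroma subsampling pattern.

    A J-wide, 2-row sampling region holds 2*J luma samples and
    (a + b) samples for each of the two chroma components.
    """
    luma = 2 * j
    chroma = 2 * (a + b)               # Cb and Cr together
    return (luma + chroma) * bit_depth / (2 * j)

print(bits_per_pixel(4, 4, 4))   # 24.0 -> 3n bits/pixel with n = 8
print(bits_per_pixel(4, 2, 2))   # 16.0 -> 2n
print(bits_per_pixel(4, 2, 0))   # 12.0 -> 1.5n
print(bits_per_pixel(4, 1, 1))   # 12.0 -> 1.5n
```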

65
Video Basics: Spatial Resolution Formats
• CIF: Common Interchange (Intermediate) Format - Intermediate format used
in videoconferencing (communication between US & Europe)

– Luma resolution: 352x288 (360x288) pixels

– Frame rate: 30Hz (30 frames/second - fps),
non-interlaced, chroma sampling 4:2:0

• QCIF:176x144 pixels, 30fps (Quarter CIF) –


used in Video Telephony applications

• SQCIF: 128x96 pixels, 30fps (Sub QCIF),


mobile multimedia applications

• 4CIF: 704x576 pixels, 30fps, appropriate for


standard-definition television and DVD-video

• 16CIF: 1408x1152 pixels, 50fps


70
Spatial Resolution Formats

[Figure: nested picture sizes - SQCIF, QCIF, CIF, SCIF (4CIF), 16CIF 4:3, 16CIF 16:9]
71
Video Basics: Spatial Resolution Formats

• SIF: Simple Input Format (Source Intermediate Format) - Half the

vertical & horizontal resolution of CCIR-601, with 4:2:0 sampling. Used in Video Cassette
Recorders (VCRs)

– 360x240 (352x240) pixels, 30 frames/second for NTSC,

sampling rate 4:2:0

– 360x288 (352x288) pixels, 25 frames/second for PAL, SECAM, sampling


rate 4:2:0

• CCIR-601 (ITU-R 601 or BT 601)


– 720x525 pixels, 30 frames/second, sampling rate 4:4:4 & 4:2:2
– 720x625 pixels, 25 frames/second, sampling rate 4:4:4 & 4:2:2

72
MPEG, what is it?

76
International Organizations
•ISO (1947): International Organization for Standardization;

•IEC (1906): International Electrotechnical Commission,

•ISO/IEC JTC 1 (1987): Joint Technical Committee 1 of the ISO and the
IEC. It deals with all matters of information technology.

•ITU-T: Telecommunication Standardization Sector, which coordinates

standards for telecommunications on behalf of the International
Telecommunication Union (ITU-T created in 1993; formerly the CCITT, founded in 1956).

77
International Organizations (Cont’d)
• JPEG - ITU-T T.81, ISO/IEC IS 10918-1: Joint Photographic Experts Group, one of
two sub-groups of ISO/IEC Joint Technical Committee 1, Subcommittee 29,
Working Group 1 (ISO/IEC JTC 1/SC 29/WG 1), titled Coding of still pictures.

• MPEG: Moving Picture Experts Group (ISO/IEC JTC 1/SC 29/WG 11) - a working
group of ISO/IEC in charge of the development of standards for coded
representation of digital audio and video and related data.

• ITU-T SG15 : H26x – Videophone & Videoconference standards

• JVT: Joint Video Team - a group of video coding experts from ITU-T Study Group
16 (VCEG) and ISO/IEC JTC 1 SC 29 / WG 11 (MPEG), created to develop an
advanced video coding specification.
•Formed in 2001, the JVT’s main result has been ITU-T Rec. H.264 | ISO/IEC 14496-10,
commonly referred to as H.264/MPEG-4-AVC, H.264/AVC, or MPEG-4 Part 10 AVC.

78
MPEG: Moving Picture Experts Group
• Moving Picture Expert Group established in 1988 for the
development of digital video
– Still active (MPEG-21 is currently in development)

• International standard (ISO/IEC)


Interoperability & economy of scale

• Compression of audio and video and multiplexing in a single


stream

• Definition of the interface not of the codecs


room for improvement

79
MPEG: Moving Picture Experts Group

• Official home page of the Moving Picture Experts


Group (MPEG):
www.chiariglione.org/mpeg/

• In charge of the development of standards for coded


representation of digital audio and video and related
data.

• The group produces standards that help the industry


offer end users an ever more enjoyable digital media
experience.

80
List of MPEG standards
• MPEG-1 (ISO 11172)
The standard on which such products as Video CD and MP3 are based
(approved in Nov. 1992)

– Video-oriented CD-ROM, SIF format (video progressive)

– Objective: VHS quality. Typical bit rate 1.5Mb/s

– Useful for tele-education, enterprise applications, business, etc.

81
List of MPEG standards (Cont’d)
• MPEG-2 (ISO 13818)
The standard on which such products as Digital Television set top boxes and DVD
are based (approved in 1994, 1996);
– 'Upward'-compatible extension of MPEG-1
– Broadcast-oriented (interlaced video)
– Multiple resolutions standardized, from SIF (compatible with MPEG-1) up to
high-definition formats for DVDs and so on.
– Intended for studio-quality audio and video. Broadcast quality HDTV also.
– Various bit rates 4-100Mb/s.(CBR & VBR)
– Useful for all types of applications (business, entertainment, etc.).

• MPEG-3: Originally designed for HDTV, finally covered by a reparameterization of

MPEG-2.

82
List of MPEG standards (cont’d)
• MPEG-4 (ISO 14496)
The standard for multimedia for the fixed and mobile web (Version 1 -
approved in Oct. 1998, Version 2 - approved in Dec. 1999, Versions 3, 4, 5)
– Computer Graphics Applications;

– Originally intended to similar applications as H.263, but expanded to cover a


wider range of multimedia applications.

– 'Downward' extension of MPEG-1, oriented toward Internet video

– Useful in the range 28.8-500Kb/s. New compression algorithms. Typically less

than 1 Mbps, but could be as high as tens of Mbps.

83
List of MPEG standards (cont’d)
• MPEG-4 (ISO 14496) …

– Coding of Audiovisual Objects - Standard for audio, video and graphics in


interactive 2D and 3D multimedia communication - MPEG-4 v.2 & 3

– Supports scene composition and content-based functionalities, in which


scenes are expressed in terms of multiple audio-visual objects (AVOs) that can
be manipulated together or individually.

– Supports layering/scaling: multiple versions of AVOs can be provided and


matched against needs and available resources.
• For example, a base level AVO can be provided to give the bare essentials, with
multiple optional AVOs that provide levels of enhancement details.
• If we don’t have enough network resources, drop the enhancements and stick with
the basics!

84
List of MPEG standards (cont’d)

• MPEG-7 (ISO 15938) The standard for description and search of audio and
visual content (approved in Jul. 2001);

– Audiovisual content description (indexing, searching, databases, etc.)..


Interprets semantics of audiovisual information

– More to do with structuring, describing, and searching through

multimedia content

• MPEG-21 (21000) The Multimedia Framework.

– Focus on multimedia distribution and on DRM aspects;

85
List of MPEG standards (cont’d)
• MPEG-A (23000) – Application-specific formats, integrating multiple MPEG technologies

• MPEG-B (23001) – Systems specific standards

• MPEG-C (23002) – Video specific standards

• MPEG-D (23003) – Audio specific standards

• MPEG-E (23004) – MPEG multimedia Middleware - support to download and


execution of multimedia applications

• MPEG-V (23005) – Context and media control - interchange with virtual worlds

• MPEG-M (23006) – MPEG extensible Middleware - packaging and reusability


of MPEG technologies

• MPEG-U (23007) – MPEG Rich Media User Interface

86
List of ITU-T Standards
• H.261 (1983-1990)
– A standard for video telephony and video conferencing
over the PSTN (Public Switched Telephone Network) and wireless
networks.
– Uses either the CIF or QCIF format.
– Uses p x 64kbps where p can be between 1 and 30.
– Originally designed for ISDN usage (Integrated Services Digital
Network).
– Still in use
• Low complexity, low latency
• Mostly as a backward-compatibility feature
• Overtaken by H.263

87
List of ITU-T Standards (cont’d)
• H.263, H.263+, H.263++ (1993-1999)
– Based on H.261 but offers significant improvements in
coding efficiency; employs advanced coding options and
lower resolutions to preserve quality over lower bit rate
channels.
– Uses either the QCIF or S-QCIF formats.
– Uses less than 64kbps.
– PSTN and mobile network: 10 to 24kbps
– Adopted by several videophone terminal standards:
H.324 (PSTN), H.320 (ISDN), H.310 (B-ISDN)

• H.264/AVC (1999-2003)
– Double the coding efficiency in comparison to any other
existing video coding standard
88
Chronological Table of Video Coding Standards

ITU-T VCEG:    H.261 (1990) → H.263 (1995/96) → H.263+ (1997/98) → H.263++ (2000)
Joint:         MPEG-2 / H.262 (1994/95) → H.264 / MPEG-4 Part 10 (2002)
ISO/IEC MPEG:  MPEG-1 (1991) → MPEG-4 v1 (1998/99) → MPEG-4 v2 (1999/00) → MPEG-4 v3 (2001)

(timeline: 1990 – 2003)


92
Agenda
• Introduction

• Audio & Video compression principles


– Audio compression
– Video compression
– Audio/Video synchronisation

• A/V Compression standards

• Conclusion

94
Audio Compression principles

95
Speech Coding and Compression

• Waveform coding (PCM, DPCM, ADPCM)


– Samples coding (G.711, G.721, G.722, G.723,
G.725, G.726, …)

• Source Coding
– Speech modeling and transmission of the model
parameters (G.728, G.729, …)

• Hybrid Coding
96
Audio compression

• By identifying what can and, more importantly, what

cannot be heard, the schemes described here obtain
much of their compression by discarding
information that cannot be perceived.

• Over the course of our evolutionary history we have


developed limitations on what we can hear.
– Some of these limitations are physiological, based on the
machinery of hearing.
– Others are psychological, based on how our brain
processes auditory stimuli.

98
Audio Compression

• Sub-band Coding
– Techniques used in Layer I and II of MPEG audio are based
on sub-band coding.

• Transform Coding
– DCT is used in Layer III of MPEG audio.

• Predictive Coding
– Frequency prediction is used in AC-3 and MPEG AAC.

100
Common Audio Formats and Standards

• Pulse Code Modulation (PCM)


– Differential Pulse Code Modulation (DPCM)
– Adaptive Differential Pulse Code Modulation (ADPCM)

• Compact Disc Digital Audio (CD-DA)

• MPEG Audio
– Layer I
– Layer II
– Layer III

104
Audio compression

• Based on psycho-acoustics
– Compress the bit rate without affecting the quality perceived
by the human ear (based on the imperfections of human hearing)
– Removal of irrelevancies

• 4 main principles:
– Threshold of audibility
– Frequency masking
– Critical bands
– Temporal masking
112
Audio compression
• Principle 1: Threshold of audibility
Not all frequency components need to be encoded with the
same resolution: Nr_bits(f) = (signal/threshold)_dB / 6

http://www.audiodesignline.com
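Principle 1 allocates roughly one bit per 6 dB of signal-to-threshold ratio; a sketch of that rule (the function name is illustrative):

```python
import math

def bits_needed(signal_power, threshold_power):
    """Bits needed for a band: about one bit per 6 dB of signal-to-threshold ratio."""
    ratio_db = 10 * math.log10(signal_power / threshold_power)
    return max(0, math.ceil(ratio_db / 6))

print(bits_needed(10 ** 1.5, 1.0))  # component 15 dB above threshold -> 3 bits
print(bits_needed(0.5, 1.0))        # below the audibility threshold -> 0 bits (discard)
```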
113
Audio compression
• Principle 2: Frequency masking
Analysis of the incoming signal

http://www.audiodesignline.com
114
Audio compression
• Principle 3: Critical bands
– The human ear may be modelled as a collection of narrow band filters
– Bandwidth of these filters = critical band
– Critical band: <100 Hz for the lowest audible frequencies,
up to ~4 kHz for the highest audible frequencies
– The human ear cannot distinguish between two sounds having different
frequencies within one critical band.
Example: when we hear 50 & 60 Hz at the same time we cannot distinguish
them.
– Consequence:
The noise masking threshold depends solely on the signal energy within a limited
bandwidth domain.
The loudest sound is taken as the representative of the critical band.
Necessity to analyse the signal at 100Hz resolution at low frequencies

115
Audio compression
• Principle 4: Temporal masking
Masking also occurs when a sound raises the audibility
threshold for a brief interval preceding and following it;
this guides the selection of the frame duration for frequency analysis
and encoding.

http://www.audiodesignline.com
116
The MPEG encoder

http://www.audiodesignline.com
117
Audio features in MPEG

• MPEG1 :
– Mono/stereo/dual/joint stereo (optional Dolby Surround)
– Sampling frequencies : 32, 44.1 & 48 kHz
– 3 layers : trade-off complexity/delay versus coding
efficiency of compression
– Various bit rate : trade-off quality versus bit rate

• MPEG2 :
– 5.1 channels
– Sampling frequencies extended to 16, 22.05 & 24 kHz

122
Layer I coding
• The Layer I coding scheme provides a 4:1 compression.

• In Layer I coding the time frequency mapping is accomplished


using a bank of 32 subband filters.

• The output of the subband filters is critically sampled. That is,


the output of each filter is down-sampled by 32.

• The samples are divided into groups of 12 samples each.


– Twelve samples from each of the 32 subband filters, or a total of 384
samples, make up one frame of the Layer I coder.

123
Layer II Coding
• The Layer II coder provides a higher compression rate by
making some relatively minor modifications to the Layer I
coding scheme.

• The compression ratio in Layer II coding can be increased from


4:1 to 8:1 or 6:1.

• These modifications include:


– how the samples are grouped together,
– the representation of the scale factors, and
– the quantization strategy.

130
Layer III Coding - MP3
• One of the problems with the Layer I and Layer II coding
schemes was that with the 32-band decomposition, the
bandwidth of the subbands at lower frequencies is
significantly larger than the critical bands.

• This makes it difficult to make an accurate judgment of the


mask-to-signal ratio.
– If we get a high amplitude tone within a subband and if the subband
was narrow enough, we could assume that it masked other tones in
the band.
– However, if the bandwidth of the subband is significantly higher than
the critical bandwidth at that frequency, it becomes more difficult to
determine whether other tones in the subband will be masked.

131
Layer III Coding - MP3
• Layer III offers almost CD quality with less than 2 bits/sample (enables
transferring music files via Internet over 28.8kbps modems)

• A simple way to increase the spectral resolution would be to decompose


the signal directly into a higher number of bands.

• However, one of the requirements on the Layer III algorithm is that it be


backward compatible with Layer I and Layer II coders.

• To satisfy this backward compatibility requirement, the spectral


decomposition in the Layer III algorithm is performed in two stages.

132
Layer III Coding - MP3
• First the 32-band subband decomposition used in Layer I and
Layer II is employed.

• The output of each subband is transformed using a modified


discrete cosine transform (MDCT) with a 50% overlap.

• The Layer III algorithm specifies two sizes for the MDCT, 6 or
18. This means that the output of each subband can be
decomposed into 18 frequency coefficients or 6 frequency
coefficients.

133
Advanced Audio Coding

• AAC (Advanced Audio Coding): an audio compression

format defined by the MPEG-2 standard.

• AAC was first known as NBC (Non-Backward-Compatible),

as it is not compatible with the MPEG-1 audio formats.

134
Advanced Audio Coding

• AAC can handle more channels than MP2 or

MP3 (48 full audio channels and 16 low-
frequency enhancement channels, compared to 5 full audio
channels and 1 low-frequency enhancement channel for
MP2 or MP3),

• AAC can handle higher sampling frequencies

than MP3 (up to 96kHz compared to 48kHz).

135
Video Compression principles

136
Video Compression
• Two applied techniques for video compression:

– Spatial or intraframe compression: removal of intra-


picture redundancy in the image of each frame as in JPEG
images

– Temporal or interframe compression: removal of inter-


picture redundancy (between consecutive frames.) Coding
of difference with an interpolated picture (moving
vectors).

137
Video compression
• Result
– 4:2:0 SIF resolution: 30 Mbps
(= 25 images/sec * 288 lines * 352 pixels * 1.5 (lum & chrom) * 8 bits)
→ ±1.2 Mbps (CBR) on Video CD (MPEG-1)

– 4:2:2 CCIR 601 resolution: 166 Mbps
(= 25 images/sec * 576 lines * 720 pixels * 2 (lum & chrom) * 8 bits)
→ ±3-4 Mbps (mean) in MPEG-2

138
Image Codec (e.g. JPEG)

Encoder: Block → DCT → Quantize → Zigzag → RLE → VLC → Transmit/Store
Decoder: VLD → RLD → IZigzag → IQuantize → IDCT → IBlock

(Block/DCT/Quantize form the image model; Zigzag/RLE/VLC form the entropy coder)

• Process the data in blocks (sub-images) of 8x8 samples

• Convert Red-Green-Blue into luminance (grayscale) and

chrominance (blue color difference and red color difference)

• Use half resolution for chrominance (because eye is more


sensitive to grayscale than to color)

• Each block contains redundant information.

140
Discrete Cosine Transform
DCT

• DCT transformation (in frequency domain)


decorrelates the input signal.

• Transform each block of 8x8 samples into a block of


8x8 spatial frequency coefficients.

• Most image blocks only contain a few significant


coefficients (usually the lowest “frequencies”)
– Energy tends to be concentrated into a few significant
coefficients (most energy in low spatial frequencies)
– Other coefficients are close to zero / insignificant

141
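The decorrelation and energy-compaction properties above can be made concrete with a minimal pure-Python DCT-II for an 8×8 block (a direct transcription of the textbook formula, not an optimized codec routine; the function names are ours):

```python
import math

def dct2(block):
    """Naive 2-D DCT-II of an 8x8 block (list of 8 lists of 8 values)."""
    N = 8
    def c(k):  # orthonormal scaling factor
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

# A flat (constant) block: all energy ends up in the DC coefficient.
flat = [[100] * 8 for _ in range(8)]
coeffs = dct2(flat)
```

For the flat block, every coefficient except the DC term comes out numerically zero, which is exactly the concentration of energy in the lowest spatial frequency described on the slide.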
Discrete Cosine Transform
• Any 8x8 block of pixels can be
represented as a sum of 64 basis
patterns (black and white patterns)

• Output of the DCT is the set of


weights for these basis patterns (The
DCT coefficients)
– Multiply each basis pattern by its weight
and add them together
– Result is the original image block

142
Quantize and zig-zag scanning
Quantize Zigzag

• Divide each DCT coefficient by an integer, discard


remainder

• High spatial frequencies are quantized with lower
resolution than low ones (removes irrelevancy)
- Result: loss of precision. Typically, only a few non-zero
coefficients are left

• Scan quantized coefficients in a zig-zag order: Non-


zero coefficients tend to be grouped together

143
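The two steps can be sketched as follows (illustrative Python; a real codec uses a per-frequency step table and a fixed zig-zag table rather than a single step and a generated order):

```python
def quantize(coeffs, step):
    """Divide each coefficient by an integer step, discarding the remainder
    (truncation toward zero, as in the slide's description)."""
    return [[int(c / step) for c in row] for row in coeffs]

def zigzag_indices(n=8):
    """Zig-zag scan order for an n x n block, built from anti-diagonals."""
    order = []
    for d in range(2 * n - 1):
        diag = [(i, d - i) for i in range(n) if 0 <= d - i < n]
        if d % 2 == 0:
            diag.reverse()  # even diagonals run bottom-left to top-right
        order.extend(diag)
    return order

def zigzag(block):
    return [block[i][j] for i, j in zigzag_indices(len(block))]

# Example: a block whose only significant coefficients sit in the top-left corner.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[1][0] = 90, 17, -9
scanned = zigzag(quantize(block, 8))
```

After quantization and the zig-zag scan, the non-zero values are grouped at the front of the list and the long tail is all zeros, which is what makes the run-length stage on the next slides effective.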
Video compression
• Spatial redundancy reduction (DCT example)

  Source block:                       DCT coefficients:
  139 144 149 153 155 155 155 155     1260   -1  -12   -5    2   -2   -3    1
  144 151 153 156 159 156 156 156      -23  -17   -6   -3   -3    0    0   -1
  150 155 160 163 158 156 156 156      -11   -9   -2    2    0   -1   -1    0
  159 161 162 160 160 159 159 159       -7   -2    0    1    1    0    0    0
  159 160 161 162 162 155 155 155       -1   -1    1    2    0   -1    1    1
  161 161 161 161 160 157 157 157        2    0    2    0   -1    1    1   -1
  162 162 161 163 162 157 157 157       -1    0    0   -1    0    2    1   -1
  162 162 161 161 163 158 158 158       -3    2   -4   -2    2    1   -1    0

  Quantisation:
  158  0 -1  0  0  0  0  0
   -1 -1  0  0  0  0  0  0
   -1  0  0  0  0  0  0  0
    0  0  0  0  0  0  0  0
    0  0  0  0  0  0  0  0
    0  0  0  0  0  0  0  0
    0  0  0  0  0  0  0  0
    0  0  0  0  0  0  0  0

  Zig-zag scan: 158 0 -1 -1 -1 -1 EOB

144
Run-Length Encoding
RLE

• Encode each coefficient value as a (run, level) pair:


– Run = number of zeros preceding value
– Level = non-zero value

• Usually, the block data is reduced to a short sequence of (run,


level) pairs
– This is now easy to compress using an entropy encoder

145
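The (run, level) pairing can be sketched like this (the end-of-block handling is simplified to a sentinel string; this is an illustration, not the standard's bit-level syntax):

```python
def rle_encode(coeffs):
    """Encode a zig-zag-scanned coefficient list as (run, level) pairs,
    where run = number of zeros preceding each non-zero level."""
    pairs, run = [], 0
    for c in coeffs:
        if c == 0:
            run += 1
        else:
            pairs.append((run, c))
            run = 0
    pairs.append("EOB")  # trailing zeros collapse into an end-of-block marker
    return pairs

# Scanned coefficients from the earlier DCT example: 158 0 -1 -1 -1 -1, then zeros.
scan = [158, 0, -1, -1, -1, -1] + [0] * 58
encoded = rle_encode(scan)
```

The 64-coefficient block collapses to five (run, level) pairs plus the EOB marker, which is the short sequence the entropy encoder then compresses.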
Variable-Length Encoding
VLC

• Encode each (run, level) pair using a variable-length code


– Frequently occurring groups – assign a short code
– Infrequently occurring groups – assign a long code

• Result: compressed version of the image

146
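JPEG and MPEG use predefined code tables; building a Huffman tree per stream, as in the toy coder below, is only a way to illustrate the short-code-for-frequent-groups principle (a sketch; the stream contents are made up):

```python
import heapq
from collections import Counter

def huffman_codes(symbols):
    """Build a Huffman code table {symbol: bitstring} from a symbol stream."""
    freq = Counter(symbols)
    # Heap entries: (frequency, tiebreak, {symbol: code-so-far}).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    n = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        n += 1
        heapq.heappush(heap, (f1 + f2, n, merged))
    return heap[0][2]

# A hypothetical stream of (run, level) pairs: (0, 1) is by far the most frequent.
stream = [(0, 1)] * 8 + [(1, -1)] * 4 + [(2, 3)] * 2 + [(0, 158)] * 1
codes = huffman_codes(stream)
```

The frequent pair (0, 1) receives the shortest code and the rare pair (0, 158) the longest, and no code is a prefix of another, so the bitstream can be decoded unambiguously.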
Image decoding
• Reverse the stages to recover the image

• Information was thrown away during quantization


– Decoded image will not be identical to the original

• In general: more compression = more quality loss

• Too much compression:


– Block edges start to show (“blockiness”)
– High-frequency patterns start to appear (“mosquito noise”)

147
Video coding
• Moving images contain significant temporal redundancy
– Successive frames are very similar

• Add an extra “motion model” at the “front end” of the image


encoder

• The amount of data to be coded can be reduced significantly


if the previous frame is subtracted from the current frame.

148
Video Encoder
• Video frames
Motion Model

Motion
DCT Quantize Zigzag RLE VLC Buffer
Comp.
Motion
Vectors
Headers
Motion
Estim.
Motion
Vectors

Recon. IDCT Rescale


Video Decoder

Buffer VLD RLD IZigzag Rescale IDCT Recon.

Headers
Motion Estimation
• Process 16x16 luminance samples at a time (“macroblock”)

• Compare with neighboring area in previous frame

• Find closest matching area


– Prediction reference

• Calculate offset between current macroblock and prediction


reference area
– Motion vector

151
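A minimal full-search block-matching sketch of the steps above, using the sum of absolute differences (SAD) as the matching criterion (a common choice; the standards leave the search method to the encoder, and the toy frames below are ours):

```python
def sad(cur, ref, cx, cy, rx, ry, bs):
    """Sum of absolute differences between a bs x bs block of `cur` at
    (cx, cy) and a block of `ref` at (rx, ry)."""
    return sum(abs(cur[cy + y][cx + x] - ref[ry + y][rx + x])
               for y in range(bs) for x in range(bs))

def motion_estimate(cur, ref, cx, cy, bs=4, search=2):
    """Full search over [-search, search]: return the (dx, dy) of the best match."""
    best, best_mv = None, (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = cx + dx, cy + dy
            if 0 <= rx and 0 <= ry and rx + bs <= len(ref[0]) and ry + bs <= len(ref):
                cost = sad(cur, ref, cx, cy, rx, ry, bs)
                if best is None or cost < best:
                    best, best_mv = cost, (dx, dy)
    return best_mv

# Toy frames: a bright 4x4 patch moves right by 1 and down by 2 pixels.
W = H = 12
ref = [[0] * W for _ in range(H)]
for y in range(3, 7):
    for x in range(3, 7):
        ref[y][x] = 200
cur = [[0] * W for _ in range(H)]
for y in range(5, 9):
    for x in range(4, 8):
        cur[y][x] = 200

mv = motion_estimate(cur, ref, cx=4, cy=5, bs=4, search=2)
```

The vector comes out as (-1, -2): the block content moved right and down, so its best predictor lies up and to the left in the previous frame, and the SAD at that offset is exactly zero.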
Motion Estimation

152
Motion Compensation
• Subtract the reference area from the current macroblock
– Difference macroblock

• Encode the difference macroblock with an image encoder

• If motion estimation was effective


– Little data left in difference macroblock
– More efficient compression

153
Motion Compensation
– In Motion Estimation (ME), each macroblock (MB) of
the Target P-frame is assigned a best matching MB
from the previously coded I or P frame - prediction.

– prediction error: The difference between the MB and


its matching MB, sent to DCT and its subsequent
encoding steps.

– The prediction is from a previous frame — forward


prediction.

154
Motion Compensation
• MPEG introduces a third frame type — B-frames, and its
accompanying bi-directional motion compensation.

– Each MB from a B-frame will have up to two motion vectors


(MVs) (one from the forward and one from the backward
prediction).

– If matching in both directions is successful, then two MVs will be sent


and the two corresponding matching MBs are averaged before
comparing to the Target MB for generating the prediction error.

– If an acceptable match can be found in only one of the reference


frames, then only one MV and its corresponding MB will be used from
either the forward or backward prediction.

155
B-frame Coding Based on Bidirectional Motion Compensation.

156
Motion Compensation

The Need for Bidirectional Search.

The MB containing part of a ball in the Target frame


cannot find a good matching MB in the previous frame
because half of the ball was occluded by another object.
A match however can readily be obtained from the next
frame.
157
Video MPEG - Frame Types
• I (Intra): self-contained, only spatial compression (like JPEG)

• P (Predictive): predicted from the preceding I or P frame. Temporal
  compression by extrapolation using macroblocks. A macroblock can be:

  • Same: no change relative to the reference frame

  • Moved (e.g. a ball in motion): described by a motion vector
    and possibly a correction (difference from the original)
  • New (e.g. what appears behind a door that opens): described by
    spatial compression (like an I-frame)

• B (Bidirectional): temporal compression with interpolation, predicted
  from the preceding and following I or P frame. Maximum compression,
  maximum computational complexity. It softens the image, reducing noise.

158
I Frames (Intra)
Intra frames are coded as self-contained,
without reference to other frames

18 KBytes I
18 KBytes I
18 KBytes I
18 KBytes I
18 KBytes I

25 frames per second

72 x 1024 x 8 / 0,16 = 3,7Mbps

159
P frames (Predictive)
Predictive frames are encoded using
motion compensation based on
previous I or P frame 18 KB I
6 KB P
6 KB P
18 KB I
6 KB P
6 KB P
18 KB I

60 x 1024 x 8 / 0,24 = 2,0Mbps


160
B frames (Bidirectional)
Bidirectional frames are encoded using motion
compensation based on the nearest previous and
subsequent I or P frame

Frame 1:  18 KB  I
Frame 2:   4 KB  B
Frame 3:   4 KB  B
Frame 4:   6 KB  P
Frame 5:   4 KB  B
Frame 6:   4 KB  B
Frame 7:   6 KB  P
Frame 8:   4 KB  B
Frame 9:   4 KB  B
Frame 10: 18 KB  I

54 x 1024 x 8 / 0,36 = 1,2Mbps

Transmission order: 1,4,2,3,7,5,6,10,8,9,…


161
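The three bit rates on these slides follow from the same arithmetic (frame sizes in KB at 25 frames/s; a quick sanity check of the slides' figures, not a statement about any real encoder):

```python
def bitrate_mbps(frame_sizes_kb, fps=25.0):
    """Mean bit rate of a repeating pattern of frames, in Mbit/s."""
    duration = len(frame_sizes_kb) / fps        # seconds per pattern
    bits = sum(frame_sizes_kb) * 1024 * 8       # bits per pattern
    return bits / duration / 1e6

i_only  = bitrate_mbps([18, 18, 18, 18])                  # I I I I
i_and_p = bitrate_mbps([18, 6, 6, 18, 6, 6])              # I P P I P P
i_p_b   = bitrate_mbps([18, 4, 4, 6, 4, 4, 6, 4, 4])      # I B B P B B P B B
```

The results (about 3.7, 2.0 and 1.2 Mbps) match the three slides and show how much each added frame type buys.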
Group of Picture Structure
Bidirectional Motion Pred 12
Compensation
Intra 9
B11
Pred 6 B10
B8
Pred 3 B7 16 x 16 bidirectional
B5 macroblocks
Intra 0 B4 - Intra
B2 - Forward
B1 - Reverse
- Bidirectional
• I-frames: for random access
  – intraframe coded; lowest compression
• P-frames: predictive encoded
  – from the most recent I- or P-frame; medium compression
• B-frames: interpolation
  – from the most recent & subsequent I- or P-frame; highest compression

162
Video compression
• Temporal redundancy reduction

  I: Intra-coded picture
  P: Predicted picture
  B: Bi-directionally interpolated picture
  (compression rate increases from I to P to B)

  Order of presentation:   0 1 2 3 4 5 6 7 8 9
                           I B B P B B P B B P ...
  (B pictures use bi-directional prediction from the surrounding
  I/P pictures; P pictures use forward prediction)

  Order of transmission:   0 3 1 2 6 4 5 9 7 8
                           I P B B P B B P B B ...

163
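The reordering between presentation and transmission order can be computed mechanically: every B-frame must wait for the later I/P anchor it depends on, so each anchor is sent before the B-frames that precede it in display order. A sketch (simplified: it ignores open/closed-GOP subtleties at sequence boundaries):

```python
def transmission_order(display):
    """Reorder a display-order frame-type string (e.g. 'IBBPBBPBBP') so that
    each I/P anchor is transmitted before the B-frames that depend on it.
    Returns the display-order indices in transmission order."""
    out, pending_b = [], []
    for i, frame_type in enumerate(display):
        if frame_type == "B":
            pending_b.append(i)   # B waits for its future anchor
        else:
            out.append(i)         # send the anchor first...
            out.extend(pending_b) # ...then the B-frames it enables
            pending_b = []
    out.extend(pending_b)
    return out

order = transmission_order("IBBPBBPBBP")
```

For the display sequence on the slide this yields 0 3 1 2 6 4 5 9 7 8, the transmission order shown above.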
Synchronisation - Getting data on time
• Synchronisation in the multimedia context refers to the
mechanism that ensures a temporal consistent presentation
of the audio-visual information to the user

• “On time”: not too late, not too early
  → no buffer over- or underflow
• Flow control: not applicable in broadcasting
• Common time base and Definition of a standard target
decoder that describes the data consumption pattern of the
receiver.
– Remark: Direct MPEG (Microsoft) does not use time information for
clock recovery but relies on flow control

164
Streams
• Idea of continuity (pipelining): Carry time information for
clock recovery

• No flow control (allows broadcasting): The emitter must have


a precise knowledge of the receiver data consumption
pattern (explicit in MPEG STD)

• Just-in-time: Shorter delay and smaller buffer size than with


flow control

• Two aspects in synchronisation :


Clock recovery & timing control (model & buffering)

165
Requirements for stream transport

• Data information:
  BER (Bit Error Rate) requirement;
  no retransmission of frames possible → FEC (Forward
  Error Correction)

• Time information: no jitter
166
Agenda
• Introduction

• Audio & Video compression principles

• A/V Compression standards


– The MPEG model and its situation in a communication context
– JPEG & MJPEG
– H.261 & MPEG-1
– H.263 & MPEG-2
– Videoconference

• Conclusion

167
MPEG Versions
• MPEG-1
– For video storage in CD-ROM & transmission over T-1 lines (1.5Mbps)

• MPEG-2
– Many options: 352x240 pixel; 720x480 pixel; 1440x1152 pixel;
1920x1080 pixel
– Many profiles (set of coding tools & parameters)
• Main Profile
– I, P & B frames; 720x480 conventional TV
– Very good quality @ 4-6 Mbps

• MPEG-4
– <64kbps to 4Mbps
– Designed to enable viewing, access & manipulation of objects, not only
pixels
– For digital TV, streaming video, mobile multimedia & games
168
MPEG Coding Standard
• Motion Picture Expert Group (MPEG)
– Video and audio compression & multiplexing
– Video display controls
• Fast forward, reverse, random access

• Elements of encoding
– Intra- and inter-frame coding using DCT
– Bidirectional motion compensation
– Group of Picture structure
– Scalability options

• MPEG only standardizes the decoder

169
Video H.26x
• ITU-T video standards for video conferencing: low bit
  rate, low motion (less action than in movies).
  – H.261: developed in the late 80s for ISDN (constant bit rate).
  – H.263, H.263+, H.264: more modern and efficient.

• Simplified MPEG-like compression algorithms:

  – More restricted motion vectors (less motion)
  – In H.261: no B-frames (excessive latency and complexity)

• Less CPU intensive. Feasible real-time software codec

170
Video H.26x (Cont’d)
• Subsampling 4:1:1

• Resolutions:
– CIF (Common Intermediate Format): 352 x 288
– QCIF (Quarter CIF): 176 x 144
– SCIF (Super CIF): 704 x 576

• Independent Audio: G.722 (quality), G.723.1, G.728, G.729

• Audio-video synchronization using H.320 (ISDN) and H.323


(Internet)

171
The MPEG model

Audio signal → Audio encoder ┐                                   ┌→ Audio decoder → Audio signal
                             ├→ Multiplexer → Transmission channel → Demultiplexer
Video signal → Video encoder ┘   (digital storage medium             └→ Video decoder → Video signal
                                  or network)

Captured signals                                                    Presented signals

172
Components of the MPEG standard
• The MPEG standard is composed of 3 main parts :
– Audio : Specifies the compression of audio signals
– Video : Specifies the compression of video signals
– System : specifies how the compressed audio and video signals are
combined in the multiplexed stream (program stream or transport
stream).

• Each part specifies :


– The bitstream syntax
– The timing requirement and the related information (bit rate, buffer
needs)

174
MPEG in a communication context
• A simple view of MPEG in the communication context
ES (Elementary Stream)    TS (Transport Stream)    PS (Program Stream)

Audio, video sources → Video encoder / Audio encoder (ES)
        │
        ├→ TS multiplexing (n programs) → adaptation to the channel → Cable
        │                               → adaptation to the channel → Satellite
        └→ PS multiplexing (1 program)  → adaptation to the channel → Disc

MPEG2 compression layer | MPEG2 system layer | DVB, DVD...


178
JPEG & MJPEG

179
JPEG Coding Standard
• Key Components:
– Transform:
• 8×8 DCT
• boundary padding
– Quantization:
• uniform quantization
• DC/AC coefficients
– Coding:
• Zigzag scan
• run length/Huffman coding

180
JPEG Baseline Coder

Tour Example
183 160 94 153 194 163 132 165
183 153 116 176 187 166 130 169
179 168 171 182 179 170 131 167
177 177 179 177 179 165 131 167
178 178 179 176 182 164 130 171
179 180 180 179 183 169 132 169
179 179 180 182 183 170 129 173
180 179 181 179 181 170 130 169

181
Step 1: Transform
• DC level shifting

183 160  94 153 194 163 132 165            55  36 -34  25  66  35   4  37
183 153 116 176 187 166 130 169            55  25 -12  48  59  38   2  41
179 168 171 182 179 170 131 167            51  40  43  54  51  42   3  39
177 177 179 177 179 165 131 167   -128     49  49  51  49  51  37   3  39
178 178 179 176 182 164 130 171            50  50  51  48  54  36   2  43
179 180 180 179 183 169 132 169            51  52  52  51  55  41   4  41
179 179 180 182 183 170 129 173            51  51  52  54  55  42   1  45
180 179 181 179 181 170 130 169            52  51  53  51  53  42   2  41

• 2D DCT
 55  36 -34  25  66  35   4  37           313  56 -27  18  78 -60  27  27
 55  25 -12  48  59  38   2  41           -38 -27  13  44  32   1  24  10
 51  40  43  54  51  42   3  39           -20 -17  10  33  21   6  16   9
 49  49  51  49  51  37   3  39   DCT     -10   8   9  17   9  10  13   1
 50  50  51  48  54  36   2  43             6   1   6   4   3   7   5   5
 51  52  52  51  55  41   4  41             2   3   0   3   7   4   0   3
 51  51  52  54  55  42   1  45             4   4   1   2   9   0   2   4
 52  51  53  51  53  42   2  41             3   1   0   4   2   1   3   1

182
Step 2: Quantization

16 11 10 16 24 40 51 61
12 12 14 19 26 58 60 55 Why increase
Q-table 14 13 16 24 40 57 69 56
14 17 22 29 51 87 80 62 from top-left to
18 22 37 56 68 109 103 77 bottom-right?
24 35 55 64 81 104 113 92
49 64 78 87 103 121 120 101
72 92 95 98 112 100 103 99

313  56 -27  18  78 -60  27  27           20  5 -3  1  3 -2  1  0
-38 -27  13  44  32   1  24  10           -3 -2  1  2  1  0  0  0
-20 -17  10  33  21   6  16   9    Q      -1 -1  1  1  1  0  0  0
-10   8   9  17   9  10  13   1           -1  0  0  1  0  0  0  0
  6   1   6   4   3   7   5   5            0  0  0  0  0  0  0  0
  2   3   0   3   7   4   0   3            0  0  0  0  0  0  0  0
  4   4   1   2   9   0   2   4            0  0  0  0  0  0  0  0
  3   1   0   4   2   1   3   1            0  0  0  0  0  0  0  0

183
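The first row of the quantized block can be reproduced directly from the magnitudes of the first-row DCT coefficients and the matching Q-table row shown above (round-to-nearest quantization; plain Python, no codec library):

```python
dct_row = [313, 56, 27, 18, 78, 60, 27, 27]   # |DCT| values, first row of the example
q_row   = [16, 11, 10, 16, 24, 40, 51, 61]    # JPEG luminance Q-table, first row

quantized_row = [round(d / q) for d, q in zip(dct_row, q_row)]
```

The result is [20, 5, 3, 1, 3, 2, 1, 0], matching the first row of the quantized block: the growing Q-table entries toward the bottom right are what force the high-frequency coefficients to zero.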
Step 3: Entropy Coding
20  5 -3  1  3 -2  1  0
-3 -2  1  2  1  0  0  0
-1 -1  1  1  1  0  0  0
-1  0  0  1  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
 0  0  0  0  0  0  0  0
Zigzag Scan
(20,5,-3,-1,-2,-3,1,1,-1,-1,
0,0,1,2,3,-2,1,1,0,0,0,0,0,
0,1,1,0,1,EOB)

EOB (End Of Block):
all following coefficients are zero
184
Video M-JPEG (Motion JPEG)
• The simplest approach: treat the video as a sequence of JPEG
pictures, without taking advantage of redundancy between frames.
• DCT algorithm (Discrete Cosine Transform)
• Less efficient, but low delay.
• Used in:
– Some digital recording systems and nonlinear
editing (editing independent of each frame)
– Some videoconferencing systems (low delay).
• It does not include standard audio support. The audio
has to be encoded by some other means (e.g. CD-DA)
and synchronized by non-standard mechanisms.

185
H.261 & MPEG1

186
H.261 Coding Standard
• Background:
– Facilitate video conferencing and videophone service over
ISDN
– p×64 kbps
• p=1: videophone;
• p>5: videoconference;
• p=30: VHS-quality;
– Basis of MPEG-1 and MPEG-2

• Features
– Maximum coding delay of 150ms
– Amenable to low-cost VLSI implementation

187
Input Image Formats
CIF QCIF

# of pels/line (Y) 360(352) 180(176)


# of pels/line (U/V) 180(176) 90(88)
# of lines/pic (Y) 288 144
# of lines/pic (U/V) 144 72
Interlacing 1:1 1:1
Temporal rate 30,15,10,7.5 30,15,10,7.5
Aspect ratio 4:3 4:3

188
Video Multiplex
• It defines a data structure so that a decoder can
interpret the received bit stream without any
ambiguity

• Hierarchical data structure


– Picture layer
– Group of blocks (GOB) layer
– Macroblock (MB) layer
– Block layer

– Each layer has a distinct header

189
Picture and GOB Layers

• Picture layer consists of picture header followed by


the data for GOBs
– Picture header contains data such as picture format (CIF or
QCIF)

• GOB layer is always composed of 33 MBs


– GOB header contains a MB address and compression mode
followed by the data for the blocks

190
Macroblock and Block Layers

Macroblock: the smallest unit to select the compression mode

Y1 Y2
Cr Cb

Y3 Y4

A MB always consists of 6 blocks (Y1 – Y4, Cr, Cb)

MBA MTYPE MQUANT MVD CBP Block Data

191
Compression Modes
• Intra Mode
– Similar to JPEG coding
– Support two compression modes

• Inter Mode
– ME is not specified (MC is optional)
– Usually, 16-by-16 BMA, integer-pel accuracy,
search range [-15,15]
– Support various compression modes

192
H.261 Encoder
Intra
Huffman
8x8 DCT Q VLC
block
- Inter

Q-1
Filter
CRC error p x 64
and
Frame Fixed-length
I-DCT
Memory control

Motion
Estimation Motion Vector

• Intended for videoconferencing applications


• Bit rates = p x 64 kbps, p = 2, 6, 24 common
197
MPEG-1
• MPEG-1 adopts the SIF (Source Input Format) digital TV format.

• MPEG-1 supports only non-interlaced video. Normally, its picture resolution


is:
– 352×240 for NTSC video at 30 fps
– 352×288 for PAL video at 25 fps
– It uses 4:2:0 chroma subsampling

• The MPEG-1 standard is also referred to as ISO/IEC 11172. It has five parts:
– 11172-1 Systems,
– 11172-2 Video,
– 11172-3 Audio,
– 11172-4 Conformance, and
– 11172-5 Software.

198
Hierarchical Data Structure
• Sequences are formed by Group Of Pictures (GOP)

• GOP are made up of pictures (frames)

• Pictures consist of slices

• Slices are made up of macro-blocks (MB)

• Macro-blocks consist of blocks

• Blocks are 8×8 pixels arrays

200
Hierarchical Data Structure

Layers of MPEG-1 Video Bitstream.


201
Example of temporal picture structure

202
Slices in an MPEG-1 Picture.

203
Video MPEG (MPEG-1)
• Subsampling 4:2:0 (25% more savings than 4:2:2)

• Two possible formats:


– SIF (Standard Interchange Format) - in PAL (396 MBs):
– Y: 352x288 pixels,
– Cr & Cb: 176x144 pixels
– QSIF (Quarter SIF) (99 MBs):
– Y: 176 x 144;
– Cr & Cb : 88 x 72

• Two compression types (simultaneously):


– Spatial: as in JPEG
– Temporal: takes advantage of each frame having similarity with
those around.

204
MPEG-1 Video
• Typical Sequence (360ms): I1 B2 B3 P4 B5 B6 P7 B8 B9 I10
• Order of encoding / decoding : I1 P4 B2 B3 P7 B5 B6 I10 B8 B9

• Typical size of frames (SIF, 352x288):


– I: 18kBytes (7:1)
– P: 6kBytes (20:1)
– B: 2.5 - 4kBytes (50:1)

– Average bit rate (IBBPBBPBBI): 1.2Mbps


– With QSIF the bit rate is reduced to 300kbps

• Compression Latency (Typical values):


– M-JPEG: 45 ms
– MPEG frames I: 200 - 400 ms
– MPEG frames I & P: 200 - 500 ms
– MPEG frames I, P & B: 400 - 850 ms
206
MB Types in MPEG-I
I-pictures P-pictures B-pictures
Intra Intra Intra
Intra-A Intra-A Intra-A
Inter-D Inter-F
Inter-DA Inter-FD
Inter-F Inter-FDA
Inter-FD Inter-B
Inter-FDA Inter-BD
Skipped Inter-BDA
A- adaptive quantization Inter-I
F- forward prediction with MC Inter-ID
D- DCT of prediction error will be coded Inter-IDA
B – backward prediction with MC
Skipped
I – interpolated prediction with MC

210
Audio MPEG-1
• Mono or stereo sampling at 32, 44.1 (CD) or 48 (DAT) kHz. If a
reduced bit rate is used, it is desirable to sample at 32 kHz.
• Asymmetric psychoacoustic (lossy) compression.
• From 32 to 448 kbps per audio channel
• Three layers in ascending order of complexity/quality:
– Layer I: good quality at 192-256 kbps per channel; rarely used
– Layer II: CD quality at 96-128 kbps per channel
– Layer III: CD quality at 64 kbps per channel
• Each layer introduces new algorithms, and includes those of the
above.
• Layer III used in DAB (Digital Audio Broadcast) and MP3

214
MPEG-1System
• Responsible for ensuring the synchronization between
audio and video through a system of time stamps
based on a 90kHz clock.

• It is only necessary if using audio and video


simultaneously (not for MP3 streams for example)

• Requires a small flow (5-50kbps)

216
Synchronization of audio and video MPEG

Analog audio signal → Audio encoder → digital audio stream with time stamps ┐
                                                                            ├→ System multiplexer → MPEG-1 stream
                      Clock 90 kHz ─────────────────────────────────────────┤
Analog video signal → Video encoder → digital video stream with time stamps ┘

During the decoding the reverse process is performed

217
Prototypical Decoder
ISO/IEC 11172

219
Major Differences from H.261

• Source formats supported:


– H.261 only supports CIF (352 × 288) and QCIF (176 × 144) source formats,
MPEG-1 supports SIF (352 × 240 for NTSC, 352 × 288 for PAL).
– MPEG-1 also allows specification of other formats as long as the Constrained
Parameter Set (CPS) as shown in the following Table is satisfied:

The MPEG-1 Constrained Parameter Set


Parameter Value
Horizontal size of picture ≤ 768
Vertical size of picture ≤ 576
No. of MBs / picture ≤ 396
No. of MBs / second ≤ 9,900
Frame rate ≤ 30 fps
Bit-rate ≤ 1,856 kbps
220
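The Constrained Parameter Set in the table can be expressed as a simple predicate (macroblocks are 16×16, so the MB counts follow from the picture size; the function name is ours, not from the standard):

```python
def in_cps(width, height, fps, bitrate_kbps):
    """Check a video format against the MPEG-1 Constrained Parameter Set."""
    mbs = (width // 16) * (height // 16)   # 16x16 macroblocks per picture
    return (width <= 768 and height <= 576
            and mbs <= 396
            and mbs * fps <= 9900
            and fps <= 30
            and bitrate_kbps <= 1856)

sif_pal_ok = in_cps(352, 288, 25, 1150)    # SIF/PAL Video-CD parameters
hd_ok      = in_cps(1920, 1080, 30, 1856)  # full HD clearly exceeds the CPS
```

SIF at 25 fps lands exactly on the limits (396 MBs per picture, 9,900 MBs per second), while an HD format fails the very first size test.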
MPEG-I vs. H.261
H.261 MPEG-1
Sequential access Random access
One basic frame rate Flexible frame rate
CIF and QCIF images only Flexible image size
I and P frames only I, P and B frames
MC over 1 frame MC over 1 or more frames
Integer-pel MV accuracy Half-pel MV accuracy
Spatial filtering in the loop No filter
Variable threshold+uniform Quantization matrix
quantization
No GOP structure GOP structure
GOB structure Slice structure

225
H.263/H.263+ & MPEG2

226
Video Codecs: H.263
• Frame-based coding
• Low Bit rate Coding:
– < 64 kbps (typical)

• H.261 coding with improvements


– I/P/B frames
– Additional Image formats: 4CIF, 16CIF

• Suitable for desktop video conferencing over


low-speed links
227
H.263 Baseline Coding Algorithm
• Video Frame Structure
– support sub-QCIF, QCIF, CIF, 4CIF and 16CIF

• Video Coding Tools


– Motion estimation and compensation
• range : [-16,15.5] accuracy : half-pel
– Transform: 8×8 DCT
– Quantization: c_q(m,n) = c(m,n) / (2Q), 0 ≤ m,n ≤ 7
  (uniform step size controlled by the Q factor)
– Entropy Coding: 3D VLC (LAST, RUN, LEVEL)

• Coding Control
– Intra/Inter switch

230
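The quantizer formula above (uniform step size 2·Q) can be sketched as follows (forward direction only; simplified, since the actual standard adds sign-dependent offsets and a separate rule for intra DC coefficients):

```python
def h263_quantize(coeffs, Q):
    """H.263-style uniform quantization: level = c / (2*Q), truncated toward zero."""
    return [[int(c / (2 * Q)) for c in row] for row in coeffs]

# A hypothetical 3x3 corner of a DCT block, quantized with Q = 8 (step 16).
block = [[120, -33, 10],
         [7,   -64,  0],
         [25,    3, -5]]
levels = h263_quantize(block, Q=8)
```

With Q = 8 the step is 16, so only coefficients with magnitude ≥ 16 survive; raising Q coarsens the levels and lowers the bit rate, which is how the encoder's rate control steers the output toward the low target rates named above.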
Advanced Coding Modes in H.263

• Unrestricted motion vector mode
  – range: [-31.5, 31.5]
  – Allows MVs to point outside the picture boundaries

• Syntax-based arithmetic coding mode
  – About 5% savings over VLC

• Advanced prediction mode
  – Overlapped Block Motion Compensation (OBMC)

• PB-frame mode

  I  B  P  B  P  …

231
H.263+
• Advanced intra coding mode
• Deblocking filter mode
• Slice structure mode
• Supplemental enhancement information mode
• Improved PB-frame mode
• Reference picture selection mode
• Temporal, SNR and Spatial scalability mode
• Reference picture resampling mode
• Reduced resolution update mode
• Independently segmented decoding mode
• Alternative Inter VLC mode
• Modified quantization mode

235
MPEG-2
• MPEG-2: For higher quality video at a bit-rate of more than 4
Mbps.

• Defined seven profiles aimed at different applications


(toolboxes) :
– Simple profile (No B picture),
– Main profile (=MPEG1+interlaced, Does not support scalability),
– SNR scalable profile (allows graceful degradation (noise improvement
at same resolution),
– Spatial scalable profile (hierarchical coding : improvement at higher
resolution),
– High profile.
– 4:2:2 Profile,
– Multiview Profile.

244
Video MPEG-2
• Compatible extension of MPEG-1

• Designed for digital TV:


– Optimized for transmission, not storage
– Provides interlaced video (TV) as well as progressive (MPEG-1 was
only progressive)

• According to the values of the sampling parameters used,
  four levels are defined in MPEG-2:
  – Low: 352x288 (supports MPEG-1)
  – Main: 720x576 (equivalent to CCIR 601)
  – High-1440: 1440x1152 (HDTV 4:3)
  – High: 1920x1152 (HDTV 16:9)
246
Profiles and Levels in MPEG-2
Level Simple Main SNR Spatially High 4:2:2 Multiview
profile profile Scalable Scalable Profile Profile Profile
profile profile
High * *
High 1440 * * *
Main * * * * * *
Low * *

Four Levels in the Main Profile of MPEG-2

Level Max. Resolution Max Max Max coded Data Application


fps pixels/sec Rate (Mbps)

High 1,920 × 1,152 60 62.7 × 106 80 film production


High 1440 1,440 × 1,152 60 47.0 × 106 60 consumer HDTV
Main 720 × 576 30 10.4 × 106 15 studio TV
Low 352 × 288 30 3.0 × 106 4 consumer tape equiv.

247
Bit rates of Levels and Profiles MPEG-2
Profiles Simple Main SNR Spatial High 4:2:2
Scalability Scalability (Studio)
Subsampling 4:2:0 4:2:0 4:2:0 4:2:0 4:2:0/2 4:2:2
High 1920x1152 80Mbps 100Mbps
(HDTV 16:9)
High -1440 60Mbps 60Mbps 80Mbps
1440x1152
(HDTV 4:3)
Levels

Main 720x576 15Mbps 15Mbps 15Mbps 20Mbps 50Mbps


(CCIR 601)
Low 352x288 4Mbps 4Mbps
(MPEG1)

The peak rates are shown under the standard for each combination of profile and level.

248
Five Modes of Predictions
• MPEG-2 defines Frame Prediction and Field Prediction as well
as five prediction modes:

1. Frame Prediction for Frame-pictures:


Identical to MPEG-1 MC-based prediction methods in both P-frames
and B-frames.

2. Field Prediction for Field-pictures:


A macroblock size of 16×16 from Field-pictures is used.

249
Five Modes of Predictions
3. Field Prediction for Frame-pictures:
The top-field and bottom-field of a Frame-picture are treated
separately. Each 16×16 macroblock (MB) from the target Frame-
picture is split into two 16×8 parts, each coming from one field. Field
prediction is carried out for these 16×8 parts.

4. 16×8 MC for Field-pictures:


Each 16×16 macroblock (MB) from the target Field-picture is split into
top and bottom 16×8 halves. Field prediction is performed on each
half. This generates two motion vectors for each 16×16 MB in the P-
Field-picture, and up to four motion vectors for each MB in the B-
Field-picture.

This mode is good for a finer MC when motion is rapid and irregular.

250
Five Modes of Predictions
5. Dual-Prime for P-pictures:
First, Field prediction from each previous field with the same parity
(top or bottom) is made. Each motion vector mv is then used to derive
a calculated motion vector cv in the field with the opposite parity
taking into account the temporal scaling and vertical shift between
lines in the top and bottom fields. For each MB the pair mv and cv
yields two preliminary predictions. Their prediction errors are
averaged and used as the final prediction error.

This mode mimics B-picture prediction for P-pictures without adopting


backward prediction (and hence with less encoding delay).

This is the only mode that can be used for either Frame-pictures or
Field-pictures.

251
Supporting Interlaced Video
• MPEG-2 must support interlaced video as well since this is one of
the options for digital broadcast TV and HDTV.

• In interlaced video each frame consists of two fields, referred to


as the top-field and the bottom-field.

– In a Frame-picture, all scanlines from both fields are interleaved to form


a single frame, then divided into 16×16 macroblocks and coded using
MC.

– If each field is treated as a separate picture, then it is called Field-


picture.

252
Audio MPEG-2
• Algorithms:
– Version compatible with MPEG-1 Layers I, II and III
– Improved compression system: Advanced Audio Coding (AAC).
  Comparable quality to MPEG-1 Layer III at 50-70% of the bit
  rate. Not compatible with MPEG-1.

• Channels:
– Stereo, version compatible with MPEG-1
  • Independent (each channel coded separately)
  • Joint (exploits redundancy between channels)
– Multi-channel support (languages) and 5.1 (5 surround
  channels plus a low-frequency channel)

259
MPEG-2 Scalabilities
• The MPEG-2 scalable coding: A base layer and one or more enhancement
layers can be defined — also known as layered coding.

– The base layer can be independently encoded, transmitted and decoded to


obtain basic video quality.

– The encoding and decoding of the enhancement layer is dependent on the


base layer or the previous enhancement layer.

• Scalable coding is especially useful for MPEG-2 video transmitted over


networks with following characteristics:
– Networks with very different bit-rates.
– Networks with variable bit rate (VBR) channels.
– Networks with noisy connections.

261
MPEG-2 Scalabilities (Cont’d)
• MPEG-2 supports the following scalabilities:

1. SNR Scalability—enhancement layer provides higher SNR (Different levels of


quality), base/enhancement layer uses a coarse/fine quantizer for DCT
coefficients.

2. Spatial Scalability — enhancement layer provides higher spatial resolution


(Different resolutions), base/enhancement layer is a low/high spatial resolution
of the video.

3. Temporal Scalability—enhancement layer facilitates higher frame rate (Different


frame rates), allow the decodability at different frame rates.

4. Hybrid Scalability — combination of any two of the above three scalabilities.

5. Data Partitioning — quantized DCT coefficients are split into partitions (Separate
headers and payloads apart).

• Limited scalability capabilities: Three layers only

262
Non-Scalable

Non-scalable Bit stream

Decoder 1 Decoder 2 Decoder 3


264
Spatial Scalability

Scalable bit stream

Decoder 1

Decoder 2

Decoder 3 265
Decoder 4
PSNR Scalability (Quality)

Scalable Bit stream

Decoder 1 Decoder 2 Decoder 3


268
Temporal scalability

1 0 1 1 1 … 0 1 0 1 0 0 0 … 1 1 0 1 0 0

Frame 0,4,8,12,… Frame 0,2,4,6,8,… Frame 0,1,2,3,4,5,…

7.5Hz 15Hz 30Hz

272
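The three decodable frame rates in the figure correspond to how many layers a decoder consumes. A sketch of the frame-to-layer assignment (the layer numbering is ours; real encoders signal this in the bitstream):

```python
def layer_of(frame_index):
    """Assign a frame to a temporal layer: base = every 4th frame,
    enhancement 1 = the remaining even frames, enhancement 2 = odd frames."""
    if frame_index % 4 == 0:
        return 0          # base layer            -> 7.5 Hz
    if frame_index % 2 == 0:
        return 1          # base + enhancement 1  -> 15 Hz
    return 2              # all layers            -> 30 Hz

def decodable_frames(max_layer, n=13):
    """Frames a decoder can show when it consumes layers 0..max_layer."""
    return [i for i in range(n) if layer_of(i) <= max_layer]

base       = decodable_frames(0)   # frames 0, 4, 8, 12
base_enh1  = decodable_frames(1)   # frames 0, 2, 4, 6, ...
all_layers = decodable_frames(2)   # every frame
```

Dropping an enhancement layer simply halves the frame rate without touching the remaining frames, which is exactly the graceful degradation temporal scalability is meant to provide.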
Hybrid Scalability
• Any two of the above three scalabilities can be combined
to form hybrid scalability:
1. Spatial and Temporal Hybrid Scalability.
2. SNR and Spatial Hybrid Scalability.
3. SNR and Temporal Hybrid Scalability.

• Usually, a three-layer hybrid coder will be adopted which


consists of:
– Base Layer,
– Enhancement Layer 1, and
– Enhancement Layer 2.

276
Data Partitioning
• The Base partition contains lower-frequency DCT coefficients,
enhancement partition contains high-frequency DCT
coefficients.

• Strictly speaking, data partitioning is not layered coding, since a


single stream of video data is simply divided up and there is no
further dependence on the base partition in generating the
enhancement partition.

• Useful for transmission over noisy channels and for progressive


transmission.

277
Major Differences from MPEG-1
• Better resilience to bit-errors: In addition to Program Stream, a
Transport Stream is added to MPEG-2 bit streams.

• Support of 4:2:2 and 4:4:4 chroma subsampling.

• More restricted slice structure: MPEG-2 slices must start and end in
the same macroblock row. In other words, the left edge of a picture
always starts a new slice and the longest slice in MPEG-2 can have
only one row of macroblocks.

• More flexible video formats: It supports various picture resolutions


as defined by DVD, ATV and HDTV.

278
Major Differences from MPEG-1 (Cont’d)

• Nonlinear quantization — two types of scales are allowed:

1. For the first type, the scale is the same as in MPEG-1: an
   integer in the range [1, 31], with scale_i = i.

2. For the second type, a nonlinear relationship exists, i.e.,
   scale_i ≠ i. The ith scale value can be looked up from the
   following Table.

Table : Possible Nonlinear Scale in MPEG-2

279
Other Improvements

MPEG-I MPEG-II

Intra MB 8bits 11bits


DC Coeff.
Intra MB [-256,255] [-2048,2047]
AC Coeff.
Non-intra MB [-256,255] [-2048,2047]
Coeff.

Finer Quantization of the DCT Coefficients

280
Videoconference
• Interactive communication through audio, video and
data sharing

• It can be:
– Point to point
– Point to multipoint
– Multipoint to multipoint

282
Requirements / Features of the
videoconference
• Compression / decompression in real time.

• 200-400 ms maximum delay.

• Limited mobility of the participants.

• Telephone-quality audio is normally acceptable.

• Need to synchronize audio and video.

• Need for a signaling protocol (the underlying service is
  connectionless).

283
Videoconference Standards
• Videoconferencing systems have been standardized by the
ITU-T (International Telecommunications Union -
Telecommunications sector) in the standards of the series H
(multimedia and audiovisual systems)

• The H.32x are videoconferencing standards.


The 'x' depends on the type of network used

284
H.32x Standards
Standard Physical Service Type Year approval
environment
H.320 ISDN Circuit 1990
Streaming a/v
128 to 384 Kb/s
H.321 ATM Circuit
H.322 IsoEthernet TDM
H.323 Ethernet Packet 1996
Streaming a/v
14,4 - 512 Kb/s
H.324 analog Modem Circuit

The H.32x are umbrella standards. Each builds on a set of existing
standards to specify all the services needed in a videoconference,
e.g., G.711 for audio coding.
285
H.320 Standard

286
H.323 Standard
• Packet-based multimedia communications systems

287
H.320 & H.323 Standards

H.320 (ISDN):
  H.261          Video coding
  H.221          Binary train conversion
  H.243          Multi-point protocol
  H.230 / H.242  Signalization and control
  G.711          3.1kHz audio, 64/56kbps
  G.722          7kHz audio, 64/56/48kbps
  G.728          3.1kHz audio, 16kbps
  T.120          Data protocols, multimedia communication

H.323 (IP):
  H.261/H.263    Video coding
  H.245          Control protocol
  Q.931          Call signalization
  H.225          RAS, packetization, Gatekeeper signalization
  G.711          3.1kHz audio, 64/56kbps
  G.723          3.1kHz audio, 5.3kbps
  T.120          Data protocols, multimedia communication

288
H.320 & H.323 Standards

| Layer   | Function       | H.323                        | H.320                |
|---------|----------------|------------------------------|----------------------|
| Control | Call control   | H.225.0                      | Q.931                |
| Control | System control | H.245                        | H.242                |
| Control | Multiplexing   | H.225.0                      | H.221                |
| Media   | Audio          | G.711, G.722, G.723.1, G.728 | G.711, G.722, G.728  |
| Media   | Video          | H.261, H.263                 | H.261, H.263         |
| Data    | Data           | T.120                        | T.120                |

289
H.32x Audio Formats

| Codec   | Original bandwidth (kb/s) | Compression ratio | Compressed bandwidth (kb/s) |
|---------|---------------------------|-------------------|-----------------------------|
| G.711   | 64                        | 1:1               | 64                          |
| G.722   | 224                       | 3.5-4.6:1         | 48-64                       |
| G.723.1 | 64                        | 10:1              | 6.4                         |
| G.728   | 64                        | 4:1               | 16                          |
| G.729   | 64                        | 8:1               | 8                           |
| MPEG    | 706                       | 3-11:1            | 64-256                      |

MPEG is not an H.323 audio format; it appears only for comparison.
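The ratios in the table follow directly from the bit rates; a quick sanity check (the 16 kHz / 14-bit figure for G.722's input is a standard fact, not on the slide):

```python
# Sanity check of the table: ratio = original bandwidth / compressed bandwidth.
def ratio(original_kbps, compressed_kbps):
    return original_kbps / compressed_kbps

assert 8 * 8 == 64 and ratio(64, 64) == 1.0   # G.711: 8 kHz x 8 bits, sent as-is
assert 16 * 14 == 224 and ratio(224, 64) == 3.5  # G.722 at 64 kb/s -> 3.5:1
assert ratio(64, 6.4) == 10.0                 # G.723.1 -> 10:1
assert ratio(64, 16) == 4.0                   # G.728 -> 4:1
assert ratio(64, 8) == 8.0                    # G.729 -> 8:1
```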

290
Agenda
• Introduction

• Audio & Video compression principles

• A/V Compression standards

• Conclusion

294
Some Digital Audio Formats

| Format                 | Sampling Freq. (kHz) | # Channels | Capacity per Channel (kb/s) | Application             |
|------------------------|----------------------|------------|-----------------------------|-------------------------|
| PCM (G.711)            | 8                    | 1          | 64                          | Telephony               |
| ADPCM (G.721)          | 8                    | 1          | 32                          | Telephony               |
| SB-ADPCM (G.722)       | 16                   | 1          | 48/56/64                    | Videoconferencing       |
| MP-MLQ (G.723.1)       | 8                    | 1          | 6.3/5.3 variable            | Internet telephony      |
| ADPCM (G.726)          | 8                    | 1          | 16/24/32/40                 | Telephony               |
| E-ADPCM (G.727)        | 8                    | 1          | 16/24/32/40                 | Telephony               |
| LD-CELP (G.728)        | 8                    | 1          | 16                          | Telephony/videoconf.    |
| CS-ACELP (G.729)       | 8                    | 1          | 8                           | Internet telephony      |
| RPE-LTP (GSM 06.10)    | 8                    | 1          | 13.2                        | GSM telephony           |
| CELP (FS 1016)         | 8                    | 1          | 4.8                         |                         |
| LPC-10E (FS 1015)      | 8                    | 1          | 2.4                         |                         |
| CD-DA / DAT            | 44.1/48              | 2          | 705.6/768                   | Hi-Fi audio             |
| MPEG-1 Layer I         | 32/44.1/48           | 2          | 192-256 variable            |                         |
| MPEG-1 Layer II        | 32/44.1/48           | 2          | 96-128 variable             |                         |
| MPEG-1 Layer III (MP3) | 32/44.1/48           | 2          | 64 variable                 | Hi-Fi Internet          |
| MPEG-2 AAC             | 32/44.1/48           | 5.1        | 32-44 variable              | Hi-Fi Internet          |

Low delay: the speech coders (G.7xx, GSM, FS). High delay: the MPEG audio family.
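The uncompressed capacities per channel in the table are just sampling frequency times bits per sample:

```python
# Capacity per channel for uncompressed (PCM) audio = f_s x bits per sample.
def pcm_kbps(sample_rate_khz, bits_per_sample):
    return sample_rate_khz * bits_per_sample

assert pcm_kbps(8, 8) == 64                    # G.711 telephony
assert round(pcm_kbps(44.1, 16), 1) == 705.6   # CD-DA, per channel
assert pcm_kbps(48, 16) == 768                 # DAT, per channel
```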

295
Digital Video Formats

| Video Format | Y Size | Color Sampling | Frame Rate (Hz) | Raw Data Rate (Mbps) |

HDTV: over air, cable, satellite; MPEG-2 video, 20-45 Mbps
| SMPTE 296M | 1280x720  | 4:2:0 | 24P/30P/60P | 265/332/664 |
| SMPTE 295M | 1920x1080 | 4:2:0 | 24P/30P/60I | 597/746/746 |

Video production: MPEG-2, 15-50 Mbps
| BT.601 | 720x480/576 | 4:4:4 | 60I/50I | 249 |
| BT.601 | 720x480/576 | 4:2:2 | 60I/50I | 166 |

High-quality video distribution (DVD, SDTV): MPEG-2, 4-10 Mbps
| BT.601 | 720x480/576 | 4:2:0 | 60I/50I | 124 |

Intermediate-quality video distribution (VCD, WWW): MPEG-1, 1.5 Mbps
| SIF | 352x240/288 | 4:2:0 | 30P/25P | 30 |

Videoconferencing over ISDN/Internet: H.261/H.263, 128-384 kbps
| CIF | 352x288 | 4:2:0 | 30P | 37 |

Video telephony over wired/wireless modem: H.263, 20-64 kbps
| QCIF | 176x144 | 4:2:0 | 30P | 9.1 |
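The raw data rates above can be reproduced with a small sketch: rate = width x height x bits per pixel x frame rate, where 8-bit sampling gives 24 bpp for 4:4:4, 16 bpp for 4:2:2 and 12 bpp for 4:2:0.

```python
# Raw bit rate of uncompressed video (8-bit samples).
BITS_PER_PIXEL = {"4:4:4": 24, "4:2:2": 16, "4:2:0": 12}

def raw_mbps(width, height, sampling, fps):
    return width * height * BITS_PER_PIXEL[sampling] * fps / 1e6

assert round(raw_mbps(720, 576, "4:4:4", 25)) == 249     # BT.601, 50 Hz
assert round(raw_mbps(720, 576, "4:2:2", 25)) == 166     # BT.601, 50 Hz
assert round(raw_mbps(176, 144, "4:2:0", 30), 1) == 9.1  # QCIF
# CIF at 30P gives about 36.5 Mbps, rounded to 37 in the table.
```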

296
Compressed video standard resolutions

| Format     | SQCIF    | QCIF     | CIF           | 4CIF or SCIF      | 16CIF (4:3)           | 16CIF (16:9) |
|------------|----------|----------|---------------|-------------------|-----------------------|--------------|
| Resolution | 128x96   | 176x144  | 352x288       | 704x576 / 720x576 | 1408x1152 / 1440x1152 | 1920x1152    |
| H.261      |          | Standard | Optional      |                   |                       |              |
| H.263      | Standard | Standard | Optional      | Optional          | Optional              |              |
| MPEG-1     |          |          | Typical (SIF) |                   |                       |              |
| MPEG-4     |          |          | Typical       |                   |                       |              |
| MPEG-2     |          |          | Low level     | Main level        | High-1440 level       | High level   |

297
Video compression formats

| System         | Spatial Compression (DCT) | Temporal Compression | Complexity | Efficiency | Delay       |
|----------------|---------------------------|----------------------|------------|------------|-------------|
| M-JPEG         | Yes                       | No                   | Medium     | Low        | Very small  |
| H.261          | Yes                       | Limited (I & P)      | High       | Medium     | Small       |
| MPEG-1/2       | Yes                       | Extended (I, P & B)  | Very high  | Large      | High        |
| H.263 / MPEG-4 | Yes                       | Extended (I, P & B)  | Enormous   | Large      | Medium-high |

298
Video compression formats: bit rates

| Standard/Format | Typical Bandwidth | Compression Ratio | Delay      |
|-----------------|-------------------|-------------------|------------|
| CCIR 601        | 170 Mbps          | 1:1 (reference)   |            |
| M-JPEG          | 10-20 Mbps        | 7-27:1            | Low delay  |
| H.261           | 64-2000 kbps      | 24:1              | Low delay  |
| H.263           | 28.8-768 kbps     | 50:1              | Low delay  |
| MPEG-1          | 0.4-2.0 Mbps      | 100:1             | High delay |
| MPEG-2          | 1.5-60 Mbps       | 30-100:1          | High delay |
| MPEG-4          | 28.8-500 kbps     | 100-200:1         | High delay |
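A compression ratio is simply raw rate over compressed rate; a sketch taking CCIR 601 as the reference, as the table does:

```python
# Compression ratio relative to the CCIR 601 reference (~170 Mbps raw).
CCIR601_KBPS = 170_000

def compression_ratio(compressed_kbps):
    return CCIR601_KBPS / compressed_kbps

assert round(compression_ratio(1_700)) == 100   # MPEG-1 at ~1.7 Mbps -> 100:1
assert 7 <= compression_ratio(20_000) <= 27     # M-JPEG at 20 Mbps -> 8.5:1
```

Note that the H.261/H.263 ratios on the slide are likely computed against their own CIF/QCIF raw rates rather than the full CCIR 601 rate.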

299
Video compression formats

| Type            | Method | Format                              | Original  | Compressed   |
|-----------------|--------|-------------------------------------|-----------|--------------|
| Videoconference | H.261  | 176x144 or 352x288 @ 10-30 frames/s | 2-36 Mbps | 64-1544 kb/s |
| Full motion     | MPEG-2 | 720x480 @ 30 frames/s               | 249 Mbps  | 2-6 Mbps     |
| HDTV            | MPEG-2 | 1920x1080 @ 30 frames/s             | 1.6 Gbps  | 19-38 Mbps   |

300
Agenda
• Introduction

• Audio & Video compression principles

• A/V Compression standards

• Conclusion

303
References
• Yun Q. Shi, Huifang Sun, 2008. Image and Video Compression for Multimedia Engineering:
Fundamentals, Algorithms, and Standards. CRC Press.
• Gonzalez and Woods, 2008. Digital Image Processing. Prentice Hall.
• Jae-Beom Lee, Hari Kalva, 2008. The VC-1 and H.264 Video Compression Standards for
Broadband Video Services. Springer.
• H.R. Wu and K.R. Rao, 2006. Digital Video Image Quality and Perceptual Coding. Taylor & Francis
Group, LLC.
• Khalid Sayood, 2005. Introduction to Data Compression. Morgan Kaufmann Publishers.
• I.E.G. Richardson, 2003. H.264 and MPEG-4 Video Compression: Video Coding for Next-Generation
Multimedia. John Wiley & Sons, Ltd.
• Richardson, 2002. Video Codec Design. John Wiley & Sons.
• John Watkinson, 2001. The MPEG Handbook: MPEG-1, MPEG-2, MPEG-4. Focal Press.
• Ghanbari, 1999. Video Coding: An Introduction to Standard Codecs. IEE Press.
• Riley and Richardson, 1997. Digital Video Communications. Artech House.
• Bhaskaran and Konstantinides, 1996. Image and Video Compression Standards: Algorithms and
Architectures. Kluwer Academic Publishers.
• Netravali, A.N. and Haskell, B.G., 1995. Digital Pictures: Representation, Compression and
Standards. 2nd Edition, Plenum Press.

304
References
• www.chiariglione.org/mpeg/
• http://www.mpeg.org
• http://jura1.eng.rgu.ac.uk/ (Digital Video pages)
• http://www.vcodex.com

305