Ban Dich MultiMedia
Multimedia Technology
9/14/2006
Overview
Introduction
Chapter 1: Background of compression techniques
Chapter 2: Multimedia technologies
JPEG
MPEG-1/MPEG-2 Audio & Video
MPEG-4
MPEG-7 (brief introduction)
HDTV (brief introduction)
H261/H263 (brief introduction)
Model-based coding (MBC) (brief introduction)
Introduction
On PCs:
Music and video are freely available on the Internet (mp2, mp3, mp4, asf, mpeg, mov, ra, ram, mid, DivX, etc.)
Video/audio conferences
Telemedicine
CD/VCD/DVD/MP3 players
Introduction (2)
Multimedia network
The Internet was designed in the 1960s for low-speed internetworks with boring textual applications, hence its high delay and high jitter.
Multimedia applications require drastic modifications of the Internet infrastructure.
Many frameworks have been investigated and deployed to support the next-generation multimedia Internet (e.g. IntServ, DiffServ).
In the future, all TVs (and PCs) will be connected to the Internet and freely tuned to any of millions of broadcast stations all over the world.
At present, multimedia networks run over ATM (almost obsolete) and IPv4; in the future, IPv6 should guarantee QoS (Quality of Service).
Why compression?
Compression factor or compression ratio
2 types of compression:
Lossless compression
Lossy compression
Nguyen Chan Hung Hanoi University of Technology
Compression is the removal of redundancy
Information rate
Entropy is the measure of information content.
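As a hedged illustration of entropy as a measure of information content, the sketch below assumes the usual Shannon definition H = -sum(p * log2(p)), in bits per symbol. The function names `entropy` and `entropy_of` are hypothetical and chosen only for this example.

```python
import math
from collections import Counter

def entropy(probabilities):
    """Shannon entropy in bits per symbol; zero-probability symbols contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

def entropy_of(symbols):
    """Estimate the entropy of a message from its observed symbol frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return entropy(c / total for c in counts.values())

# Eight equally likely values need 3 bits each; a skewed source needs fewer.
uniform_bits = entropy([1 / 8] * 8)
skewed_bits = entropy([0.5, 0.25, 0.25])
```

The gap between the fixed-length cost and the entropy is the redundancy that a lossless coder can remove.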
Entropy (supplement 2)
Lossless Compression
2.4. Lossy compression
Process of Compression
Why sampling?
PCM
Quantization
2.7. Predictive coding:
Prediction:
In predictive coding, rather than coding the data directly, the coded data consists of a difference signal formed by subtracting a prediction of the data from the data itself.
The prediction for the current sample is usually formed from past data. A predictive encoder and decoder are shown in the figure, with the difference signal denoted d. If the internal loop states are initialized to the same values at the beginning of the signal, then y = x.
If the predictor is ideal at removing redundancy, the difference signal contains only the new information at each time instant that is unrelated to previous data. This new information is sometimes referred to as the innovation, and d is called the innovations process. If predictive coding is used, an appropriate predictor must be determined.
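A minimal sketch of the predictive encoder/decoder loop, assuming the simplest possible predictor (the previous sample) and no quantization of the difference signal. The names `dpcm_encode`, `dpcm_decode`, and the `initial` loop state are hypothetical; practical coders use better predictors and quantize d.

```python
def dpcm_encode(samples, initial=0):
    """Return the difference signal d: each sample minus its prediction."""
    prediction = initial
    differences = []
    for x in samples:
        differences.append(x - prediction)
        prediction = x  # previous-sample predictor (assumption of this sketch)
    return differences

def dpcm_decode(differences, initial=0):
    """Rebuild the signal by adding each difference to the running prediction."""
    prediction = initial
    output = []
    for d in differences:
        y = prediction + d
        output.append(y)
        prediction = y
    return output

x = [100, 102, 105, 105, 103]
d = dpcm_encode(x)
y = dpcm_decode(d)
```

Because both loops start from the same `initial` state, the decoder output y equals the input x, while the difference values after the first are small and cheaper to code.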
Predictive coding
Prediction
Drawbacks of compression
Prone to data errors
Artifacts:
2.10. A coding example: clustering of color pixels.
In an image, pixel values cluster around a few peaks.
Each cluster represents the color region of one object in the image (e.g. blue sky).
Coding process:
Frame-Differential Coding
2.12. Motion-compensated prediction
Actions:
1. Compute motion vector
2. Shift data from picture N using the vector to make predicted picture N+1
3. Compare actual picture with predicted picture
4. Send vector and prediction error
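The four actions above can be sketched for a single block, assuming exhaustive block matching with a sum-of-absolute-differences cost. The function names, the search range, and the tiny test pictures are all hypothetical; real encoders use faster search strategies.

```python
def find_motion_vector(prev, cur, top, left, size, search):
    """Action 1: find the (dy, dx) offset into prev that best matches the cur block."""
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if 0 <= y <= len(prev) - size and 0 <= x <= len(prev[0]) - size:
                cost = sum(abs(cur[top + i][left + j] - prev[y + i][x + j])
                           for i in range(size) for j in range(size))
                if best is None or cost < best[0]:
                    best = (cost, dy, dx)
    return best[1], best[2]

def prediction_error(prev, cur, top, left, size, vector):
    """Actions 2 and 3: shift data from prev by the vector and subtract it from cur."""
    dy, dx = vector
    return [[cur[top + i][left + j] - prev[top + dy + i][left + dx + j]
             for j in range(size)] for i in range(size)]

# Picture N holds a 2x2 object at row 1, column 1; in picture N+1 it moved to row 2, column 3.
prev = [[0] * 6 for _ in range(6)]
cur = [[0] * 6 for _ in range(6)]
for i, j, value in [(0, 0, 10), (0, 1, 20), (1, 0, 30), (1, 1, 40)]:
    prev[1 + i][1 + j] = value
    cur[2 + i][3 + j] = value

vector = find_motion_vector(prev, cur, top=2, left=3, size=2, search=2)
error = prediction_error(prev, cur, 2, 3, 2, vector)
# Action 4: only `vector` and `error` would be sent.
```

When the match is exact, the prediction error is all zeros and only the vector carries information.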
Unpredictable Information
Unpredictable information from the previous
frame:
Scene change
Uncovered information
In MPEG:
Transform Coding
Types of transform coding for images:
Discrete Fourier (DFT)
Karhunen-Loeve
Walsh-Hadamard
Lapped orthogonal
Discrete cosine (DCT) -> used in MPEG-2
Wavelet -> new
Differences between the transform coding methods:
Discard some DCT coefficients
Adjust the coarseness of the coefficient quantization -> the better approach.
Masking
Variable quantization
2.16. Run-level coding
Example (e.g. a herdsman counting bulls and cows)
Run-Level coding
Run-level coding (supplement)
Let an event represent the pair (run, level), where run represents the
number of zeros and level represents the magnitude of the
nonzero coefficient.
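A minimal sketch of turning a scanned coefficient sequence into (run, level) events. As an assumption of this sketch, the level keeps its sign rather than being stored as a magnitude plus a separate sign bit, and trailing zeros are replaced by a hypothetical "EOB" marker.

```python
def run_level_events(coefficients):
    """Convert scanned coefficients into (run, level) events followed by an end-of-block marker."""
    events = []
    run = 0
    for value in coefficients:
        if value == 0:
            run += 1  # count zeros preceding the next nonzero coefficient
        else:
            events.append((run, value))
            run = 0
    events.append("EOB")  # any trailing zeros are implied by the marker
    return events

events = run_level_events([4, 4, 2, 0, 0, 1, 0, 0, 0])
```

Each event would then be looked up in a variable-length code table.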
Sample VLC table
Motion-compensated prediction (MOTION ESTIMATION)
Transform coding (DISCRETE COSINE TRANSFORM - DCT)
Quantization of the transform coefficients (QUANTIZATION)
ZIG ZAG SCAN
RUN LEVEL CODING (RLC)
Statistical coding - Huffman (VARIABLE LENGTH CODING - VLC)
[Figure: coding techniques grouped under lossless compression and lossy compression; labels: transform coding, VLC (Huffman), RLC, quantization of transform coefficients, predictive coding]
Key points:
Compression process
Quantization & Sampling
Coding:
Masking
Huffman coding (supplement): sample exercise
As a simple example of the use of Huffman codes for images, consider an image in which the pixels (or the difference values) can have one of 8 brightness values.
This would require 3 bits per pixel (2^3 = 8) for conventional representation. From a histogram of the image, the frequency of occurrence of each value can be determined and, as an example, might show the following results (Table 1), in which the brightness values have been ranked in order of frequency. Huffman coding provides a straightforward way to assign codes from this frequency table, and the code values for this example are shown.
Note that each code is unique and no code is the prefix of another, so no sequence of codes can be mistaken for any other value; this is a characteristic of this type of coding.
Table 1. Example of Huffman codes assigned to brightness values
Brightness value   Frequency   Huffman code
4                  0.45        1
5                  0.21        01
3                  0.12        0011
6                  0.09        0010
2                  0.06        0001
7                  0.04        00001
1                  0.02        000000
0                  0.01        000001
Notice that the most commonly found pixel brightness value requires only a single bit, but some of the less common values require 5 or 6 bits, more than the three that a simple representation would need. Multiplying the frequency of occurrence of each value by the length of its code gives an overall average of
$0.45 \cdot 1 + 0.21 \cdot 2 + 0.12 \cdot 4 + 0.09 \cdot 4 + 0.06 \cdot 4 + 0.04 \cdot 5 + 0.02 \cdot 6 + 0.01 \cdot 6 = 2.33$ bits/pixel
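The average code length can be checked numerically. The sketch below recomputes it from Table 1 and, as a cross-check, builds Huffman code lengths with a standard heap-based merge. The helper name `huffman_code_lengths` is hypothetical, and ties may be broken differently from the table, so individual lengths can differ while the average stays optimal.

```python
import heapq
from itertools import count

# Table 1: brightness value -> (frequency, Huffman code)
table = {4: (0.45, "1"), 5: (0.21, "01"), 3: (0.12, "0011"), 6: (0.09, "0010"),
         2: (0.06, "0001"), 7: (0.04, "00001"), 1: (0.02, "000000"), 0: (0.01, "000001")}

# Average length of the codes listed in the table, in bits per pixel.
average_from_table = sum(p * len(code) for p, code in table.values())

def huffman_code_lengths(probabilities):
    """Return {symbol: code length} by repeatedly merging the two least likely groups."""
    tiebreak = count()  # keeps heap entries comparable when probabilities are equal
    heap = [(p, next(tiebreak), [symbol]) for symbol, p in probabilities.items()]
    heapq.heapify(heap)
    lengths = {symbol: 0 for symbol in probabilities}
    while len(heap) > 1:
        p1, _, group1 = heapq.heappop(heap)
        p2, _, group2 = heapq.heappop(heap)
        for symbol in group1 + group2:
            lengths[symbol] += 1  # every merge adds one bit to each member's code
        heapq.heappush(heap, (p1 + p2, next(tiebreak), group1 + group2))
    return lengths

probabilities = {value: p for value, (p, _) in table.items()}
lengths = huffman_code_lengths(probabilities)
average_from_huffman = sum(probabilities[v] * lengths[v] for v in probabilities)
```

Both averages come to 2.33 bits/pixel, compared with 3 bits/pixel for the fixed-length representation.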
Chapter 1 exercises
Exercise 3:
[Exercise data: 128 binary pixel values, all 0 except seven entries equal to 1, apparently two 8x8 images (left image and right image)]
Exercise 3 (solution)
Left image:
Right image:
Roadmap
JPEG
MPEG-1/MPEG-2 Video
MPEG-1 Layer 3 Audio (mp3)
MPEG-4
MPEG-7 (brief introduction)
HDTV (brief introduction)
H261/H263 (brief introduction)
Model-based coding (MBC) (brief introduction)
JPEG encoder
JPEG decoder
JPEG - DCT
Input image A:
Output image B:
The quantization steps are:
Small at the upper left (low frequencies)
Large at the lower right (high frequencies)
A quantization step of 1 is the most precise
The quantizer divides each DCT coefficient by its corresponding quantization step, then rounds to the nearest integer
Large quantization steps reduce small coefficients to 0
The result is:
Many high-frequency coefficients become zero -> easy to discard
Low-frequency coefficients undergo only small adjustments.
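The quantization rule (divide each DCT coefficient by its step size, then round to the nearest integer) can be sketched as follows. The 2x2 coefficient and step values are hypothetical and chosen only to show the effect; they are not taken from any standard table.

```python
def round_half_away(x):
    """Round to the nearest integer, with halves rounded away from zero."""
    return int(x + 0.5) if x >= 0 else -int(-x + 0.5)

def quantize(coefficients, steps):
    """Divide each coefficient by its step size and round."""
    return [[round_half_away(c / q) for c, q in zip(c_row, q_row)]
            for c_row, q_row in zip(coefficients, steps)]

def dequantize(levels, steps):
    """Decoder side: multiply each level by its step size."""
    return [[level * q for level, q in zip(l_row, q_row)]
            for l_row, q_row in zip(levels, steps)]

coefficients = [[620, 12], [-31, 5]]   # hypothetical DCT coefficients
steps = [[8, 16], [16, 32]]            # small step at upper left, large at lower right
levels = quantize(coefficients, steps)
restored = dequantize(levels, steps)
```

The low-frequency coefficient changes only slightly (620 becomes 624), while the small high-frequency coefficient is reduced to zero.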
[Figure: 8x8 block of DCT coefficients and the corresponding quantization result]
Zigzag scan result: 78 -1 1 -4 -5 4 4 6 3 2 -1 -3 -5 -4 -1 0 -1 0 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 EOB
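A minimal sketch of the zigzag scan, assuming the conventional order that starts at the upper-left coefficient and walks the anti-diagonals in alternating directions. The function names and the 4x4 test block are hypothetical; JPEG and MPEG apply the same idea to 8x8 blocks.

```python
def zigzag_order(n=8):
    """Return the (row, column) visiting order of a zigzag scan over an n x n block."""
    def key(position):
        row, col = position
        diagonal = row + col
        # Odd diagonals run top to bottom, even diagonals run bottom to top.
        return (diagonal, row if diagonal % 2 else -row)
    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def zigzag_scan(block):
    """Read a square block in zigzag order, low frequencies first."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

# Entries are numbered by their expected position in the scan.
block = [[1, 2, 6, 7],
         [3, 5, 8, 13],
         [4, 9, 12, 14],
         [10, 11, 15, 16]]
scanned = zigzag_scan(block)
```

Scanning this way groups the high-frequency zeros at the end of the sequence, which is what makes the long zero run before EOB possible.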
MPEG standards:
Digital TV receivers
HDTV decoders
DVD players
Video conferencing
Internet video, etc.
MPEG standards
MPEG-1 (obsolete)
MPEG layers
Pixel and block:
Pixel = picture element
Block = 8 x 8 array of pixels.
A block is the fundamental unit for DCT (discrete cosine transform) coding.
Macroblock
Slice
Picture
I, P, B Pictures
Encoded pictures are classified into 3 types: I, P, and B.
I Pictures = Intra Coded Pictures
Group of Pictures (GOP)
[Figure: pictures of a GOP arranged along the time axis]
Sequence:
Picture size
Aspect ratio
Frame rate and bit rate
Optional quantization matrices
Required decoder buffer size
Chroma structure (chroma pixel)
Optional user data
These data blocks need header information to identify the start of the
packets and must include time stamps because the packetizing process
disrupts the time axis.
Video Elementary Stream (video ES), consists of all the video data for a
sequence, including the sequence header and all the subparts of a sequence.
An ES carries only one type of data (video or audio) from a single video or
audio encoder.
PES packets have variable length, not corresponding to the fixed packet
length of transport packets, and may be much longer than a transport packet.
The figure shows that one video PES and a number of audio
PES can be combined to form a Program Stream, provided
that all of the coders are locked to a common clock.
Time stamps in each PES ensure lip-sync between the
video and audio.
Video filter
Discrete cosine transform (DCT)
DCT coefficient quantizer
Run-length amplitude/variable length coder (VLC)
Video Filter
[Figure: chroma sampling patterns (legend: Cb and Cr samples) for the 4:4:4, 4:2:2, 4:1:1 and 4:2:0 formats]
Format              Block order in macroblock   Application
4:2:0 (6 blocks)    YYYYCbCr                    TV and consumer entertainment equipment
4:2:2 (8 blocks)    YYYYCbCrCbCr                Studio production environments, professional editing equipment
4:4:4 (12 blocks)   YYYYCbCrCbCrCbCrCbCr        Computer graphics
4:2:0 chroma format
I, P, B pictures
Upper limits:
1152x1920, 60 Hz progressive scan
80 Mbit/s
MPEG encoder/decoder
Prediction:
Forward prediction from previous pictures
Backward prediction from later pictures
Or interpolated prediction
I P B Picture Reordering
Reordering is needed because of the bidirectional prediction of B pictures.
Example: a GOP 12 pictures long.
Source order (encoder input):
I(1) B(2) B(3) P(4) B(5) B(6) P(7) B(8) B(9) P(10) B(11) B(12) I(13)
Coded (transmission) order:
I(1) P(4) B(2) B(3) P(7) B(5) B(6) P(10) B(8) B(9) I(13) B(11) B(12)
Display order (decoder output):
I(1) B(2) B(3) P(4) B(5) B(6) P(7) B(8) B(9) P(10) B(11) B(12) I(13)
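The reordering shown above can be sketched with one rule: each B picture is held back until the next I or P picture (its backward reference) has been emitted. The function name and the tuple representation of pictures are hypothetical.

```python
def display_to_coded(pictures):
    """Reorder (type, number) pictures so every B follows the anchor picture it depends on."""
    coded = []
    pending_b = []
    for picture in pictures:
        if picture[0] == "B":
            pending_b.append(picture)   # wait for the next I or P picture
        else:
            coded.append(picture)       # emit the anchor first
            coded.extend(pending_b)     # then the B pictures that precede it in display order
            pending_b = []
    coded.extend(pending_b)
    return coded

source = [("I", 1), ("B", 2), ("B", 3), ("P", 4), ("B", 5), ("B", 6), ("P", 7),
          ("B", 8), ("B", 9), ("P", 10), ("B", 11), ("B", 12), ("I", 13)]
coded = display_to_coded(source)
coded_numbers = [number for _, number in coded]
```

The decoder reverses the process by restoring ascending picture numbers before display.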
DCT:
Eq 1 Normal form
Eq 2 Matrix form
IDCT:
Eq 3 Normal form
Eq 4 Matrix form
Where:
F(u,v) = two-dimensional NxN DCT.
u, v, x, y = 0, 1, 2, ..., N-1
x, y are spatial coordinates in the sample domain.
u, v are frequency coordinates in the transform domain.
C(u), C(v) = 1/sqrt(2) for u, v = 0.
C(u), C(v) = 1 otherwise.
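A minimal sketch of the 2-D DCT and IDCT, assuming the common orthonormal form F(u,v) = (2/N) C(u) C(v) sum_x sum_y f(x,y) cos[(2x+1)u*pi/2N] cos[(2y+1)v*pi/2N], which is consistent with the C(u), C(v) definitions above. The 2/N scaling and the function names are assumptions of this sketch; the direct double sum is written for clarity, not speed.

```python
import math

def c(k):
    """Normalization factor: 1/sqrt(2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0

def basis(a, k, n):
    """Cosine basis value for sample index a and frequency index k."""
    return math.cos((2 * a + 1) * k * math.pi / (2 * n))

def dct2(f):
    """Forward 2-D DCT of an N x N block f[x][y], returning F[u][v]."""
    n = len(f)
    return [[(2 / n) * c(u) * c(v) *
             sum(f[x][y] * basis(x, u, n) * basis(y, v, n)
                 for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]

def idct2(F):
    """Inverse 2-D DCT of an N x N coefficient block F[u][v], returning f[x][y]."""
    n = len(F)
    return [[(2 / n) *
             sum(c(u) * c(v) * F[u][v] * basis(x, u, n) * basis(y, v, n)
                 for u in range(n) for v in range(n))
             for y in range(n)] for x in range(n)]

flat = [[8] * 4 for _ in range(4)]   # a constant block has only a DC coefficient
F = dct2(flat)
ramp = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
restored = idct2(dct2(ramp))
```

With this scaling the transform is orthonormal, so the IDCT of the DCT returns the original block up to floating-point error.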
DCT vs DFT:
Quantization matrix
Note that the quantization matrix entries (the step sizes applied to the DCT coefficients) are:
Small in the upper left (low frequencies),
Large in the lower right (high frequencies).
Recall the JPEG mechanism.
Why?
The HVS is less sensitive to errors at high frequencies than at low frequencies.
The higher the frequency, the more coarsely it should be quantized.
DCT matrix result (example)
After adaptive quantization, the result is a matrix containing many zeros.
MPEG scanning
Huffman/Run-Level Coding
Illustration of Huffman/run-level coding
Taking the DCT output matrix from the previous slide, the zigzag scan yields the sequence:
4, 4, 2, 2, 2, 1, 1, 1, 1, 0 (12 zeros), 1, 0 (41 zeros)
These values are looked up in the table of variable-length codes:
Zero run-length   Amplitude      MPEG code value
N/A               8 (DC value)   110 1000
0                 4              0000 1100
0                 4              0000 1100
0                 2              0100 0
0                 2              0100 0
0                 2              0100 0
0                 1              110
0                 1              110
0                 1              110
0                 1              110
12                1              0010 0010 0
EOB               EOB            10
MPEG packages all data into fixed-size 188-byte packets for transport.
Video or audio payload data, previously placed in PES packets, is broken up into fixed-length transport packet payloads.
A PES packet may be much longer than a transport packet, so segmentation is required:
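A simplified sketch of segmenting one PES packet into 188-byte transport packets. It assumes a 4-byte header (sync byte 0x47, payload_unit_start_indicator, 13-bit PID, adaptation_field_control, 4-bit continuity counter) and pads the last packet with an adaptation field of 0xFF stuffing bytes. The function names and the PID value are hypothetical, and PCR, scrambling, and error flags are left out.

```python
PACKET_SIZE = 188
HEADER_SIZE = 4
MAX_PAYLOAD = PACKET_SIZE - HEADER_SIZE  # 184 bytes

def packetize(pes, pid):
    """Split PES bytes into fixed-size transport packets."""
    packets = []
    counter = 0
    for offset in range(0, len(pes), MAX_PAYLOAD):
        chunk = pes[offset:offset + MAX_PAYLOAD]
        start = 1 if offset == 0 else 0          # marks the packet where the PES packet begins
        stuffing = MAX_PAYLOAD - len(chunk)
        control = 0b01 if stuffing == 0 else 0b11  # payload only, or adaptation field + payload
        header = bytes([0x47,
                        (start << 6) | ((pid >> 8) & 0x1F),
                        pid & 0xFF,
                        (control << 4) | counter])
        adaptation = b""
        if stuffing:
            field_length = stuffing - 1          # bytes that follow the length byte
            adaptation = bytes([field_length])
            if field_length:
                adaptation += bytes([0x00]) + b"\xff" * (field_length - 1)  # flags, then stuffing
        packets.append(header + adaptation + chunk)
        counter = (counter + 1) & 0x0F
    return packets

def payload_of(packet):
    """Extract the payload bytes, skipping any adaptation field."""
    control = (packet[3] >> 4) & 0b11
    if control == 0b01:
        return packet[HEADER_SIZE:]
    return packet[HEADER_SIZE + 1 + packet[4]:]

pes = bytes(range(256)) + bytes(144)             # 400 bytes of hypothetical PES data
packets = packetize(pes, pid=0x100)
reassembled = b"".join(payload_of(p) for p in packets)
```

The 400-byte PES packet is carried in three transport packets (184 + 184 + 32 payload bytes), and concatenating the payloads restores the original data.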
MPEG transport packet:
Adaptation Field:
PCR_flag
OPCR_flag
splicing_point_flag
transport_private_data_flag
adaptation_field_extension_flag
The optional fields are present if indicated by one of the preceding flags.
The remainder of the adaptation field is filled with stuffing bytes (0xFF, all ones).
Video stream
Audio stream
Other data
The MPEG-2 transport stream is the packet format for downstream data communication on CATV networks.
Timing and buffer control:
Point A: encoder input, constant/specified rate
Point B: encoder output, variable rate
Point C: encoder buffer output, constant rate
Point D: communication channel + decoder buffer, constant rate
Point E: decoder input, variable rate
Point F: decoder output, constant/specified rate
Timing - Synchronization
A program component may even carry no time stamps, but it then cannot be synchronized with the other components.
At the encoder input (point A), the time of occurrence of an input video picture or audio block is marked by sampling the STC.
The total delay of the encoder and decoder buffers is added to the STC value, creating the Presentation Time Stamp (PTS).
The Decode Time Stamp (DTS), which can optionally be included in the bit stream, represents the time at which the data should be taken instantaneously from the decoder buffer and decoded.
DTS and PTS are identical except in the case of picture reordering for B pictures.
The DTS is only used where it is needed because of reordering. Whenever DTS is used, PTS is also coded.
PTS (or DTS) is inserted at intervals not exceeding 700 ms.
In ATSC, PTS (or DTS) must be inserted at the beginning of each coded picture (access unit).
System Clock Reference (SCR) in a Program Stream.
Program Clock Reference (PCR) in a Transport Stream.
All video and audio streams included in a program must get their time stamps from a common STC so that the video and audio decoders can be synchronized with each other.
The data rate and packet rate on the channel (at the multiplexer output) can be completely asynchronous with the System Time Clock (STC).
PCR time stamps allow synchronization of different multiplexed programs having different STCs, while allowing STC recovery for each program.
If there is no buffer underflow or overflow, the delays in the buffers and transmission channel for both video and audio are constant.
The encoder input and decoder output run at equal and constant rates.
The end-to-end delay from encoder input to decoder output is fixed.
If exact synchronization is not required, the decoder clock can be free-running; video frames can be repeated or skipped as necessary to prevent buffer underflow or overflow, respectively.
HDTV (2)
HDTV proposals are for a screen which is wider than the conventional TV image by about 33%. It is generally agreed that the HDTV aspect ratio will be 16:9, as opposed to the 4:3 ratio of conventional TV systems. This ratio has been chosen because psychological tests have shown that it best matches the human visual field.
It also enables use of existing cinema film formats as additional source material, since this is the same aspect ratio used in normal 35 mm film. Figure 16.6(a) shows how the aspect ratio of HDTV compares with that of conventional television, using the same resolution or the same surface area as the comparison metric.
To achieve the improved resolution, the video image used in HDTV must contain over 1000 lines, as opposed to the 525 and 625 provided by the existing NTSC and PAL systems. This gives a much improved vertical resolution. The exact value is chosen to be a simple multiple of one or both of the vertical resolutions used in conventional TV.
However, due to the higher scan rates, the bandwidth requirement for analogue HDTV is approximately 12 MHz, compared to the nominal 6 MHz of conventional TV.
HDTV (3)
H261-H263
H261-H263 (2)
H261-H263 (3)
At the very low bit rates (20 kbit/s or less) associated with video
telephony, the requirements for image transmission stretch the
compression techniques described earlier to their limits.
In order to achieve the necessary degree of compression they
often require reduction in spatial resolution or even the
elimination of frames from the sequence.
Model based coding (MBC) attempts to exploit a greater degree
of redundancy in images than current techniques, in order to
achieve significant image compression but without adversely
degrading the image content information.
It relies upon the fact that the image quality is largely subjective.
Providing that the appearance of scenes within an observed
image is kept at a visually acceptable level, it may not matter that
the observed image is not a precise reproduction of reality.
Key points:
MPEG compression mechanism:
Pixel, Block, Macroblock, Field DCT Coding / Frame DCT Coding, Slice, Picture, Group of Pictures (GOP), Sequence, Packetized Elementary Stream (PES)
Prediction
Motion compensation
Scanning
YCbCr formats (4:4:4, 4:2:0, etc.)
Profiles @ Level
I, P, B pictures & reordering
Encoder/decoder process & block diagram
MPEG data transport
Timing and buffer control
STC/SCR/DTS
PCR/PTS
Technical terms
Macroblock
HVS = Human Visual System
GOP = Group of Pictures
VLC = Variable Length Coding/Coder
IDCT/DCT = (Inverse) Discrete Cosine Transform
PES = Packetized Elementary Stream
MP@ML = Main Profile @ Main Level
PCR = Program Clock Reference
SCR = System Clock Reference
STC = System Time Clock
PTS = Presentation Time Stamp
DTS = Decode Time Stamp
PAT = Program Association Table
PMT = Program Map Table