
EE591f Digital Video Processing 1

Roadmap
Introduction to Block Matching Algorithm
Fast BMA
Three classes of speed-up strategies
Generalized BMA
From integer-pel to fractional-pel
From fixed block size to variable block size
Deformable BMA (DBMA) or mesh-based BMA
Experimental results
How do block size and motion accuracy affect the MCP
efficiency?
An Intuitive Way of Understanding
Block Matching Algorithm (BMA)
[Figure: a template block marked "?" is compared against database entries a through f]
Block Matching in Motion Estimation
[Figure: an inquiry block in the current frame is compared with candidate blocks a, b, c, d in the reference frame]
Candidate displacements:
a: (-3,-2)
b: (-3,-1)
c: (0,0)
d: (1,2)
Motion Estimation and Compensation
Motion Compensated Prediction (MCP) residues:
$e(i,j) = s_{k+1}(i,j) - s_k(i+d_1,\, j+d_2), \quad (i,j) \in B$
Motion Compensation
With the estimated motion vector, the block in the reference frame is
displaced to generate a prediction of the inquiry block in the current
frame. This procedure is called motion compensation.
$(d_1, d_2)$: estimated motion vector
[Figure: current frame and displaced reference frame]
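The residue computation can be made concrete with a small NumPy sketch. It assumes (a convention chosen here, not stated on the slide) that i indexes rows and j indexes columns, that frames are 2-D arrays, and that the displaced block stays inside the reference frame. The function name `mcp_residue` and the toy frames are hypothetical.

```python
import numpy as np

def mcp_residue(cur, ref, top, left, B, d1, d2):
    """Residue e(i,j) = cur(i,j) - ref(i+d1, j+d2) for the B-by-B block at (top, left)."""
    block = cur[top:top + B, left:left + B].astype(np.float64)
    pred = ref[top + d1:top + d1 + B, left + d2:left + d2 + B].astype(np.float64)
    return block - pred

# Toy frames: the current frame is a translated copy of the reference frame,
# so that cur[i, j] == ref[i + 2, j - 3] away from the borders.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(32, 32))
cur = np.roll(ref, shift=(-2, 3), axis=(0, 1))

e_true = mcp_residue(cur, ref, top=8, left=8, B=8, d1=2, d2=-3)  # correct vector
e_zero = mcp_residue(cur, ref, top=8, left=8, B=8, d1=0, d2=0)   # no compensation
print(e_true.var(), e_zero.var())
```

With the correct vector the residue block vanishes on this toy data, while the uncompensated difference keeps a large variance.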
Two Key Elements in BMA
Matching criterion: How do I measure the
similarity between two blocks?
Mean Square Error (MSE): $L_2$ norm
Mean Absolute Difference (MAD): $L_1$ norm
Search strategy: How do I find the best
match of the given block?
Exhaustive search: global minimum
Non-exhaustive search: close to global
minimum
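The two matching criteria can be sketched as follows. The function names `mse` and `mad` are chosen here for illustration, and the blocks are assumed to be equally sized NumPy arrays.

```python
import numpy as np

def mse(a, b):
    """Mean square error between two equally sized blocks (L2-type criterion)."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.mean(diff ** 2))

def mad(a, b):
    """Mean absolute difference between two equally sized blocks (L1-type criterion)."""
    diff = a.astype(np.float64) - b.astype(np.float64)
    return float(np.mean(np.abs(diff)))

a = np.array([[10, 20], [30, 40]], dtype=np.uint8)
b = np.array([[12, 20], [30, 30]], dtype=np.uint8)
print(mse(a, b), mad(a, b))  # 26.0 3.0
```

The cast to floating point matters: subtracting `uint8` arrays directly would wrap around instead of producing negative differences.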
Goal: Find the Best Tradeoff
[Figure: tradeoff between computational cost and variance of MCP residues]
Roadmap
Introduction to Block Matching Algorithm
Fast BMA
Three classes of speed-up strategies
Generalized BMA
From fixed block size to variable block size
From integer-pel to fractional-pel
Experimental results
How do block size and motion accuracy affect
the MCP efficiency?
Benchmark: Exhaustive Search
It searches $(2T+1)^2 = 225$ points in total (an example with window size T=7)
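A minimal sketch of the exhaustive search, assuming MAD as the matching criterion and skipping candidates that fall outside the reference frame. The names `full_search` and `mad`, the block size, and the toy frames are illustrative choices rather than anything fixed by the slides.

```python
import numpy as np

def mad(a, b):
    """Mean absolute difference between two equally sized blocks."""
    return float(np.mean(np.abs(a.astype(np.float64) - b.astype(np.float64))))

def full_search(cur, ref, top, left, B=8, T=7):
    """Test every displacement (d1, d2) with |d1|, |d2| <= T and keep the best one."""
    block = cur[top:top + B, left:left + B]
    H, W = ref.shape
    best_mv, best_cost, n_checked = (0, 0), float("inf"), 0
    for d1 in range(-T, T + 1):
        for d2 in range(-T, T + 1):
            r, c = top + d1, left + d2
            if r < 0 or c < 0 or r + B > H or c + B > W:
                continue  # candidate block falls outside the reference frame
            n_checked += 1
            cost = mad(block, ref[r:r + B, c:c + B])
            if cost < best_cost:
                best_mv, best_cost = (d1, d2), cost
    return best_mv, best_cost, n_checked

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(48, 48))
cur = np.roll(ref, shift=(-2, 3), axis=(0, 1))  # true motion vector is (2, -3)
mv, cost, n = full_search(cur, ref, top=16, left=16, B=8, T=7)
print(mv, cost, n)
```

For this interior block all 225 candidates are valid, which matches the count above, and the search recovers the true displacement with zero cost.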
Fast Block Matching Algorithms
Class-A (I-IV): ad-hoc speed-up strategies
Class-B (V-VII): advanced speed-up strategies (wise use of computational
resources to account for probabilities)
Class-C (VIII): hierarchical strategy
General principle: trade off complexity against performance
Fast BMA (I): 3-Step Search
Searches 9+8+8 = 25 points
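The 9+8+8 count corresponds to the commonly described three-step search: evaluate the center and its 8 neighbours at step size 4, move to the best point, then repeat with step sizes 2 and 1. The sketch below follows that description. The search takes a generic `cost(d1, d2)` callable, and the helper `make_block_cost`, the smooth test frame, and all parameter values are assumptions made for illustration.

```python
import numpy as np

def make_block_cost(cur, ref, top, left, B):
    """Return cost(d1, d2): MAD between the current block and a displaced reference block."""
    block = cur[top:top + B, left:left + B].astype(np.float64)
    H, W = ref.shape
    def cost(d1, d2):
        r, c = top + d1, left + d2
        if r < 0 or c < 0 or r + B > H or c + B > W:
            return float("inf")  # outside the frame: never selected
        return float(np.mean(np.abs(block - ref[r:r + B, c:c + B])))
    return cost

def three_step_search(cost, step=4):
    """Center plus 8 neighbours at step 4, then 8 new neighbours at steps 2 and 1."""
    center, best_cost, n_checked = (0, 0), cost(0, 0), 1
    while step >= 1:
        best = center
        for s1 in (-step, 0, step):
            for s2 in (-step, 0, step):
                if (s1, s2) == (0, 0):
                    continue  # the center has already been evaluated
                cand = (center[0] + s1, center[1] + s2)
                n_checked += 1
                c = cost(*cand)
                if c < best_cost:
                    best, best_cost = cand, c
        center = best
        step //= 2
    return center, best_cost, n_checked

# Synthetic, well-behaved error surface with its minimum at (2, -3).
mv, c, n = three_step_search(lambda d1, d2: abs(d1 - 2) + abs(d2 + 3))
print(mv, c, n)  # (2, -3) 0 25

# The same search driven by block costs on a smooth toy frame.
yy, xx = np.mgrid[0:48, 0:48]
ref = 255.0 * np.exp(-((yy - 24) ** 2 + (xx - 20) ** 2) / 200.0)
cur = np.roll(ref, shift=(-2, 3), axis=(0, 1))
block_cost = make_block_cost(cur, ref, top=16, left=16, B=8)
mv_blk, c_blk, n_blk = three_step_search(block_cost)
```

The search always evaluates 25 points. Whether it reaches the global minimum depends on how well behaved the error surface is, which is why the result is only guaranteed to be no worse than the zero-displacement cost.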
Fast BMA (II): Logarithmic Search
Searches at most 5+4+2+3+2+8 = 24 points
Fast BMA (III): Orthogonal Search
Searches at most 2(3+2+2+2+2+2) = 26 points
Fast BMA (IV): Cross Search
Searches at most 5+4+4+4 = 17 points
Why does probabilistic modeling of MV help?
[Figure: empirical pdf of motion vectors]
Fast BMA (V): New 3-Step Search
New 3-Step Search: Examples
Fast BMA (VI): 4-Step Search
Search the 9 checking points located on a 5-by-5 window. Is the point reaching the minimum distortion found at the center?
Y: go to the final 3-by-3 search
N: is it at the corner or not?
Y (corner): search 5 additional checking points
N (not a corner): search 3 additional checking points
Repeat the procedure in the dashed box, then perform the final 3-by-3 search
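The flowchart can be sketched as below, under my reading of it: up to three passes of the 9-point pattern on a 5-by-5 window, stopping early when the minimum sits at the center, followed by one 3-by-3 pass. The search takes a generic `cost(d1, d2)` callable. The cache used to count only newly evaluated points and the tie-break in favour of the center are implementation choices made here, not taken from the slide.

```python
def four_step_search(cost):
    cache = {}  # displacement -> cost, so that no point is evaluated twice

    def c(p):
        if p not in cache:
            cache[p] = cost(*p)
        return cache[p]

    def best_around(center, step):
        pts = [(center[0] + s1, center[1] + s2)
               for s1 in (-step, 0, step) for s2 in (-step, 0, step)]
        # On ties, prefer the center so the search stops instead of drifting.
        return min(pts, key=lambda p: (c(p), p != center))

    center = (0, 0)
    for _ in range(3):                  # steps 1-3: 9-point pattern on a 5-by-5 window
        best = best_around(center, 2)
        if best == center:
            break                       # minimum at the center: go to the final step
        center = best                   # a corner move adds 5 new points, an edge move 3
    center = best_around(center, 1)     # step 4: final 3-by-3 search
    return center, cache[center], len(cache)

# Synthetic error surfaces with a single minimum.
print(four_step_search(lambda d1, d2: abs(d1 - 2) + abs(d2 + 3)))  # ((2, -3), 0, 22)
print(four_step_search(lambda d1, d2: abs(d1) + abs(d2)))          # ((0, 0), 0, 17)
print(four_step_search(lambda d1, d2: abs(d1 - 7) + abs(d2 - 7)))  # ((7, 7), 0, 27)
```

On these surfaces the number of evaluated points ranges from 17 (no motion: 9+8) to 27 (three corner moves: 9+5+5+8), and the reachable displacement is at most 7 in each direction.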
4-Step Search: Examples
The Idea of Successive Refinement
Note that all the previous approaches to fast BMA only reduce the
number of search points
For each search point, we still need to calculate the matching
criterion for a B-by-B block
To further reduce the complexity, we might consider reducing the
cost of each matching as well
Multi-resolution Representation of Images
[Figure: multi-resolution representation by pyramid, with levels of size M-by-N, M/2-by-N/2, and M/4-by-N/4]
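A pyramid of this kind can be sketched in a few lines. The slides do not say which low-pass filter is used, so plain 2-by-2 averaging is assumed here; the names `downsample2` and `build_pyramid` are illustrative, and index 0 of the returned list holds the full-resolution image.

```python
import numpy as np

def downsample2(img):
    """Halve each dimension by averaging non-overlapping 2-by-2 neighbourhoods."""
    h, w = img.shape
    img = img[:h - h % 2, :w - w % 2].astype(np.float64)  # drop an odd last row/column
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def build_pyramid(img, levels=3):
    """Return [full resolution, 1/2 resolution, 1/4 resolution, ...]."""
    pyr = [img.astype(np.float64)]
    for _ in range(levels - 1):
        pyr.append(downsample2(pyr[-1]))
    return pyr

pyr = build_pyramid(np.arange(64 * 48).reshape(64, 48))
print([p.shape for p in pyr])  # [(64, 48), (32, 24), (16, 12)]
```

A B-by-B block at full resolution shrinks to B/2-by-B/2 and B/4-by-B/4 at the coarser levels, so each matching operation there costs a quarter and a sixteenth of the full-resolution one.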
Why does Hierarchical Strategy Help?
[Figure: pyramid levels Level-0, Level-1, and Level-2, with the ME result passed between adjacent levels]
Hierarchical Block Matching Algorithm (HBMA)
Example: Three-level HBMA
Fast BMA (VIII): Hierarchical Search
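A three-level hierarchical search can be sketched as follows, assuming a pyramid built by 2-by-2 averaging, a small full search at the coarsest level, and a refinement of radius 1 around the doubled vector at each finer level. The function names, search radii, and toy frames are assumptions for illustration, and the block is assumed to lie well inside the frame.

```python
import numpy as np

def downsample2(img):
    h, w = img.shape
    return img[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def local_search(cur, ref, top, left, B, center, R):
    """Full search of radius R around the predicted vector `center` (MAD criterion)."""
    block = cur[top:top + B, left:left + B]
    H, W = ref.shape
    best_mv, best_cost = center, float("inf")
    for d1 in range(center[0] - R, center[0] + R + 1):
        for d2 in range(center[1] - R, center[1] + R + 1):
            r, c = top + d1, left + d2
            if r < 0 or c < 0 or r + B > H or c + B > W:
                continue
            cost = float(np.mean(np.abs(block - ref[r:r + B, c:c + B])))
            if cost < best_cost:
                best_mv, best_cost = (d1, d2), cost
    return best_mv, best_cost

def hbma(cur, ref, top, left, B=16, levels=3, R_coarse=2, R_refine=1):
    cur_pyr, ref_pyr = [cur.astype(np.float64)], [ref.astype(np.float64)]
    for _ in range(levels - 1):
        cur_pyr.append(downsample2(cur_pyr[-1]))
        ref_pyr.append(downsample2(ref_pyr[-1]))
    mv = (0, 0)
    for lvl in range(levels - 1, -1, -1):        # coarsest level first
        s = 2 ** lvl                             # scale factor of this level
        R = R_coarse if lvl == levels - 1 else R_refine
        mv, cost = local_search(cur_pyr[lvl], ref_pyr[lvl],
                                top // s, left // s, B // s, mv, R)
        if lvl > 0:
            mv = (2 * mv[0], 2 * mv[1])          # propagate to the next finer level
    return mv, cost

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64))
cur = np.roll(ref, shift=(-4, 8), axis=(0, 1))   # true motion vector is (4, -8)
mv, cost = hbma(cur, ref, top=16, left=16)
print(mv, cost)
```

Here the search evaluates 25 + 9 + 9 = 43 candidates, most of them on reduced-size blocks. The shift in this toy example is a multiple of 4, so it survives the two averaging stages exactly; for general motion the coarse estimate is only approximate and the refinement steps have to correct it.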
Summary
Why do we care about fast BMA?
Driven by the application demands of video
coding
Can we go beyond BMA?
The block-based constraint is simple but not
appropriate for moving objects of arbitrary shape
The integer-pel accuracy is not sufficient to
account for the continuous nature of motion
Roadmap
Introduction to Block Matching Algorithm
Fast BMA
Three classes of speed-up strategies
Generalized BMA
From integer-pel to fractional-pel
From fixed block size to variable block size
Deformable BMA (DBMA) or mesh-based BMA*
Experimental results
How do block size and motion accuracy affect
the MCP efficiency?
Why Do We Need Fractional-pel?
Fractional-pel BMA
[Figure: the original M-by-N reference frame is linearly interpolated into a 2M-by-2N interpolated reference frame]
Half-pel BMA
[Figure: sampling grids of the current frame and the reference frame; digits indicate physical distances]
Bilinear Interpolation
[Figure: input pixels (x,y), (x+1,y), (x,y+1), (x+1,y+1) and interpolated positions (2x,2y), (2x+1,2y), (2x,2y+1), (2x+1,2y+1)]
O[2x,2y]=I[x,y]
O[2x+1,2y]=(I[x,y]+I[x+1,y])/2
O[2x,2y+1]=(I[x,y]+I[x,y+1])/2
O[2x+1,2y+1]=(I[x,y]+I[x+1,y]+I[x,y+1]+I[x+1,y+1])/4
This generalizes to 1/K-pel accuracy for K > 2
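The four interpolation rules translate directly into array slicing. In this sketch the first array index plays the role of x and the second the role of y. Positions that would need samples beyond the last row or column are left out, so the output is (2M-1)-by-(2N-1) rather than 2M-by-2N; that border handling and the function name `upsample2_bilinear` are assumptions made here.

```python
import numpy as np

def upsample2_bilinear(I):
    """Interpolate I onto a half-pel grid following the four rules for O."""
    I = I.astype(np.float64)
    h, w = I.shape
    O = np.zeros((2 * h - 1, 2 * w - 1))
    O[0::2, 0::2] = I                                   # O[2x,2y]     = I[x,y]
    O[1::2, 0::2] = (I[:-1, :] + I[1:, :]) / 2          # O[2x+1,2y]
    O[0::2, 1::2] = (I[:, :-1] + I[:, 1:]) / 2          # O[2x,2y+1]
    O[1::2, 1::2] = (I[:-1, :-1] + I[1:, :-1]
                     + I[:-1, 1:] + I[1:, 1:]) / 4      # O[2x+1,2y+1]
    return O

I = np.array([[0, 2], [4, 6]])
print(upsample2_bilinear(I))
# [[0. 1. 2.]
#  [2. 3. 4.]
#  [4. 5. 6.]]
```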
Hierarchical Strategy for Half-pel BMA
[Figure: search stages labeled Integer-pel and Half-pel]
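One way to read this two-stage strategy: take the vector found by an integer-pel search, then test only that vector and its 8 half-pel neighbours on the interpolated reference frame. The sketch below covers the refinement stage and assumes the integer vector is already known. The function names, the bilinear upsampler, and the toy frames are illustrative assumptions.

```python
import numpy as np

def upsample2_bilinear(I):
    I = I.astype(np.float64)
    h, w = I.shape
    O = np.zeros((2 * h - 1, 2 * w - 1))
    O[0::2, 0::2] = I
    O[1::2, 0::2] = (I[:-1, :] + I[1:, :]) / 2
    O[0::2, 1::2] = (I[:, :-1] + I[:, 1:]) / 2
    O[1::2, 1::2] = (I[:-1, :-1] + I[1:, :-1] + I[:-1, 1:] + I[1:, 1:]) / 4
    return O

def half_pel_refine(cur, ref_up, top, left, B, int_mv):
    """Test the integer vector and its 8 half-pel neighbours; return the best in pel units."""
    block = cur[top:top + B, left:left + B].astype(np.float64)
    best_mv, best_cost = None, float("inf")
    for h1 in (-1, 0, 1):
        for h2 in (-1, 0, 1):
            m1, m2 = 2 * int_mv[0] + h1, 2 * int_mv[1] + h2   # vector in half-pel units
            r, c = 2 * top + m1, 2 * left + m2                # position on the 2x grid
            if (r < 0 or c < 0 or r + 2 * B - 1 > ref_up.shape[0]
                    or c + 2 * B - 1 > ref_up.shape[1]):
                continue                                      # outside the interpolated frame
            cand = ref_up[r:r + 2 * B:2, c:c + 2 * B:2]       # every other sample
            cost = float(np.mean(np.abs(block - cand)))
            if cost < best_cost:
                best_mv, best_cost = (m1 / 2, m2 / 2), cost
    return best_mv, best_cost

rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(16, 16))
cur = (ref[:-1, :] + ref[1:, :]) / 2     # frame displaced by half a pel along the first axis
ref_up = upsample2_bilinear(ref)
mv, cost = half_pel_refine(cur, ref_up, top=4, left=4, B=8, int_mv=(0, 0))
print(mv, cost)  # (0.5, 0.0) 0.0
```

The refinement adds only 8 candidates on top of the integer-pel search, instead of searching the whole window on the denser half-pel grid.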
Beyond Half-pel Accuracy
There exist results supporting a further
prediction efficiency gain from half-pel to
quarter-pel; sometimes it is even worthwhile to
reach 1/8-pel accuracy
The improved prediction efficiency comes at the cost
of modestly increased computational complexity
and overhead
Question: for what kind of video does finer accuracy
improve the MCP efficiency most?
Generalizations of BMA
Variable block-size matching algorithms
Widely used by various video coding standards
H.264 supports variable block sizes ranging from
16-by-16 down to 4-by-4
Fractional-pel accuracy BMA
Half-pel : MPEG-1/2/4, H.263/H.263+
Quarter-pel: H.264 (even 1/8-pel)
Tradeoff between overhead on motion and
MCP efficiency
Variable Block-size BMA
[Figure: block partitions of size 16-by-16, 8-by-8, and 4-by-4]
BMA Strategy Adopted by H.263
Macroblock level: 16-by-16
Block level: 8-by-8
BMA Strategy Adopted by H.264
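A 16-by-16 macroblock can be coded as one 16-by-16, two 8-by-16, two 16-by-8, or four 8-by-8 partitions
Each 8-by-8 partition can be kept as 8-by-8 or further split into 4-by-8, 8-by-4, or 4-by-4 blocks
Note: overhead is required to signal which partition the encoder adopts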
Deformable Block Matching Algorithm
Overview of DBMA
Three steps:
Partition the anchor frame into regular blocks
Model the motion in each block by a more
complex motion model
The 2-D motion caused by a flat surface patch
undergoing rigid 3-D motion can be approximated
well by projective mapping
Projective mapping can be approximated by affine
mapping and bilinear mapping
Estimate the motion parameters block by block
independently
The discontinuity problem across block boundaries still
remains
Affine and Bilinear Model
Affine (6 parameters):
Good for mapping triangles to triangles
$$\begin{bmatrix} d_x(x,y) \\ d_y(x,y) \end{bmatrix} = \begin{bmatrix} a_0 + a_1 x + a_2 y \\ b_0 + b_1 x + b_2 y \end{bmatrix}$$
Bilinear (8 parameters):
Good for mapping blocks to quadrangles
$$\begin{bmatrix} d_x(x,y) \\ d_y(x,y) \end{bmatrix} = \begin{bmatrix} a_0 + a_1 x + a_2 y + a_3 xy \\ b_0 + b_1 x + b_2 y + b_3 xy \end{bmatrix}$$
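The affine and bilinear models can be evaluated pixel by pixel once their parameters are known. The sketch below also shows why the affine model suits triangles: the displacements of three non-collinear nodes give exactly the six equations needed to solve for its six parameters. The function names and the example triangle are assumptions for illustration.

```python
import numpy as np

def affine_motion(x, y, a, b):
    """d_x = a0 + a1*x + a2*y,  d_y = b0 + b1*x + b2*y."""
    return a[0] + a[1] * x + a[2] * y, b[0] + b[1] * x + b[2] * y

def bilinear_motion(x, y, a, b):
    """Affine terms plus the cross terms a3*x*y and b3*x*y."""
    return (a[0] + a[1] * x + a[2] * y + a[3] * x * y,
            b[0] + b[1] * x + b[2] * y + b[3] * x * y)

def fit_affine(nodes, disp):
    """Solve for (a, b) from the displacements of three non-collinear nodes."""
    nodes = np.asarray(nodes, dtype=np.float64)
    disp = np.asarray(disp, dtype=np.float64)
    A = np.column_stack([np.ones(3), nodes[:, 0], nodes[:, 1]])  # rows: [1, x, y]
    a = np.linalg.solve(A, disp[:, 0])
    b = np.linalg.solve(A, disp[:, 1])
    return a, b

nodes = [(0, 0), (1, 0), (0, 1)]   # triangle vertices (x, y)
disp = [(1, 1), (2, 1), (1, 3)]    # their displacements (d_x, d_y)
a, b = fit_affine(nodes, disp)
print(a, b)                        # a = [1, 1, 0], b = [1, 0, 2]
print(affine_motion(1, 1, a, b))   # displacement at an arbitrary point
```

The bilinear model needs four nodes in the same way, one equation pair per corner of a quadrangle.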
Mesh-Based Motion Estimation
(a) Using a triangular mesh
(b) Using a quadrilateral mesh
A control grid is used to partition a
frame into non-overlapping polygon
elements. The nodal motion is
constrained so that a feasible mesh
is still formed with the motion.
Mesh-based vs Block-based
(a) block-based ME
(b) mesh-based ME
(c) mesh-based motion tracking
Example: BMA vs Mesh-based
[Figure: target frame, anchor frame, and predicted frames]
EBMA (half-pel): 29.86 dB
Mesh-based method: 29.72 dB
Roadmap
Introduction to Block Matching Algorithm
Fast BMA
Three classes of speed-up strategies
Generalized BMA
From fixed block size to variable block size
From integer-pel to fractional-pel
Experimental results
How do block size and motion accuracy affect
the MCP efficiency?
Experimental Results
[Figure: Frame #1 and Frame #2]
Motion-Compensated Prediction Residues
16-by-16 block, integer-pel: var(e)=271.8
8-by-8 block, integer-pel: var(e)=220.8
16-by-16 block, half-pel: var(e)=164.2
8-by-8 block, half-pel: var(e)=123.8
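The four reported variances can be compared against the 16-by-16 integer-pel baseline with a few lines of arithmetic. The variance values are copied from the slides; expressing the gain as a relative reduction and in dB is a choice made here for illustration.

```python
import math

var_e = {
    ("16-by-16", "integer-pel"): 271.8,
    ("8-by-8", "integer-pel"): 220.8,
    ("16-by-16", "half-pel"): 164.2,
    ("8-by-8", "half-pel"): 123.8,
}
baseline = var_e[("16-by-16", "integer-pel")]
summary = {}
for key, v in var_e.items():
    reduction = 1.0 - v / baseline              # fraction of residue variance removed
    gain_db = 10.0 * math.log10(baseline / v)   # same comparison on a dB scale
    summary[key] = (reduction, gain_db)
    print(f"{key[0]:>8}, {key[1]:<11}: var(e)={v:6.1f}  "
          f"reduction={reduction:6.1%}  gain={gain_db:4.2f} dB")
```

In this experiment, moving from integer-pel to half-pel at a fixed block size lowers the residue variance more than halving the block size at integer-pel accuracy, and combining both gives the lowest variance of the four settings.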
