IEEE Transactions on Consumer Electronics, Vol. 53, No. 2, MAY 2007
SDSP is the final best MV. Local search patterns such as DS may be trapped in a local minimum when the motion is large. To solve this problem, a global search is used, which has more search points and can cover the whole search window. The cross search (CS) shown in Fig. 1(b) is one kind of global search pattern: the search positions are arranged in a cross that can span the whole search window, and the center of the cross is the initial search point of ME. The cross search extends the search range so that the search process is less likely to be trapped in a local minimum.
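As a rough illustration of how a cross-shaped candidate set can be generated, the following Python sketch lists the positions on the two arms of a cross. The function name `cross_search_points`, the `search_range` parameter and the `step` spacing are assumptions made for this example; the spacing of the CS points is not stated here.

```python
def cross_search_points(center, search_range, step=2):
    """Return candidate MVs arranged in a cross centred on `center`.

    `search_range` and `step` are illustrative assumptions, not values
    taken from the paper.
    """
    cx, cy = center
    points = [(cx, cy)]  # the centre of the cross is the initial search point
    for d in range(step, search_range + 1, step):
        # one point on each of the four arms at distance d
        points += [(cx + d, cy), (cx - d, cy), (cx, cy + d), (cx, cy - d)]
    return points


candidates = cross_search_points((0, 0), search_range=4, step=2)
```

With these assumed parameters the cross reaches the edge of a window of range 4 while checking only nine positions.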
B. Fractional Motion Estimation
[Figure: grid of integer pixels (A to T), half pixels (aa to rr) and quarter pixels (a to h)]
Fig. 2. The FME of H.264
Fig. 2 illustrates the FME process of H.264, in which the reference pixels are obtained by interpolation. A horizontal half pixel such as gg is obtained with a 6-tap horizontal FIR filter:
$$\mathit{gg}' = E - 5F + 20G + 20H - 5I + J \tag{1}$$
$$\mathit{gg} = (\mathit{gg}' + 16) \gg 5 \tag{2}$$
A vertical half pixel such as kk is obtained with a 6-tap vertical FIR filter:
$$\mathit{kk}' = A - 5C + 20G + 20M - 5Q + S \tag{3}$$
$$\mathit{kk} = (\mathit{kk}' + 16) \gg 5 \tag{4}$$
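The half-pixel filters in (1) to (4) can be sketched in Python as follows. The helper names `six_tap` and `half_pixel` and the sample pixel values are assumptions made for illustration. H.264 additionally clips the result to the valid sample range, which the equations here do not show and the sketch omits.

```python
def six_tap(p):
    """Unrounded 6-tap FIR with taps (1, -5, 20, 20, -5, 1), as in (1) and (3)."""
    return p[0] - 5 * p[1] + 20 * p[2] + 20 * p[3] - 5 * p[4] + p[5]


def half_pixel(p):
    """Rounded half pixel, as in (2) and (4): add 16, then shift right by 5."""
    return (six_tap(p) + 16) >> 5


# Illustrative pixel values (not from the paper).
row = [10, 20, 30, 40, 50, 60]   # E, F, G, H, I, J
gg = half_pixel(row)             # horizontal half pixel gg
```

The same `half_pixel` call applied to the column A, C, G, M, Q, S gives kk, since the horizontal and vertical filters share their taps.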
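A horizontal-vertical half pixel such as ll is obtained with a 6-tap vertical FIR filter applied to the unrounded intermediate values:
$$\mathit{ll}' = \mathit{aa}' - 5\,\mathit{bb}' + 20\,\mathit{gg}' + 20\,\mathit{pp}' - 5\,\mathit{qq}' + \mathit{rr}' \tag{5}$$
$$\mathit{ll} = (\mathit{ll}' + 512) \gg 10 \tag{6}$$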
The intermediate values aa', bb', pp', qq' and rr' are obtained in the same manner as gg'.
The quarter pixels are obtained with bilinear filters, for example:
$$a = (\mathit{gg} + \mathit{kk} + 1) \gg 1 \tag{7}$$
$$b = (\mathit{gg} + \mathit{ll} + 1) \gg 1 \tag{8}$$
Detailed information on the interpolation of fractional pixels can be found in [1].
The search pattern of FME is also shown in Fig. 2. Suppose the best integer MV points to G. The eight half-pixel positions around G (cc, dd, ee, ff, gg, jj, kk and ll) are then searched. If the best of these is ll, the eight quarter-pixel positions around ll (a, b, c, d, e, f, g and h) are searched next, and the best quarter-pixel position gives the final best MV.
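The two-pass refinement described above might be sketched as follows. The names `refine` and `fme_search`, the use of quarter-pixel units for MVs, and the `cost` callable are assumptions for this example. The center of each pass is kept as a candidate, so each pass checks nine positions.

```python
def refine(center, step, cost):
    """Check the centre and its eight neighbours at distance `step`
    and return the position with the lowest cost."""
    cx, cy = center
    candidates = [(cx + dx * step, cy + dy * step)
                  for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
    return min(candidates, key=cost)


def fme_search(integer_mv, cost):
    """Half-pixel pass followed by quarter-pixel pass.

    MVs are expressed in quarter-pixel units here (an assumed convention),
    so a half-pixel step is 2 and a quarter-pixel step is 1.
    """
    start = (integer_mv[0] * 4, integer_mv[1] * 4)
    best_half = refine(start, 2, cost)
    return refine(best_half, 1, cost)


# Toy cost with its minimum at (7, -3) in quarter-pixel units.
toy_cost = lambda mv: abs(mv[0] - 7) + abs(mv[1] + 3)
best_mv = fme_search((2, -1), toy_cost)
```

In a real encoder `cost` would evaluate the motion cost of the interpolated reference block; the toy cost only exercises the control flow.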
C. Analysis
It can be observed that the search patterns of DS (for IME) and of FME are similar: first, 9 candidate positions (including the center point) are checked, and the best position is then used as the center of the next search pass. This similarity makes a reusable architecture possible.
In both IME and FME, the MV with the minimal Motion_Cost is selected as the best MV:
$$\mathit{Motion\_Cost} = \mathit{SAD} + \lambda \cdot \mathit{MVD\_Cost} \tag{9}$$
In (9), $\lambda$ is a constant parameter, MVD_Cost is the number of bits used to encode the MVD (motion vector difference), and SAD (sum of absolute differences) measures the difference between the current block and the reference block.
$$\mathit{SAD} = \sum_{j=1}^{M} \sum_{i=1}^{N} \left| O_{i,j} - R_{i,j} \right| \tag{10}$$
Here $O_{i,j}$ is a pixel of the original block and $R_{i,j}$ is the corresponding pixel in the reference block. For IME, $R_{i,j}$ is obtained directly from the reference frame; for FME, $R_{i,j}$ is obtained by interpolation. Once the fractional pixels have been interpolated, the computation of SAD is the same for IME and FME, so the hardware unit that calculates SAD can be reused for both.
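A minimal Python sketch of (9) and (10) is given below. The function names `sad` and `motion_cost`, the representation of blocks as lists of rows, and the numeric values are assumptions for illustration; how MVD_Cost itself is derived is not modelled.

```python
def sad(original, reference):
    """Sum of absolute differences between two equally sized blocks, as in (10)."""
    return sum(abs(o - r)
               for o_row, r_row in zip(original, reference)
               for o, r in zip(o_row, r_row))


def motion_cost(original, reference, mvd_cost, lam):
    """Motion cost as in (9): SAD plus lambda times the MVD bit cost."""
    return sad(original, reference) + lam * mvd_cost


# Illustrative 2x2 blocks (not from the paper).
org = [[10, 20], [30, 40]]
ref = [[12, 18], [30, 45]]
cost = motion_cost(org, ref, mvd_cost=6, lam=4)
```

Because `sad` only sees pixel values, the same routine serves integer and interpolated reference blocks, which mirrors the reuse argument made above.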
III. REUSABLE ARCHITECTURE
A. Proposed Architecture
[Figure: block diagram with fetch engine, data array, cost adders 0 to 8 and MV comparator; signals: memory fetch request, integer reference pixels, original MB pixels, MVD_Cost, best MV]
Fig. 3. The block diagram of the proposed architecture
The proposed architecture, shown in Fig. 3, consists of four parts: the fetch engine, the data array, the cost adders and the MV comparator. It can process 9 search positions in parallel. The fetch engine fetches the required integer reference pixels and original pixels from memory. The data array provides the required integer or fractional reference pixels for the 9 search positions. In each cycle, the data array generates 9 rows of reference pixels. Since the largest block width in H.264 is 16, each row contains 16 reference pixels. Each row is fed into one cost adder.
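The data flow described above can be approximated by a small behavioural model in Python. This is only a software sketch under stated assumptions, not the hardware design: the function name `evaluate_nine`, the way MVD_Cost is preloaded into each accumulator, and the test data are all hypothetical.

```python
def evaluate_nine(original_rows, reference_rows, mvd_costs, lam):
    """Behavioural model of nine cost adders followed by an MV comparator.

    original_rows:  list of rows of the original block (one row per cycle)
    reference_rows: reference_rows[k][cycle] is the row fed to cost adder k
    mvd_costs:      MVD bit cost of each of the nine search positions
    """
    # Assumption: each accumulator starts from lambda * MVD_Cost.
    acc = [lam * c for c in mvd_costs]
    for cycle, org_row in enumerate(original_rows):
        for k in range(9):  # the nine adders work in parallel in hardware
            ref_row = reference_rows[k][cycle]
            acc[k] += sum(abs(o - r) for o, r in zip(org_row, ref_row))
    best = min(range(9), key=lambda k: acc[k])  # MV comparator
    return best, acc[best]


# Illustrative data: two rows of 16 pixels; position 4 matches exactly.
org_rows = [[50] * 16 for _ in range(2)]
ref_rows = [[[50 + abs(k - 4)] * 16 for _ in range(2)] for k in range(9)]
best_index, best_cost = evaluate_nine(org_rows, ref_rows, [0] * 9, lam=1)
```

The inner loop over `k` is sequential only because this is software; in the architecture each of the nine rows goes to its own cost adder in the same cycle.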