You are on page 1of 10

A Probability-Based Flow Analysis Using MV Information in Compressed Domain

N.W. Kim, T.Y. Kim, and J.S. Choi


Department of Image Engineering, Graduate School of Advanced Imaging Science, Multimedia, and Film, Chung-Ang University {mysope,kimty,jschoi}@imagelab.cau.ac.kr.

Abstract. In this paper, we propose a method that utilizes the motion vectors (MVs) in MPEG sequence as the motion depicter for representing video contents. We convert the MVs to a uniform MV set, independent of the frame type and the direction of prediction, and then make use of them as motion depicter in each frame. To obtain such uniform MV set, we proposed a new motion analysis method using Bi-directional Prediction-Independent Framework (BPIF). Our approach enables a frame-type independent representation that normalizes temporal features including frame type, MB encoding and MVs. Our approach is directly processed on the MPEG bitstream after VLC decoding. Experimental results show that our method has the good performance, the high validity, and the low time consumption.

1 Introduction
A huge volume of information requires the concise feature representations. The motion feature of them provides the easiest access to its temporal features, and is hence the key significance in video indexing and video retrieval. That is the reason that the motion-based video analysis has received large attention in video databases research [1, 2, 3]. In the area of motion-based video analysis, many researchers have followed methods of motion analysis such as optical flow popular in computer vision and block matching popular in image coding literature. Cherfaoui and Bertin [4] have suggested the shot motion classification scheme that computes global motion parameters for the camera following an optical flow computation scheme. An example of block matching based motion classification scheme is the work of Zhang et al. [5] that carry out motion classification by computing the direction of motion vectors and the point of convergence or divergence. Noting that using the motion vectors embedded in P and B frames can avoid expensive optic flow or block matching computation, several researchers have stated exploring shot motion characterization of MPEG clips [6][7]. Kobla [8] have proposed the use of flow information to analyze the independent frame-type pattern (i.e., the pattern of I, P, and B frames) in the MPEG stream. But, it has a problem that flow estimation is considered by only single-directional prediction in most frames. In this paper, we propose the motion analysis method that normalizes MVs in MPEG domain using Bi-directional Prediction-Independent Framework. The
R. Monroy et al. (Eds.): MICAI 2004, LNAI 2972, pp. 592601, 2004. Springer-Verlag Berlin Heidelberg 2004

A Probability-Based Flow Analysis Using MV Information in Compressed Domain

593

normalized MVs (N-MVs) enable accurate video analysis which is useful in contexts where motion has a rich meaning. We have a purpose in constructing an effective and comprehensive motion field by making up for the week point in current sparse motion field. From such process, the N-MVs by BPIF can be utilized as motion depicter to efficiently characterize motion feature of video, and as feature information for global camera motion estimation in each frame, video surveillance system, video indexing and video retrieval. The remainder of the paper is organized as follows. In the next section, we propose the motion analysis method using BPIF in compressed domain and establish the active boundary for the proposed one. In Section 3, experimental results will be demonstrated to corroborate the proposed method. Finally, we conclude this paper in Section 4.

2 Motion Analysis Using Bi-directional Prediction-Independent Framework


An MB in MPEG sequence can have zero, one, or two MVs depending on its frame type and its prediction direction [9]. Moreover, these MVs can be forward-predicted or backward-predicted with respect to reference frames which may or may not occur adjacent to the frame including the MB. We therefore require the more uniform set of MVs, independent of the frame type and the direction of prediction. Our approach in this paper involves representing each MV as the backward-predicted vector with respect to the next frame, independent of the frame type. 2.1 Motion Flow Estimation Let us consider two consecutive reference frames, Ri and Rj. Let the B frames between them be denoted by B1,,Bn, where n is the number of B frames between two reference frames (typically, n=2). From the mutual relation among frames, we can represent each MV as a backward-predicted vector with respect to the next frame, independent of the frame type [8][10]. Roughly, this algorithm consists of following two steps. Motion Flow for Ri and Bn Frame. We derive the flow between the first reference frame Ri and its next frame B1 using the forward-predicted MVs of B1. Then, we can derive the flow, Bn, using the backward-predicted MVs between the second reference frame Rj and the previous frame Bn. Intuitively, if the MB in the B1 frame, B1(u,v), is displaced by the MV (x, y) with respect to the MB in the Ri frame, Ri(u,v), then it is logical to conjecture that the latter MB is displaced by the MV (-x, -y) with respect to B1(u,v), where u and v denote the indices of the current MB in the array of MBs. Therefore, the flow for the Ri frame is obtained by the MV with respect to the MB in the B1 frame. Also, the motion flow in the Bn frame is estimated by using the similar method as described above [8]. But, these algorithms have the problem that the number of extracted N-MVs is not enough to represent each frame. The motion flow for the Ri frame uses only forward-

594

N.W. Kim, T.Y. Kim, and J.S. Choi

predicted vectors in B frames and the motion flow for the Bn frame uses only backward-predicted vectors in Rj frames. It means that this method has considered only single-directional prediction to estimate the motion flow for Ri and Bn frames. Our approach to compensate the above problem is to implement a bi-directional motion analysis algorithm using forward-predicted motion vector from Ri to Rj, free from the limitation of single-directional motion analysis (see Fig. 1). The number of N-MVs in motion flow for the Ri frame can be improved by using both the backwardpredicted MV from the Rj frame to the B1 frame and the forward-predicted MV from the Ri frame to the Rj frame, in contrast with the conventional algorithm which only uses the forward-predicted MV in the B1 frame to estimate the motion flow in the Ri frame. To derive the motion flow for Bn is almost similar to the estimation method in Ri frame.

(a)

(b)

Fig. 1. Flow estimation. (a) Flow estimation of MVs in Ri frames. (b) Flow estimation of MVs in Bn frames.

The numerical formula is expressed in Eq. (1) and Eq. (2). Let the backwardpredicted MV in B1(u,v) be denoted by B1 R j . And let the forward-predicted MV from Ri(u,v) to Rj(u,v) be denoted by Ri R j . Then, we can calculate Ri B1 , the motion flow in Ri frame, as

Ri R j Ri B1 + B1 R j .
the motion flow, Bn R j , from the relationship as follows :

(1)

And, if we denote the forward-predicted MV in Bn(u,v) by Ri Bn , then, we can obtain

Ri R j Ri Bn Bn R j .

(2)

Fig. 1-(a) and Fig. 1-(b) show motion analysis method using BPIF in Ri and Bn frame, respectively. The solid line represents the direction of the real MV, and the dotted line shows the direction of the re-estimated MV. We derive the more elaborate motion flow in Ri and Bn frame by using Eq. (1) and Eq. (2). Motion Flow for B1 ~ Bn-1 Frame. The motion flow between successive B frames is derived by analyzing corresponding MBs in those B frames and their motion vectors with respect to their reference frames. Since each MB in each B frame can be of one of three types, namely forward-predicted (F), backward-predicted (B), or

A Probability-Based Flow Analysis Using MV Information in Compressed Domain

595

bidirectional-predicted (D), there exist nine possible combinations. Here, we represent these nine pairs by FF, FB, FD, BF, BB, BD, DF, DB, and DD. Each of these nine combinations is considered individually, and flow is estimated between them by analyzing each of the MVs with respect to the reference frame. The algorithm is divided into following four parts by combinations of corresponding MBs in successive B frames. Case 1 ) when corresponding MBs have forward- + forward-predicted MV : FF, FD, DF, DD ; Case 2 ) when corresponding MBs have backward- + backward-predicted MV : BB, BD, DB ; Case 3 ) when corresponding MBs have forward- + backward-predicted MV : FB ; Case 4 ) when corresponding MBs have backward- + forward-predicted MV : BF ; The motion flow estimation method in Case 1 and Case 2 of above four cases is obtained by similar method to it in Ri or Bn frame. Let the forward-predicted MV in Bk(u,v) be denoted by Ri Bk , and let the backward-predicted MV in Bk+1(u,v) be denoted by Bk +1R j . Fig. 2-(a) and Eq. (3) expresses the motion flow in Case 1. The figure and the numerical formula in Case 2 are described in Fig. 2-(b) and Eq. (4), respectively.

Bk +1 Ri Ri Bk + Bk Bk +1

(3) (4)

Bk R j Bk Bk +1 + Bk +1 R j .

To derive the motion flow in Case 3 and Case 4, we reuse the forward-predicted MV from the Ri frame to the Rj frame, Ri R j . In Case 3, the motion flow, Bk Bk +1 , is derived by Bk +1R j , Ri Bk , and Ri R j . This case is shown in Fig. 2-(c) and expressed in Eq. (5).

Ri R j Ri Bk Bk Bk +1 + Bk +1 R j .

(5)

And, if we denote the motion flow between Ri(u,v) and Bk+1(u,v) by Ri Bk +1 , and between Bk(u,v) and Rj(u,v) by Bk R j , we can calculate Bk Bk +1 , by using Ri Bk +1 and Bk R j , which is the motion flow in Case 4. This structure is shown in Fig. 2-(d). We can obtain Eq. (6) and Eq. (7) by the analysis of the flow and Eq. (8) is deducted from these equations.

Ri R j Ri Bk + Bk R j
Ri Bk +1 Bk Bk +1 + Ri Bk

(6) (7) (8)

Ri R j Ri Bk +1 Bk R j Bk Bk +1 .
2.2 Active Boundary of the Proposed Method

The flow estimation method in this paper works on the assumption that if Q(u1,v1) which is the MB in the non-reference frame Q has the MV (x, y), it is equivalent that R(u2,v2) which is the MB in the reference frame R is moved by the MV (x, y) to Q(u1,v1).

596

N.W. Kim, T.Y. Kim, and J.S. Choi

(a)

(b)

(c)

(d)

Fig. 2. Flow estimation of MVs in B1 ~ Bn-1 frames. (a)~(d) Flow estimation of MVs from case 1 to case 4, successively.

However, it should be mentioned that it may always not be appropriate to use the MVs in same MB (u1,v1) over Q and R frames. That is, a problem is occurred by the following fact; if Q(u1,v1) has the MV (x, y), it is the MV predicted not by R (u2,v2) , but by R(u1,v1) as Fig. 3-(a). When R(u2,v2) moves to Q frame by the MV (x, y), the corresponding MB is actually Q(u2,v2). Therefore, to satisfy the above assumption, the error distance, D, should be minimized as shown Fig. 3-(b). Namely, most of the MVs in compressed domain should be within the error distance, D. To verify this supposition, we calculate the MV histogram and the normal distribution of MV from various types of video scenes. A random variable D has a normal distribution if its probability density function is
2 p(D) = ( 2 2 ) 2 exp (D M ) / 2 2 1

for < D < .

(standard deviation). Fig. 4 shows the result of the normal distribution in various
sequence such as news, movies, and music videos. Table 1 represents the numerical result. The normal distribution is differently calculated by the maximum value of the error distance which is smaller than the block size (8 D 8) or the MB size (16 D 16). If the error distance is within the block size, the distance between R(u1,v1) and R(u2,v2) which is the corresponding MB of Q(u1,v1) will be smaller than the block size. Therefore, any neighbor MBs except R(u2,v2) doesnt include the area of R(u1,v1) as much as R(u2,v2) (Fig. 3-(c)). For this case, our assumption for the corresponding MBs can be regarded as appropriate.

The normal probability density function has two parameters M(mean) and

A Probability-Based Flow Analysis Using MV Information in Compressed Domain

597

(a)

(b)

(c)

Fig. 3. Error estimation. (a) MB mismatching. (b) Error distance. (c) MB matching.
(a) Static scene (News, etc.) (b) Dynamic s cene (M ovie, etc.) (c) Dynamic s cene (M V, etc.) (a) Fas t (b) M iddle (c) Slow

0.7 0.6

0.08 0.06

P( )

P
-30 -20 -10 0 10 20 30

0.5 0.4 0.3 0.2 0.1 0

0.04 0.02 0 -30 -20 -10 0 10 20 30

(a)

(b)

Fig. 4. Normal Distribution. (a) Various scenes. (b) Music video scenes.

As described in Table 1, most of compressed videos have the small motion under the block size. Except the fast scenes like as music videos, the probability that the MVs in sequence are less than the block size is more than 99%. In music videos, in case of the middle speed scenes, the probability that the MVs are smaller than the block size is over 80%. But, as shown in Table 1, the probability that motion is smaller than the block size is under 70% in music videos with fast scenes. That is, the application of the proposed method in fast scenes has the limitation, when standard deviation is approximately over 7. Nevertheless, the proposed method is clearly available for the image analysis in global motion characterization, not in the local motion characterization in each frame. That is because the maximum motion of MVs hardly strays off MB size.
Table 1. Normal distribution of MV. M News Movies Music Videos Fast Middle Slow 0.03 0.05 0.98 0.53 0.38 0.69 2.04 9.05 6.99 5.48 P(8D 8) 0.99 0.99 0.68 0.79 0.89 P(16 D 16) 0.99 0.99 0.94 0.98 0.99

598

N.W. Kim, T.Y. Kim, and J.S. Choi

3 Experimental Results
In this chapter, we present the effectiveness and performance for the proposed motion flow method. As for the test sequences, our experimental database consists of MPEG1 coded video sequences, which have the various camera work and object motion. Comparison of the Number of N-MVs. To evaluate the effectiveness of the estimation, we provide ground truth and compare it to the results from the flow estimation steps. Using the original uncompressed image frames, we encode the frames into MPEG files, which all B frames are replaced by P frames using a MPEG encoder. IBBPBB ordering (IPB encoded frame), for example, then becomes IPPPPP ordering (IPP encoded frame). We apply our flow estimation steps to the files in IPB format, and we compare the flow vectors of the frames between the two encodings. The number of N-MVs in our approach has increased over 18% than it in [8] as listed in Table 2. In detail, the number of N-MVs in Ri, B1~Bn-1 and Bn obtained by the proposed method has increased over 8%, 6%, and 44%, respectively. Especially, the number of N-MVs has markedly increased in Bn frame. This result originates in the fact that most of the MBs in Bn frames have forward-predicted motion vector from Ri rather than backward-predicted motion vector from Rj. The more the Bn frame has a lot of forward-predicted motion vectors, the better the performance of the proposed algorithm is.

MV Detected Ratio =

Number of MV in IPB frame Number of MV in IPP frame

MV detected ratio (MDR) has been calculated by the comparison of the number of N-MVs between IPP encoded frames and IPB encoded frames. This is not considered whether the MVs extracted from the two encodings are identical in the directions of the corresponding MBs or not. The Verification of the Validity for N-MVs. The effectiveness of the proposed method isnt verified by the increase of the number of N-MVs. For the verification, we have compared the directions of the corresponding MBs. At first, we quantize the vectors of the two encodings in the several principal directions (presently 4 bins), and compare them. If they have the identical principle direction, we regard them in IPB encoded frame as the effective MVs. Using these effective MVs, we can obtain the MV effective ratio (MER).

MV Effecti ve Ratio =

Number of Effecti ve MV in IPB frame Number of MV in IPP frame

The result of the experiments that verifies the validity of N-MVs is numerically summarized in Table 3. Examples of flow estimation are shown in Fig. 5. Fig. 5-(b) is a MB image that is derived from re-encoded IPP format files. Fig. 5-(c) and Fig. 5-(d) are its corresponding MB images from the IPB encoded streams. Fig. 5-(c) and Fig. 5(d) show the MV flows in the proposed method and in Koblas method, respectively. Being compared with Fig. 5-(b) which is the ground truth, we can see that the more accurate flow estimation is accomplished in the proposed method, not in Koblas

A Probability-Based Flow Analysis Using MV Information in Compressed Domain

599

method. The shade of the MB in Fig. 5 represents the direction of the flow vector. Fig. 6 is the comparison result of the MV histogram in IPB sequence and IPP sequence. To investigate the accurate result, we quantize the MVs in more fine directions. The simulation result shows our approach is closer to the ground truth sequence which is IPP encoded frame.
Table 2. Comparison of MDR. Proposed method Avg. MV num MDR 198 0.71 216 0.78 219 0.79 212 0.76 Koblas method [8] Avg. MV num MDR 176 0.63 199 0.71 98 0.35 163 0.58

Ri frame B1 frame Bn frame Total frame

Table 3. Comparison of MER. Ri frame Bi frame Bn frame Total frame Proposed method 0.88 0.75 0.82 0.81 Koblas method [8] 0.81 0.60 0.43 0.61

(a)

(b)

(c)

(d) Fig. 5. Examples of flow estimation (From left to right : Ri, B1, Bn frames). (a) Original Image. (b) MV vector in IPP encoded frame. (c) Estimated MV vector in IPB encoded frame (Proposed method). (d) Estimated MV vector in IPB encoded frame (Koblas method).

600

N.W. Kim, T.Y. Kim, and J.S. Choi


120

IP frame
100

IPB frame (Pro p o sed alg .) IPB frame (Ko b la's alg.)

80

MV num.

60

40

20

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16

Bin

Fig. 6. Histogram comparison in N-MV field.

The Application of N-MVs. Finally, we show the object tracking method as one of applications using the N-MV field. Fig. 7 depicts the effective object tracking in the proposed motion flow and, at once, attests the validity of the proposed N-MVs representation by comparing with the tracking method using MV field in IPP encoded frames. The object tracking using only forward-predicted MV in P frame on IPP encoded sequence is shown in Fig. 7-(a). This differs little from Fig. 7-(b) using the motion flow by our BPIF; rather, it has lither curve in the aspect of the moving tracking. This means that the proposed motion analysis method doesnt seriously transform and distort the real MVs on MPEG domain and is sufficiently reasonable. The object tracking method in this paragraph has referred to [11].

(a)

(b)

Fig. 7. Comparison of object tracking. (a) Object tracking in IPP encoded frame. (b) Object tracking in N-MV field.

4 Conclusions
In this paper, we have proposed the motion analysis method on the ground of normalizing the MVs on MPEG compressed domain. We have proposed the Bidirectional Predicted-Independent Framework for generating the structure in which I, P, and B frames can be considered equivalently. Simulation results show that our frame-type-independent framework enables to represent the dense and comprehensive motion field. These N-MVs can be used as the motion depicter for video indexing, which have the strong point of computational efficiency and low storage space.

A Probability-Based Flow Analysis Using MV Information in Compressed Domain

601

Acknowledgement. This research was supported in part by the Ministry of Education, Seoul, Korea, under the BK21 Project, and by the Ministry of Science and Technology, Seoul, Korea, under the NRL Project (M10204000079-02J0000-07310).

References
[1] H.L. Eng and K.K. Ma: Bidirectional motion tracking for video indexing. Multimedia Signal Processing, pp. 153-158, 1999. [2] N.V. Patel and I.K. Sethi: Video shot detection and characterization for video databases. Pattern Recognition, vol. 30, no. 4, pp. 583-592, 1997. [3] J.G. Kim, H.S. Chang, J.W. Kim and H.M. Kim: Efficient camera motion characterization for MPEG video indexing. IEEE International Conference on Multimedia and Expo, vol. 2, pp. 1171-1174, 2000. [4] M. Cherfaoui and C. Bertin: Temporal segmentation of videos: A new approach. in IS & T and SPIE Proc.: Digital Video Compression Algorithm and Technology, vol. 2419, San Jose, 1995. [5] H. Zhang, A. Kankanhalli and S. Smoliar: Automatic partitioning of full-motion video. Multimedia Systems I, 10-28, 1993. [6] H. Zhang, C.Y. Low, Y. Gong and S. Smoliar: Video parsing using compressed data. in Proc. of SPIE Conf. on Image and Video Processing II, vol. 2182, pp. 142-149, 1994. [7] Y. Juan and S.F. Chang: Scene change detection in a MPEG compressed video sequence. in IS & T and SPIE Proc.: Digital Video Compression Algorithm and Technology, vol. 2419, San Jose, 1995. [8] V. Kobla: Indexing and Retrieval of MPEG Compressed Video. Journal of Electronic Imaging, Vol.7 (2), 1998. [9] ISO/IEC/JTC1/SC29/WG11, 13812-2, 1995. [10] O. Sukmarg and K.R. Rao: Fast object detection and segmentation in MPEG compressed domain. Proc. of IEEE TENCON 2000, vol. 3, pp. 364-368, 2000. [11] N.W. Kim, T.Y. Kim and J.S. Choi: Motion analysis using the normalization of motion vectors on MPEG compressed domain. ITC-CSCC2002, vol.3, pp.1408-1411, 2002.

You might also like