IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 15, NO. 7, JULY 2005, p. 813

Fast Mode Decision Algorithm for Intraprediction in H.264/AVC Video Coding
Feng Pan, Xiao Lin, Susanto Rahardja, Keng Pang Lim, Z. G. Li, Dajun Wu, and Si Wu
Abstract—The H.264/AVC video coding standard aims to enable significantly improved compression performance compared to all existing video coding standards. To achieve this, a robust rate-distortion optimization (RDO) technique is employed to select the best coding mode and reference frame for each macroblock. As a result, the complexity and computational load increase drastically. This paper presents a fast mode decision algorithm for H.264/AVC intraprediction based on local edge information. Prior to intraprediction, an edge map is created and a local edge direction histogram is then established for each subblock. Based on the distribution of the edge direction histogram, only a small number of intraprediction modes are chosen for RDO calculation. Experimental results show that the fast intraprediction mode decision scheme increases the speed of intracoding significantly with negligible loss of peak signal-to-noise ratio.

Index Terms—AVC, H.264, intraprediction, JVT, MPEG, video coding.

Fig. 1. Variable block size for rate distortion optimization.

I. INTRODUCTION

THE NEWEST international video coding standard is H.264/AVC [1]. It has been approved recently by ITU-T as Recommendation H.264 and by ISO/IEC as International Standard 14496-10 (MPEG-4 Part 10) Advanced Video Coding (AVC). The elements common to all video coding standards are present in the current H.264/AVC recommendation: a macroblock (MB) is 16×16 in size; luminance (luma) is represented with higher resolution than chrominance (chroma) with 4:2:0 subsampling; motion compensation and block transforms are followed by scalar quantization and entropy coding; motion vectors are predicted from the median of the motion vectors of neighboring blocks; bidirectional pictures (B-pictures) are supported that may be motion compensated from both temporally previous and subsequent pictures; and a direct mode exists for B-pictures in which both forward and backward motion vectors are derived from the motion vector of a co-sited MB in a reference picture. Some new techniques, such as spatial prediction in intracoding, adaptive block size motion compensation, 4×4 integer transformation, multiple reference pictures (up to seven reference pictures) and content adaptive binary arithmetic coding (CABAC), are used in this standard. The testing results of H.264/AVC show that it greatly outperforms existing video coding standards in both peak signal-to-noise ratio (PSNR) and visual quality [2].
Manuscript received October 21, 2003; revised May 20, 2004. This paper was recommended by Associate Editor F. Pereira. The authors are with the Institute for Infocomm Research, 119613 Singapore (e-mail: efpan@i2r.a-star.edu.sg; linxiao@i2r.a-star.edu.sg; rsusanto@i2r.a-star.edu.sg; kplim@i2r.a-star.edu.sg; ezgli@i2r.a-star.edu.sg; djwu@i2r.a-star.edu.sg; swu@i2r.a-star.edu.sg). Digital Object Identifier 10.1109/TCSVT.2005.848356

Fig. 2. Computation of RDcost.

To achieve the highest coding efficiency, H.264/AVC uses a nonnormative technique called Lagrangian rate-distortion optimization (RDO) to decide the coding mode [3] for an MB. Fig. 1 shows the possible MB modes and Fig. 2 shows the RDO process. As can be seen from Fig. 2, in order to choose the best coding mode for an MB, the H.264/AVC encoder calculates the rate-distortion (RD) cost (RDcost) of every possible mode and chooses the mode having the minimum value. Therefore, the computational burden of this type of brute-force search is far more demanding than that of any existing video coding algorithm. To reduce the complexity of H.264/AVC, a number of efforts have been made to explore fast algorithms in motion estimation, intramode prediction and intermode prediction for H.264/AVC video coding [4], [5]. Fast motion estimation is a well-studied topic and is widely applied in existing standards such as MPEG-1/2/4 and H.261/H.263. However, these fast motion estimation algorithms cannot be applied directly to H.264/AVC coding due to the variable block size motion estimation. On the other hand, fast intramode decision is a new topic in H.264/AVC coding, and very few previous works exist so far. It is believed that fast intramode decision algorithms are also very

1051-8215/$20.00 © 2005 IEEE

important in reducing the overall complexity of H.264/AVC. We have made two contributions to H.264/AVC related to the fast mode decision algorithms, which are adopted as part of the nonnormative reference model for H.264/AVC [6], [7]. In this paper, we present one of the contributions, a fast intramode decision algorithm for H.264/AVC intraprediction by using local edge information. We have observed that the pixels along the direction of a local edge are normally of similar values (this is true for both luma and chroma components), and a good prediction could be achieved if we predict the pixels using those neighboring pixels that are in the same direction as the edge. Therefore, an edge map which represents the local edge orientation and strength is created, and a local edge direction histogram is then established for each subblock. Based on the distribution of the edge direction histogram, only a small number of prediction modes are chosen for RDO calculation during intraprediction. The presented algorithm considerably reduces the amount of calculation needed for intraprediction with negligible loss of coding quality. Experimental results show that the fast mode decision algorithms increase the speed of intracoding significantly with negligible loss of quality.

The rest of the paper is organized as follows. Section II gives an overview of intracoding in H.264/AVC. Sections III and IV present in detail the fast intraprediction algorithm based on the edge direction histogram. Experimental results are presented in Section V and conclusions are given in Section VI.

II. OVERVIEW OF INTRACODING IN H.264/AVC

Intracoding refers to the case where only spatial redundancies within a video picture are exploited. The resulting picture is referred to as an I-picture. Traditionally, I-pictures are encoded by directly applying the transform to all MBs in the picture. In previous video coding standards (namely H.263 and MPEG-4), intraprediction has been conducted in the transform domain. Intraprediction in H.264/AVC is always conducted in the spatial domain, by referring to neighboring samples of previously coded blocks. The difference between the actual block/MB and its prediction is then coded. With these advanced prediction modes, the performance of intracoding in H.264/AVC is comparable to that of the recent still image compression standard JPEG-2000 [8].

If an MB is encoded in intramode, a prediction block is formed based on previously coded and reconstructed blocks before deblocking. This prediction block is subtracted from the current block prior to encoding. For the luma samples, the prediction block may be formed for each 4×4 block (denoted as I4MB) or for an entire MB (denoted as I16MB). When using the I4MB prediction, each 4×4 block of the luma component utilizes one of nine prediction modes. Besides DC prediction, eight directional prediction modes are specified. When utilizing the I16MB prediction, which is well suited for smooth image areas, a uniform prediction is performed for the whole luma component of an MB. Four prediction modes are supported. The chroma samples of an MB are always predicted using a prediction technique similar to that used for the luma component in I16MB prediction.

A. I4MB Prediction Modes

The nine prediction modes for each 4×4 luma block are shown in Fig. 3. It can be seen that I4MB prediction is conducted for samples a-p of a block using samples A-Q. There are in total eight "prediction directions" and one DC prediction mode for I4MB prediction, as detailed in the following [1].

• Mode 0: Vertical prediction.
• Mode 1: Horizontal prediction.
• Mode 2: DC prediction.
• Mode 3: Diagonal down-left prediction.
• Mode 4: Diagonal down-right prediction.
• Mode 5: Vertical-right prediction.
• Mode 6: Horizontal-down prediction.
• Mode 7: Vertical-left prediction.
• Mode 8: Horizontal-up prediction.

For example, if we choose Mode 0, then the pixels a, e, i, and m are predicted based on the neighboring pixel A, and pixels b, f, j, and n are predicted based on pixel B, and so on. If we choose Mode 7, then pixel a would be predicted by (A + B + 1)/2, pixels b and i would be predicted by (B + C + 1)/2, and pixels c and j would be predicted by (C + D + 1)/2, and so on. Note that DC is a special prediction mode, where the mean of the left-hand and upper samples (pixels A to D and I to L in Fig. 3) is used to predict the entire block. Normally DC prediction is useful for those blocks with little or no local activity.

Fig. 3. (a) I4MB prediction coding is conducted for samples a-p of a block using samples A-Q. (b) Eight "prediction directions" for I4MB prediction.

B. I16MB Prediction Modes

As an alternative to the I4MB prediction described above, the entire MB may be predicted. This is well suited for smooth image areas where a uniform prediction is performed for the whole luma component of an MB. Four prediction modes are supported.

• Mode 0 (vertical): extrapolation from upper samples.
• Mode 1 (horizontal): extrapolation from left samples.
• Mode 2 (DC): mean of upper and left-hand samples.
• Mode 3 (plane): plane prediction based on a linear spatial interpolation by using the upper and left-hand samples of the MB.
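The three simplest I4MB modes can be illustrated with a short sketch. This is only an illustration, not the reference encoder: the function name `predict_4x4` and its argument layout are hypothetical, the samples A-D are passed as `above` and I-L as `left`, and the rounding used for DC (add 4, shift right by 3) is the usual integer mean of eight samples.

```python
def predict_4x4(mode, above, left):
    """Return a 4x4 prediction block (list of rows) for Modes 0, 1 and 2.

    above: the four reconstructed samples A..D directly above the block.
    left:  the four reconstructed samples I..L directly to the left.
    Both names are hypothetical; only three of the nine modes are sketched.
    """
    if mode == 0:    # vertical: every row repeats the samples above
        return [list(above) for _ in range(4)]
    if mode == 1:    # horizontal: every row is filled with its left sample
        return [[left[r]] * 4 for r in range(4)]
    if mode == 2:    # DC: rounded mean of the eight neighbouring samples
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only Modes 0-2 are sketched here")
```

With `above = [10, 20, 30, 40]`, Mode 0 gives pixels a, e, i, and m the value of A (10), matching the example in the text. The directional Modes 3-8 follow the same pattern but mix neighbouring samples along their prediction direction.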

Fig. 4. Examples of 4×4 edge patterns and their preferred intraprediction directions.

C. 8×8 Chroma Prediction Mode

Each 8×8 chroma component of an MB is predicted from chroma samples above and/or to the left that have previously been encoded and reconstructed. The four chroma prediction modes are very similar to those of the I16MB prediction except that the order of mode numbers is different: DC (Mode 0), horizontal (Mode 1), vertical (Mode 2) and plane (Mode 3). The same prediction mode is always applied to both chroma blocks.

H.264/AVC uses the RDO technique to achieve the best coding performance. This means that the encoder has to encode the intrablock using all the mode combinations and choose the one that gives the best RDO performance. Since the choice of prediction modes for chroma components is independent of that of luma components, for each luma prediction mode there are four different chroma prediction modes. Therefore, the number of mode combinations for luma and chroma components in an MB is

$N8 \times (N4 \times 16 + N16)$

where $N8$, $N4$, and $N16$ represent the number of modes for chroma prediction, I4MB prediction and I16MB prediction, respectively. It means that, for an MB, the encoder has to perform $4 \times (9 \times 16 + 4) = 592$ different RDO calculations before a best RDO mode is determined. As a result, the complexity and computational load of the encoder are extremely high.

III. DETERMINING THE PRIMARY EDGE DIRECTION IN THE IMAGE BLOCK

We observed that the pixels along the direction of a local edge normally have similar values (this is true for both luma and chroma components). Therefore, a good prediction could be achieved if we predict the pixels using those neighboring pixels that are in the same direction as the edge. Fig. 4 shows a few edge patterns of a 4×4 block and their preferred directional predictions. There are a number of ways to get the local edge directional information, such as the edge direction histogram, which is based on a simple edge detection algorithm [9], directional fields, which are based on the local gradients [10], etc. The algorithm described in this paper is based on edge detection due to its simplicity in terms of computational complexity. The rest of this section will explain in detail the fast intraprediction algorithm by using an edge direction histogram based on edge detection.

A. Edge Map

In order to obtain the edge information in the neighborhood of the intrablock to be predicted, the edge map of the video picture is generated by using the Sobel edge operators. Each pixel in the video picture will then be associated with an element in the edge map, which is the edge vector containing its edge direction and amplitude. The Sobel operator has two convolution kernels which respond to the degree of difference in the vertical direction and the horizontal direction. For a pixel $p_{i,j}$ in a luma (or chroma) picture, we define the corresponding edge vector, $\vec{D}_{i,j} = \{dx_{i,j}, dy_{i,j}\}$, as

$dx_{i,j} = p_{i-1,j+1} + 2 p_{i,j+1} + p_{i+1,j+1} - p_{i-1,j-1} - 2 p_{i,j-1} - p_{i+1,j-1}$
$dy_{i,j} = p_{i+1,j-1} + 2 p_{i+1,j} + p_{i+1,j+1} - p_{i-1,j-1} - 2 p_{i-1,j} - p_{i-1,j+1}$    (1)

where $dx_{i,j}$ and $dy_{i,j}$ represent the degree of difference in the vertical and horizontal directions, respectively. Therefore, the amplitude of the edge vector can be roughly estimated by

$Amp(\vec{D}_{i,j}) = |dx_{i,j}| + |dy_{i,j}|$    (2)

In fact, the amplitude could be obtained more accurately by using the rooted sum of the squares of $dx_{i,j}$ and $dy_{i,j}$. The latter is computationally expensive and thus (2) is used. The direction of the edge (in degrees) is decided by the hyper-function

$Ang(\vec{D}_{i,j}) = \frac{180^\circ}{\pi} \arctan\left(\frac{dy_{i,j}}{dx_{i,j}}\right), \quad |Ang(\vec{D}_{i,j})| < 90^\circ$    (3)
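The edge vector computation of this subsection can be sketched in a few lines. This is a minimal sketch under stated assumptions rather than the authors' implementation: it assumes the standard 3×3 Sobel kernels, the amplitude |dx| + |dy|, and an arctangent for the direction; the function names and the choice of treating the first index of `p` as the row are hypothetical, and border pixels are not handled.

```python
import math

def edge_vector(p, i, j):
    """Sobel responses (dx, dy) at interior position (i, j) of a 2-D list p.

    Assumption: p[i][j] is indexed as p[row][column]; the paper's own index
    convention may differ.
    """
    dx = (p[i - 1][j + 1] + 2 * p[i][j + 1] + p[i + 1][j + 1]
          - p[i - 1][j - 1] - 2 * p[i][j - 1] - p[i + 1][j - 1])
    dy = (p[i + 1][j - 1] + 2 * p[i + 1][j] + p[i + 1][j + 1]
          - p[i - 1][j - 1] - 2 * p[i - 1][j] - p[i - 1][j + 1])
    return dx, dy

def amplitude(dx, dy):
    """Cheap amplitude estimate that avoids the square root."""
    return abs(dx) + abs(dy)

def angle_deg(dx, dy):
    """Direction in degrees within [-90, 90]; a zero dx is mapped to 90."""
    if dx == 0:
        return 90.0
    return math.degrees(math.atan(dy / dx))
```

For a 3×3 patch whose right column is 100 and whose other samples are 0, `edge_vector` returns (400, 0), so the amplitude is 400 and the angle is 0 degrees. How that angle maps onto a prediction direction depends on the coordinate convention, which is an assumption here.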

It must be noted that in the actual implementation of the algorithm, (3) is not necessary; a simple thresholding technique is used to build up the edge direction histogram.

B. Edge Direction Histogram

In order to decide whether the image block contains an edge, and how strong this edge is, an edge direction histogram is calculated from all the pixels in the edge map of the block by summing up the amplitudes of the edges with similar edge directions in the block.

1) 4×4 Luma Block Edge Direction Histogram: In the case of a 4×4 luma block, there are 8 directional prediction modes, plus a DC prediction mode, as shown in Fig. 3. The border between any two adjacent directional prediction modes is the bisectrix of the two corresponding directions. For example, the border of Mode 1 (0°) and Mode 8 (26.6°) is the direction at 13.3°; this is because for Mode 8, prediction is done at an angle of approximately 26.6° above the horizontal direction. It is important to note that Mode 3 and Mode 8 are adjacent due to the circular symmetry of the prediction modes. The mode of each pixel is determined by its edge direction $Ang(\vec{D}_{i,j})$. Therefore, the edge direction histogram of a 4×4 luma block is decided by the following algorithm. Let $Histo(k)$ be the histogram cell of the prediction mode $k$. For each pixel in a 4×4 luma block, its amplitude is added to the cell of the mode whose angular range contains its edge direction:

$Histo(k) = \sum_{(m,n) \in SET(k)} Amp(\vec{D}_{m,n}), \quad SET(k) = \{(i,j) \mid Ang(\vec{D}_{i,j}) \in a_k\}, \quad k = 0, 1, 3, 4, 5, 6, 7, 8$    (4)

where $a_k$ denotes the range of edge angles assigned to Mode $k$, bounded by the bisectrices described above. Note that Mode 2 is not included in the above algorithm. This is because Mode 2 will always be chosen as one of the candidate modes.

Fig. 5 shows the edge direction histogram of Fig. 4(c). It shows that this block exhibits a strong edge in the vertical-right direction.

Fig. 5. Edge direction histogram of Fig. 4(c).

Fig. 6. Intra 8×8 and 16×16 prediction mode directions.

2) Edge Direction Histogram for 16×16 Luma Block and 8×8 Chroma Block: In the case of 16×16 luma and 8×8 chroma blocks, there are only two directional prediction modes, plus a plane prediction and a DC prediction mode, as shown in Fig. 6. Therefore, the edge direction histogram for this case will be based on three directions, i.e., horizontal, vertical and diagonal (plane) directions. Though it is not mathematically correct to associate plane prediction with any directional edge, we can for sure associate the vertical and horizontal prediction with their respective directional edges. Therefore, it is fairly reasonable for us to try plane prediction if it is not obviously a DC prediction. This is due to the fact that in H.264/AVC there is only a limited number of prediction modes for intracoding. The edge direction histogram for 16×16 luma is constructed as follows: each pixel adds its amplitude $Amp(\vec{D}_{i,j})$ to $Histo(k)$, with

$k$ = horizontal mode, if the edge direction is closest to horizontal;
$k$ = vertical mode, if the edge direction is closest to vertical;
$k$ = plane mode, else.    (5)

For the similar reason, Mode 2 is missing in the above algorithm. Note that both diagonal down-right and diagonal down-left prediction modes are associated with the plane prediction. Note that for 8×8 chroma blocks, a similar equation is applied, except that the order of mode numbers is different. An example of such an edge direction histogram is shown in Fig. 7.

As mentioned above, each cell in the edge direction histogram sums up the amplitudes of those pixels with similar edge directions in the block. Obviously, the histogram cell with the maximum amplitude indicates that there is a strong edge along

this direction in the block, and is thus considered as the preferable prediction direction. The mode whose direction complies with such an edge is chosen as the primary prediction mode. Note that only the cell with the global maximum is chosen as the primary prediction mode, even though the histogram might have multiple maximums. In the case that all the cells have similar amplitudes, DC mode will be a better choice. In summary, the above algorithm produces one primary prediction mode each for a 4×4 luma block, 16×16 luma block, and 8×8 chroma block.

Fig. 7. Example of 16×16 luma and 8×8 chroma block edge direction histogram.

IV. MODE DECISION FOR INTRAPREDICTION

Based on the primary prediction mode previously determined, the fast mode decision algorithms for intraprediction select a small number of the prediction modes as the candidates to be used in RDO computation, and the rest of this section will describe the detailed implementation of this algorithm.

It should be noted that, in general, the primary prediction mode decided above will not always be the best RDO mode in actual coding. The main cause of this phenomenon is that the actual RDO computation in H.264/AVC intracoding is based on the reconstructed (lossy) images, while the edge direction histogram is calculated from the original lossless images, as the reconstructed image is not available at the time of calculating the histogram. We have thus tried a number of ways of deciding the number of preferred prediction modes, as is discussed in the following.

The histogram cell with the maximum amplitude is the best candidate for intraprediction (Method 1), and an amplitude threshold is needed in deciding whether the intrablock exhibits strong edge presence or is just a flat region. However, it is difficult to predefine a universal threshold that suits different block contexts and different video sequences. Therefore, we always choose DC mode as the second candidate to participate in the RDO operation (Method 2). This will eliminate the effect that different thresholds result in different performances in different sequences.

Method 1: The mode with the maximum amplitude in the edge direction histogram is chosen as the candidate prediction mode, and if this amplitude is below a predefined threshold, the prediction mode will be chosen as DC.

Method 2: This method simply takes DC mode as a candidate mode besides the primary prediction mode.

Method 3: In this method, additional information is added based on Method 2. The window size of the histogram computation is enlarged by including pixels in the left column and upper row of the block of interest. This is due to the fact that a block of interest is predicted by the pixels above and/or to the left of the block.

Method 4: During the experiments of Method 2, it is observed that the chosen intraprediction mode is either the primary prediction mode, or one of the two neighboring modes (in terms of direction) of the primary prediction mode. Therefore, the two additional candidate prediction modes are determined to be the two neighbors of the primary prediction mode in terms of directions (refer to Fig. 3). For example, if the primary prediction mode is Mode 1, then the two additional candidate prediction modes will be Mode 8 and Mode 6. Note that Mode 8 and Mode 3 are adjacent modes in terms of directions due to the symmetry of the circle.

Experimental results have shown that Method 4 achieves a good balance between computational time and coding efficiency. However, in the experimental section, we will still present the comparison among all the methods.

A. I4MB Prediction Modes

Extensive experiments also show that the chosen intraprediction mode is either the primary prediction mode, or one of the two neighboring modes (in terms of direction) of the primary prediction mode. Therefore, the two additional candidate prediction modes are determined to be the two neighbors of the primary prediction mode in terms of directions. Therefore, for each 4×4 luma block, the histogram cell with the maximum amplitude, and its two adjacent cells, plus DC mode are chosen to take part in RDO calculation. Therefore, for the I4MB prediction coding, we will only perform RDO calculation for 4 modes, instead of 9.

B. I16MB Prediction Modes

Based on the same observation above, in I16MB prediction coding, the primary prediction mode decided by the edge direction histogram is considered as a candidate for the best prediction mode, and DC mode is also chosen as the next candidate. Therefore, we will only perform RDO calculation for 2 modes, instead of 4.

C. 8×8 Chroma Prediction Modes

For intrachroma blocks, there are two different edge direction histograms, one from component U and the other from V.
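The histogram construction of Section III-B and the candidate selection of Sections IV-A and IV-B can be sketched together as follows. This is a hedged illustration, not the authors' code: the nominal angles in `MODE_ANGLE` are assumptions derived from the 26.6° figure and the bisectrix rule quoted in the text (the sign convention in particular is assumed), and all function names are hypothetical. Assigning each pixel to the nearest nominal direction, modulo 180°, is equivalent to placing borders at the bisectrices.

```python
DC_MODE = 2

# Assumed nominal directions (degrees, modulo 180) of the directional I4MB modes.
MODE_ANGLE = {1: 0.0, 8: 26.6, 3: 45.0, 7: 63.4, 0: 90.0, 5: 116.6, 4: 135.0, 6: 153.4}
# Circular ordering of the modes by direction: 1, 8, 3, 7, 0, 5, 4, 6.
I4MB_ORDER = sorted(MODE_ANGLE, key=MODE_ANGLE.get)

def nearest_mode(angle):
    """Directional mode whose nominal angle is closest to `angle` (mod 180)."""
    def dist(a, b):
        d = abs(a - b) % 180.0
        return min(d, 180.0 - d)
    return min(MODE_ANGLE, key=lambda m: dist(angle, MODE_ANGLE[m]))

def build_histogram(edge_vectors):
    """edge_vectors: iterable of (angle_deg, amplitude) pairs, one per pixel."""
    histo = {m: 0 for m in MODE_ANGLE}
    for angle, amp in edge_vectors:
        histo[nearest_mode(angle)] += amp
    return histo

def primary_mode(histo):
    """Mode whose cell holds the largest summed amplitude."""
    return max(histo, key=histo.get)

def i4mb_candidates(histo):
    """Primary mode, its two directional neighbours, and DC (4 modes)."""
    k = I4MB_ORDER.index(primary_mode(histo))
    n = len(I4MB_ORDER)
    return [I4MB_ORDER[k], I4MB_ORDER[(k - 1) % n], I4MB_ORDER[(k + 1) % n], DC_MODE]

def rdo_combinations(n_chroma, n_i4mb, n_i16mb):
    """Mode combinations per MB: chroma modes x (16 I4MB blocks + one I16MB)."""
    return n_chroma * (n_i4mb * 16 + n_i16mb)
```

With a histogram dominated by Mode 1, `i4mb_candidates` returns Modes 1, 6, 8, and DC, which matches the example given for Method 4. `rdo_combinations(4, 9, 4)` reproduces the 592 combinations of the exhaustive search, and `rdo_combinations(2, 4, 2)` gives 132.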

Note that according to the standard, the same prediction mode is always applied to both chroma blocks. In case the two chroma components have different primary prediction modes, the primary prediction modes from the two components are both considered as candidate modes. DC mode will also be used in the RDO calculation. Therefore, if the primary prediction modes from the two components are the same, there could only be 2 candidate modes for RDO calculation; otherwise, there will be 3. Therefore, for each 8×8 chroma block intracoding, we will perform RDO calculation for either 2 or 3 modes, instead of 4.

TABLE I
NUMBER OF CANDIDATE MODES

D. Algorithm Complexity Analysis

Table I summarizes the number of candidates selected for RDO calculation based on the edge direction histogram. As can be seen from Table I, the encoder with the fast mode decision algorithm would need to perform only $2 \times (4 \times 16 + 2) = 132$ RDO calculations if the two chroma components have the same primary prediction mode (a differing mode is very rare); otherwise, the total number of RDO calculations would be $3 \times (4 \times 16 + 2) = 198$. Thus our fast intraprediction algorithm has reduced the number of RDO mode calculations significantly compared to the 592 modes that are used in the current RDO calculation in H.264/AVC video coding.

E. Early Termination of RDO Calculation

During the intracoding of any prediction mode, the calculation can be terminated if it can be foreseen that the current mode will not be the best prediction mode. By early termination of an RDO calculation which is deemed to be suboptimal, a great timesaving could be achieved.

In RDO, the coding cost consists of two parts: rate and distortion. After calculating the cost of rate, there might be cases where the cost of rate is higher than the coding cost of the best mode among the previous modes. This implies that the current mode will not be the best mode since its coding cost will not be the smallest. Therefore, the RDO calculation will be terminated and the calculation of the distortion is then eliminated.

An MB is encoded by either I4MB prediction or I16MB prediction. In RDO, the selection between these two coding modes is determined by the coding costs of the MB under each coding mode. After I16MB prediction coding, I4MB prediction coding will apply to the sixteen 4×4 blocks in the MB and the cost of these blocks will be accumulated. However, if the accumulated cost before encoding all sixteen 4×4 blocks is already higher than that of I16MB prediction coding, the coding of the remaining 4×4 blocks in the MB will be terminated prematurely.

V. EXPERIMENTAL RESULTS

Our proposed algorithm was implemented in JM6.1e provided by JVT. According to the specifications provided in [11], the test conditions are as follows.

1) MV search range is 32 pels for QCIF and CIF.
2) RD optimization is enabled.
3) Reference frame number equals 1.
4) CABAC is enabled.
5) MV resolution is 1/4 pel.
6) GOP structure is IPPPP or IBBPB.

A group of experiments were carried out on the recommended sequences with quantization parameters 28, 32, 36, and 40 as specified by [12]. The averaged PSNR value of luma (Y) and chroma (U, V) is used, which is based on the equations below:

$PSNR = 10 \log_{10} \frac{255^2}{MSE}$    (6)

where the average mean square error (MSE) is given by

$MSE = \frac{4 \, MSE_Y + MSE_U + MSE_V}{6}$    (7)

The comparison results were produced and tabulated based on the difference of coding time ($\Delta Time$), the PSNR difference ($\Delta PSNR$) and the bit-rate difference ($\Delta Bits$). The test platform used is a Pentium IV 2.8 GHz with 512 Mbytes of RAM.

In order to evaluate the timesaving of the fast intramode decision algorithm, the following calculation is defined to find the time differences. Let $T_{JM}$ denote the coding time used by the JM6.1e encoder and $T_{FI}$ be the time taken by the fast intraprediction algorithm. $\Delta Time$ is defined as

$\Delta Time = \frac{T_{FI} - T_{JM}}{T_{JM}} \times 100\%$    (8)

PSNR and bit-rate differences are calculated according to the numerical averages between the RD-curves derived from the JM6.1e encoder and the fast algorithm, respectively. The detailed procedures for calculating these differences can be found in a JVT document authored by Bjontegaard [13], which is recommended by the JVT Test Model Ad Hoc Group [12]. Note that PSNR and bit-rate differences should be regarded as equivalent, i.e., there is either the increase in PSNR or the decrease in bit-rate, not both at the same time.

A. Experiments on IPPPP Sequences

In this experiment, the total number of frames is 300 for each sequence, and the period of I-frames is 100, i.e., there is one I-frame for every 100 coded frames. It should be noted that, in H.264/AVC coding, MBs in P-frames also choose intracoding as one of the possible coding modes in the RDO operation, thus great timesaving is expected by using the fast intracoding algorithm for this type of sequence.

Table II shows the tabulated performance comparison of the proposed algorithm with JM6.1e for the sequences listed in [12], and the coding time statistics are generated from the JM6.1e encoder. Note that in the table positive values mean increments, and negative values mean decrements. The differences in PSNR and bit rate are calculated

according to [13]. It can be seen that the fast intraprediction algorithm achieves consistent timesaving (average 25%) with negligible losses in PSNR and increments in bit rate. This means that the fast intraalgorithm only takes about 3/4 of the time that is needed by JM6.1e, with negligible loss of PSNR. We have noticed that the simple early termination scheme described in Subsection IV-E contributed about 6% to 8% of the total timesaving.

TABLE II
RESULTS FOR IPPPP SEQUENCES

Figs. 8 and 9 show the RD curves of the two sequences "news" and "mobile". Again, these two figures show that the fast intraprediction algorithm has similar RDO performance to that of JM6.1e. However, at higher quantization values, the increase in bit-rate is slightly higher than that at the lower quantization values.

Fig. 8. RD curves of News.

Fig. 9. RD curves of Mobile.

The captions of Figs. 8 and 9 report ΔPSNR values of −0.067 dB and −0.018 dB and ΔBits values of 1.226% and 0.451%.

Figs. 10 and 11 show the timesaving at different intraperiods and at different sizes of the searching area during motion estimation. It is noted from these figures that the fast intraalgorithm achieves similar timesaving when the intraperiod changes from 50 to 150 frames. However, the timesaving is reduced significantly when the size of the searching area increases. This is because in H.264 video coding, the rate distortion optimization for intercoding mode decision is much more complex than that for intracoding mode decision due to motion estimation operations, i.e., the time taken to perform the RDO for intercoding is much longer than that for intracoding, and this becomes even more so when the searching area increases.

Fig. 10. Timesaving at different intraperiod.

Fig. 11. Timesaving at different size of searching area.

Fig. 12 shows the timesaving when using different numbers of reference frames. It can be seen that the timesaving is reduced as the number of reference frames increases. This is similar to the case of Fig. 11, as the increased number of reference frames has increased the proportion of intercoding in the overall computational load.
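The rate-based early termination of Subsection IV-E, whose contribution to the timesaving is quoted above, can be sketched as follows. This is a minimal sketch under assumptions: `rate_cost` and `distortion_cost` are hypothetical callables standing in for the encoder's measurements, the rate term is assumed to be already weighted by the Lagrange multiplier, and distortion is assumed to be non-negative so that skipping can never discard the true minimum.

```python
def best_mode_with_early_exit(candidates, rate_cost, distortion_cost):
    """Return (best_mode, best_cost, skipped) over the candidate modes.

    rate_cost, distortion_cost: hypothetical callables mapping mode -> cost.
    A mode is skipped, without computing its distortion, as soon as its rate
    cost alone exceeds the best total cost found so far.
    """
    best_mode, best_cost = None, float("inf")
    skipped = 0
    for mode in candidates:
        rate = rate_cost(mode)
        if rate > best_cost:
            skipped += 1          # distortion is never evaluated for this mode
            continue
        cost = rate + distortion_cost(mode)
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost, skipped
```

The same comparison applies at the MB level: the accumulated cost of the 4×4 blocks can be checked against the I16MB cost after each block, and the remaining blocks skipped once it is exceeded.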

Fig. 12. Timesaving at different numbers of reference frames.

Another interesting observation from the table is that QCIF sequences achieve more timesaving than CIF. This is due to the high percentage of boundary MBs in a QCIF sequence, and the searching area for those MBs is much smaller compared to that of the nonboundary MBs.

B. Experiments on All Intraframes Sequences

In this experiment, a total number of 300 frames are used for each sequence, and the period of I-frames is set to 1, i.e., all the frames in the sequence are intracoded. It can be seen from Table III that the fast intraprediction algorithm achieves consistent timesaving (average 60%), which means that the fast intraalgorithm only takes about 40% of the time that is needed by JM6.1e. The average loss of PSNR is about 0.24 dB, or equivalently, there is a slight increment in bit rate of about 3.7%. Figs. 13 and 14 show the RD curves of the two sequences "news" and "mobile." Again, these two figures show that the fast intraprediction algorithm has similar RDO performance to that of JM6.1e.

TABLE III
RESULTS FOR IIIII SEQUENCES

Fig. 13. RD curves of News.

Fig. 14. RD curves of Mobile.

The captions of Figs. 13 and 14 report ΔPSNR values of −0.294 dB and −0.255 dB and ΔBits values of 3.902% and 3.168%.

C. Experiments on IBBPBB Sequences

In this experiment, the picture type is set to IBBPB, i.e., there are two B-frames between any two I- or P-frames. A total number of 300 frames are used for each sequence, and the period of I-frames is set to 100. It can be seen from Table IV that the fast intraprediction algorithm achieves consistent timesaving (average 10%) with negligible losses in PSNR and increments in bit rate. It is noted that the timesaving for this type of sequence is much less than that of the IPPPP format. This is due to the fact that in H.264/AVC coding, B-frames do not use intracoding, and also, in B-frame coding, the motion estimation takes a much longer time than that in P-frame coding. Figs. 15 and 16 show the RD curves of the two sequences "news" and "mobile." Again, these two figures show that the fast intraprediction algorithm has similar RDO performance to that of JM6.1e.

TABLE IV
RESULTS FOR IBBPB SEQUENCES
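The relation between the reported timesaving and the remaining coding time, and the combined PSNR measure of Section V, can be checked with a small sketch. The function names are hypothetical, and the 4:1:1 weighting of the Y, U, and V errors is an assumption for 4:2:0 material with 8-bit samples; the Bjontegaard averages of [13] are not reproduced here.

```python
import math

def combined_psnr(mse_y, mse_u, mse_v):
    """PSNR in dB from an assumed 4:1:1 weighted MSE of the Y, U, V planes."""
    mse = (4 * mse_y + mse_u + mse_v) / 6
    return 10 * math.log10(255 ** 2 / mse)

def delta_time(t_fast, t_ref):
    """Coding-time change in percent; negative values mean time saved."""
    return (t_fast - t_ref) * 100 / t_ref
```

For instance, a fast encoder that needs 40 s where the reference needs 100 s gives `delta_time(40, 100) == -60.0`, which is the "60% timesaving, about 40% of the time" relation quoted for the all-intra sequences.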

Comparison of Different Fast Intraprediction Methods

As mentioned in the beginning of Section IV, besides the proposed methods, we have also tried different ways of deciding the number of preferred prediction modes based on the primary prediction mode. Table V gives the comparison of these methods. In this experiment, the settings and parameters used are the same as those in Section V-A, and we only present the results of the two sequences, i.e., news and mobile. It can be seen from Table V that all four methods achieve significant timesaving and that, in terms of RD performance, Method 3 achieves the best results, though it is slightly inferior in timesaving.

Fig. 15. News. Fig. 16. Mobile. Values given in the two captions: ΔPSNR = −0.013 dB and −0.156 dB; ΔBits = 0.379% and 3.106%.

TABLE V
COMPARISON OF DIFFERENT FAST INTRA PREDICTION METHODS

VI. CONCLUSION

This paper presented a fast mode decision algorithm for intraprediction in H.264/AVC video coding. By making use of the edge direction histogram, the number of mode combinations for luma and chroma blocks in an MB that take part in the RDO calculation has been reduced significantly, from 592 to as low as 132. This results in a great reduction of the complexity and computation load of the encoder. Other techniques such as early termination of RDO mode calculation are also used to further reduce the computation time. Experimental results show that the fast algorithm has a negligible loss of PSNR compared to the original scheme.

Feng Pan (M’00–SM’03) received the B.Sc., M.Sc., and Ph.D. degrees in communication and electronic engineering from Zhejiang University, Hangzhou, China, in 1983, 1986, and 1989, respectively. Since then, he has been teaching and researching in a number of universities in China, U.K., Ireland, and Singapore. He is now with Institute for Infocomm Research, Singapore. His research areas are digital image processing, digital signal processing, digital video compression, and digital television broadcasting. He has published numerous technical papers and offered many short courses for industries. Dr. Pan currently serves as the Chapter Chairman of IEEE Consumer Electronics, Singapore.

in 1991. in 1993 and 1998. computer vision.D. degree and the M. Indonesia.S and M. Xi’an. Dr. Nanyang Technological University. and video processing. NO. he joined NTU as an Academic Professor and was appointed the Assistant Director of the Centre for Signal Processing. and received the Ph.Sc.K. and Research and was appointed as the Program Director to lead the Signal Processing Program. in 1993.Sci. 7. He is the Co-Founder of AMIK Raharja Informatika and STMIK Raharja. he is with the Institute for Infocomm Research (I2R). and the M. and number theoretical transform. degree from Nanyang Technological University. Singapore. In 2001. Eng. China. He has published more than 30 journal papers in the fields of video processing. networking. 1993. Southampton. degree in computer science from Northwest University. Singapore. he has been with Institute for Infocomm Research. Singapore. chaotic secure communication. in 2001. and served as a Business Development Manager in 1998. Singapore. He worked with Centre for Signal Processing (CSP) for about five years as a Researcher and Manager on the Multimedia Program.A. Singapore. degrees in telecommunication from Xidian University. Currently. From 1998 to 2000. He is currently the Director of Media Division in the Institute for Infocomm Research. degree in digital communication and microwave circuits.822 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY. He is currently an Associate Professor at the School of Electrical and Electronic Engineering in the Nanyang Technological University. Xi’an. Lim was the recipient of the Du Pont Scholarship and Sony Prize Award. Shen Yang. an institute of higher learning in Tangerang. and the Ph. Singapore.Eng. speech packet lost concealment for Bluetooth WCDMA baseband SOC development. VOL. Susanto Rahardja (M’00–SM’04) received the B. respectively. respectively. G. He is an also Adjunct Assistant Professor of Nanyang Technological University. 
He worked for DeSOC Technology as a technical director where he contributed on the VoIP solution. where he is now Research Manager in charge of multimedia signal processing areas. His research interests include binary and multiple-valued logic synthesis. he was a Researcher Scholar in the School of Computer Engineering. NTU.D degree from the Electronics and Computer Science Department.S. Xi’an. 15. Li (M’97–SM’04) received the B.. in 1994 and 2001.Eng. China. He has more than 100 articles in international journals and conferences. the M. Singapore. respectively. as a Research Engineer in 1996. Technology. Singapore. and computer network. University of Southampton. In 2002. JULY 2005 Xiao Lin (M’99–SM’02) received the Ph. He is currently working as Senior Technical Officer in the Institute for Infocomm Research. His research interests are multimedia communication. Singapore. Z. and 1997.D. Nanyang Technological University. Dr. hybrid systems. degree from Northeastern University. U. he joined the Agency for Science. a Research Fellow in 1997. degree in the area of logic synthesis and signal processing from the Nanyang Technological University (NTU). China. degree in electrical engineering from the National University of Singapore (NUS). and digital signal processing. His research field includes image/video coding and computer vision. Keng Pang Lim (M’95) received the B. digital communication systems. His research interests include video coding. Singapore. Dajun Wu received the B. Eng. degree in computer engineering from Xi’an Jiatong University. Since 2000. He is an Associate Lead Scientist in Institute for Infocomm Research. where he is currently leading a video coding group. He joined the Centre for Signal Processing. Si Wu received the B. Singapore. and Ph. degrees from the School of Computer Engineering. respectively.D. Rahardja was the recipient of IEE Hartree Premium Award in 2002 and the Tan Kah Kee Young Inventors’ GOLD Award (Open Category) in 2003.Eng. 
He joined Institute for Infocomm Research. . Singapore. in July 2002. China in 1992 and 1995.
