
Data Hiding with Rate-Distortion Optimization on H.264/AVC Video
Yih-Chuan Lin and Jung-Hong Li
Dept. of Computer Sciences and Information Engineering, National Formosa University, Yunlin, Taiwan.
E-mail: lyc@nfu.edu.tw
Abstract - This paper proposes a data hiding algorithm for H.264/AVC standard videos. The proposed scheme embeds information that is useful to specific applications into the symbols of the context adaptive variable length coding (CAVLC) domain in H.264/AVC video streams. To minimize the changes to both the reproduced video quality and the output bit-rate, the algorithm selects DCT blocks using a coefficient energy difference (CED) rule and then modifies the less significant symbols, namely the trailing one (T1) symbols and the least significant bits (LSB) of non-zero quantized coefficient symbols, to hide data in the selected blocks. To optimize rate and distortion jointly, the algorithm treats data hiding as a special quantization process and performs it within the rate-distortion optimization loop of the H.264/AVC encoder. The experimental results demonstrate that our scheme is efficient in terms of hiding capacity, video quality and output bit-rate.

Keywords: H.264/AVC, data hiding, CAVLC, reconstruction loop, coefficient energy difference.

I. INTRODUCTION
Information hiding (called data hiding interchangeably hereafter) for video is a process that adds useful data to the raw data or the compressed format of a video in such a manner that third parties cannot perceive the presence or contents of the hidden message. H.264/AVC provides better compression efficiency than other existing standards at the cost of high computational complexity. Owing to the popularity of this format in many video applications, hiding useful data in it attracts a great deal of attention. Recently, many researchers have developed watermarking schemes for H.264/AVC [1-4], but in order to balance video quality and bit-rate they usually offer only a small capacity for hidden data. This paper proposes a data hiding (called watermarking interchangeably hereafter) scheme that is based on CAVLC on both the H.264/AVC encoder and decoder sides. In the proposed method, one watermark bit is embedded by exploiting the relationship among the polarities of all T1 symbols in a 4x4 luminance DCT block. If the DCT block has no T1, the algorithm instead modifies the LSB of the last nonzero coefficient to embed information. Experimental results show that the proposed method provides more capacity and improves the rate-distortion efficiency. The degradation of video quality caused by watermark hiding can be kept below 2 dB.

The remainder of this paper is organized as follows. Section II describes the watermarking principles and related literature. Section III explains our proposed scheme, including the watermark embedding and extracting schemes and the embedding restriction rule. Section IV presents the performance of the proposed scheme. Finally, conclusions are given in Section V.

II. BACKGROUND
In general, most data hiding methods for H.264/AVC are based on entropy coding symbols or motion vectors (MV). H.264/AVC has two entropy coding methods: CAVLC and CABAC (context-adaptive binary arithmetic coding). Many researchers choose CAVLC because it is less complicated and easy to work with in most situations. The nonzero coefficients in DCT blocks can be modified for embedding, but doing so would seriously affect the bit-rate and video quality. Although watermark hiding in DCT blocks is easy to develop, these side effects should be avoided.

After transform and quantization, a DCT block is usually sparse, containing many zeros and a few nonzero coefficients. After the zig-zag reordering, the nonzero coefficients at high frequencies are often sequences of ±1, which are called trailing ones; H.264/AVC signals at most three of them per block. The more trailing ones a block has, the shorter its code length, so most researchers focus on this part when developing data hiding algorithms. Consider changing the coefficients in a DCT block. Four CAVLC symbols are available: coeff_token, trailing_ones_sign_flag, total_zeros, and run_before. The coeff_token encodes the number of nonzero coefficients and the number of T1s in a DCT block. Other things being equal, the bit-rate decreases when the number of trailing ones increases and increases when the number of nonzero coefficients increases.

Wu et al. [4] emphasize robustness to compression attacks for H.264/AVC at compression ratios above 40:1 in I frames. Only one bit is embedded in each predicted 4x4 DCT block. Tian et al. [5] modify only the nonzero coefficients. As a result, the bit-rate increase is about 0.1% and the PSNR degradation is less than 0.5 dB, so the method keeps the bit-rate low and the quality high. However, its capacity is too low. In Liao et al. [6], the method embeds the message into the trailing
ones of 4x4 blocks during CAVLC coding. This method allows data to be hidden directly in the compressed stream in real time, and its capacity is higher than that of other methods [5-6]. The method of Shahid et al. [7] also embeds the watermark into DCT blocks. It modifies the LSB of coefficients in both inter- and intra-frames and provides a high data hiding capacity. Huang et al. [8] propose a steganography scheme with capacity variability and synchronization for the secure transmission of acoustic data. Wang et al. [9] propose an efficient method whose stego-image quality is always higher than 45 dB at a hiding capacity of 1.99 bpp for all test images.

III. THE PROPOSED SCHEME

A. OVERVIEW OF OUR METHOD
Fig. 1 depicts the block diagram of our proposed method on the H.264/AVC encoder side. The watermark embedding method is inserted into the H.264/AVC encoding process, and data is hidden in DCT blocks before entropy coding. In our proposed method, watermarking is done on luminance DCT blocks in both intra and inter modes; the chrominance DCT blocks are not considered.

Fig. 1. Schematic illustration of our proposed watermarking/embedding procedure.

When the encoder executes the information hiding method, rate-distortion must be considered. Because the changes made by marking are reflected in the reconstructed frame, the encoding of the next frame refers to this marked reconstructed frame. We must therefore consider the reconstruction loop [7]. In other words, the data hiding block should operate inside the reconstruction loop, or inside the reconstruction loop with RDO (Rate Distortion Optimization). Otherwise, the bit-rate and video quality would be seriously affected by prediction drift between the encoder and decoder sides.

In H.264/AVC encoding, RDO helps the current frame select the best mode and obtain the best trade-off between distortion and bit-rate. Our method therefore takes RDO into account in order to obtain better coding performance while embedding the information into the video blocks. Fig. 2 illustrates the embedding procedure at the macro-block level. When a macro-block enters the encoder side, the encoder first determines its encoding mode. If the macro-block is inter-mode, the encoder performs both inter- and intra-prediction to select the best mode from the mode set, which includes the PSKIP, P16x16, P16x8, P8x16, P8x8, I4MB, I16MB and IPCM modes. When the macro-block is intra-mode, the encoder performs intra-prediction and the mode set contains only the I4MB, I16MB and IPCM modes.

Fig. 2. The proposed watermarking method at macro-block level.

Fig. 3. The proposed watermarking integration with RDO procedure.

As indicated in Fig. 2, our proposed method is also integrated within the RDO procedure on the encoder side. When the encoder performs the RDO procedure, it selects the best coding mode while watermarking is done at the same time. That mode might differ from the one selected without watermarking, but its bit-rate and video quality are the best among the modes in the mode set. Fig. 3 details the “RDCost with watermarking” block shown in Fig. 2. As described previously, we focus on both intra- and inter-blocks of the luminance component for data hiding. As indicated in Fig. 3, the IPCM and SKIP modes are not considered for embedding.

As previously described, our method can be performed within RDO inside the reconstruction loop. As shown in Fig. 2, the block “Get best MB mode” selects the best mode for the coding task.
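The mode decision described above can be sketched as a Lagrangian cost minimization, J = D + lambda * R, in which every candidate mode is evaluated after data hiding has been applied. This is a minimal sketch under stated assumptions, not the JM implementation: the function name, the `rd_of_mode` callback and the toy numbers are hypothetical, and only the mode names and the exclusion of SKIP and IPCM from embedding come from the text.

```python
from typing import Callable, Iterable, Tuple

# Modes the text excludes from embedding (Fig. 3). They remain valid coding
# choices, but no watermark bit is hidden when they are evaluated.
NON_EMBEDDING_MODES = frozenset({"PSKIP", "IPCM"})


def select_mode_with_watermark(
    candidate_modes: Iterable[str],
    rd_of_mode: Callable[[str, bool], Tuple[float, float]],
    lagrange_multiplier: float,
) -> Tuple[str, float]:
    """Return the mode with the lowest cost J = D + lambda * R.

    rd_of_mode(mode, embed) is a hypothetical callback: it codes the
    macro-block in `mode`, hides data first when `embed` is True, and
    returns (distortion, rate_in_bits) of the marked result.
    """
    best_mode, best_cost = "", float("inf")
    for mode in candidate_modes:
        embed = mode not in NON_EMBEDDING_MODES
        distortion, rate = rd_of_mode(mode, embed)
        cost = distortion + lagrange_multiplier * rate
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost


# Toy usage with made-up (distortion, rate) pairs, not measured results.
toy_rd = {"PSKIP": (300.0, 1.0), "P16x16": (100.0, 40.0), "I4MB": (80.0, 90.0)}
mode, cost = select_mode_with_watermark(toy_rd, lambda m, embed: toy_rd[m], 2.0)
print(mode, cost)
```

Because the cost is computed on the marked block, the selected mode can differ from the one an unmarked encoder would choose, which is the behaviour the text describes for the “RDCost with watermarking” block.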
The results shown in a later section indicate that data hiding without RDO performs no better than data hiding with RDO.

Fig. 4 illustrates the integration of the proposed method with the H.264/AVC decoder. An extracting algorithm is inserted into the H.264/AVC decoding phase and operates on DCT blocks after entropy decoding. Because the watermark is embedded in luminance DCT blocks in both intra- and inter-modes, extraction is needed only on the luminance DCT blocks.

Fig. 4. Schematic illustration of the proposed watermark extracting procedure.

B. THE RESTRICTION OF OUR METHOD
In the literature, most methods use the quantized coefficients for embedding; they share the feature that they modify only the value of a coefficient, not its sign. The proposed algorithm instead uses the relation among the polarities of the T1 symbols for embedding. The polarity and the sign of a coefficient are related.

Based on experiments, we observe that when the nonzero coefficients in a DCT block are sparse, changing the sign of a trailing one increases the bit-rate significantly. In the intra-prediction phase, the current block refers to the upper and left blocks to form a prediction and encodes the prediction residual. When the sign of a trailing one is changed in a block with sparse nonzero coefficients, the block data in the spatial domain change greatly, because the energy change caused by the sign flip is a large proportion of the whole block coefficient energy. When the coded block is referenced by blocks that are not yet coded, this adverse effect propagates to them through the reconstruction loop. We therefore need a mechanism to prevent this effect: a watermark bit is hidden in a DCT block only if its coefficients are not sparse and the energy of the trailing one whose sign would change occupies a small proportion of the block energy; otherwise, no watermark bit is hidden in the block.

In our method, a threshold decides whether a DCT block is suitable for embedding data. We first calculate the coefficient energy of the current DCT block and the CED after changing the sign of one trailing one. If the change rate of the CED is less than the prespecified threshold, the block is chosen to hide data. Otherwise, the block is kept intact. A simple example is shown in Fig. 5.

Fig. 5. Example illustration of proposed watermark restriction.

Consider a 4x4 DCT block with five coefficients and a threshold of 0.25. After zig-zag scanning, the coefficient sequence is -2, 4, 3, -3, 0, 0, -1, and the last trailing one is -1. Before the embedding phase, we first calculate the CED and compare it with the threshold. As shown in Fig. 5, the block satisfies our restriction because its CED is lower than the threshold.

C. EMBEDDING ALGORITHM
In this subsection, we show the pseudo code of the embedding algorithm and explain it in detail. Table I defines the symbols and functions used in the pseudo code. These functions refer to the DCT block or the trailing one set to obtain information about the DCT block.

Table I. Symbol and function definitions

Variable or function          Definition
DCTB                          A 4x4 DCT block
DCTB"                         A 4x4 DCT block after embedding
T1set                         The trailing one set in a DCT block
W                             Watermark bit, W ∈ {0, 1}
Threshold                     Threshold value
coeEnergy                     Coefficient energy difference
getT1set(DCTB)                Get the T1 set from DCTB
getT1count(T1set)             Get the number of trailing ones in the T1 set
getLevcount(DCTB)             Get the number of nonzero levels in DCTB
getLastT1Index(T1set)         Get the index of the last T1 in the T1 set
getLastLevIndex(DCTB)         Get the index of the last nonzero level in DCTB
XorT1Polarity(T1set)          XOR of the polarity values of all trailing ones in the T1 set
ChangeSign(DCTB, Index)       Change the sign of the T1 at position Index in DCTB
ChangeLSB(DCTB, Index, W)     Set the LSB of the coefficient at position Index in DCTB to W
getLSB(DCTB, Index)           Get the LSB of the level at position Index in DCTB
getEnergy(DCTB)               Get the coefficient energy difference of DCTB

The embedding algorithm can be divided into two parts, as shown in Table II. The first part, for blocks with at least one trailing one and a CED less than the threshold, uses the polarity values of all trailing ones to hide data. If the sign of a trailing one is negative, its polarity value is 0; if the sign is positive, its polarity value is 1. The polarity values of the trailing ones are combined by an XOR operation. The result must equal the watermark bit to be hidden in the block; otherwise, the sign of the last trailing one is changed to satisfy the hiding condition.
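The first part of the embedding rule can be illustrated with a short sketch on a zig-zag ordered coefficient list. It is a sketch under stated assumptions rather than the authors' code: the function names are hypothetical, trailing ones are located as the ±1 coefficients met when scanning back from the highest frequency (zeros skipped, at most three), and the CED value is passed in as an argument because its formula is not spelled out in the text. The `ced=0.1` used below is a made-up placeholder that merely satisfies the example threshold of 0.25.

```python
from typing import List, Optional, Tuple

MAX_T1 = 3  # at most three trailing ones are signalled per block


def trailing_one_indices(zigzag: List[int]) -> List[int]:
    """Ascending zig-zag positions of the trailing +/-1 coefficients."""
    indices: List[int] = []
    for i in range(len(zigzag) - 1, -1, -1):
        coeff = zigzag[i]
        if coeff == 0:
            continue  # zeros between coefficients do not end the run
        if abs(coeff) != 1 or len(indices) == MAX_T1:
            break
        indices.append(i)
    return indices[::-1]


def xor_polarity(zigzag: List[int], t1: List[int]) -> int:
    """XOR of polarities: a negative T1 counts as 0, a positive T1 as 1."""
    bit = 0
    for i in t1:
        bit ^= 1 if zigzag[i] > 0 else 0
    return bit


def embed_bit_t1(zigzag: List[int], bit: int, ced: float,
                 threshold: float) -> Tuple[List[int], bool]:
    """Hide `bit` in the T1 polarities; returns (block, was_embedded)."""
    t1 = trailing_one_indices(zigzag)
    if not t1 or ced >= threshold:
        return list(zigzag), False  # block left intact
    marked = list(zigzag)
    if xor_polarity(marked, t1) != bit:
        last = t1[-1]  # highest-frequency trailing one
        marked[last] = -marked[last]
    return marked, True


def extract_bit_t1(zigzag: List[int]) -> Optional[int]:
    """Read the bit back as the XOR of the T1 polarities, if any T1 exists."""
    t1 = trailing_one_indices(zigzag)
    return xor_polarity(zigzag, t1) if t1 else None


block = [-2, 4, 3, -3, 0, 0, -1]  # sequence from the Fig. 5 example
marked, embedded = embed_bit_t1(block, bit=1, ced=0.1, threshold=0.25)
print(marked, embedded, extract_bit_t1(marked))
```

Only a sign is flipped, so the number of coefficients and the number of trailing ones stay the same, and the bit can be read back with the same XOR.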
If the XOR result already equals the watermark bit, the process does not modify the block. The algorithm changes the sign of the last trailing one because the last trailing one, in the high-frequency zone, has lower energy than the other trailing ones, so the change does not cause significant degradation of quality and bit-rate.

Table II. Pseudo code for the embedding algorithm

Embedding Algorithm
Input: DCTB
Output: DCTB"
Initialization:
    T1set ← getT1set(DCTB)
    numT1 ← getT1count(T1set)
    numLevel ← getLevcount(DCTB)
Begin Embedding()
    if (numT1 ≠ 0)
        coeEnergy ← getEnergy(DCTB)
        if (coeEnergy < Threshold)
            W" ← XorT1Polarity(T1set)
            if (W" ≠ W)
                LastT1 ← getLastT1Index(T1set)
                ChangeSign(DCTB, LastT1)
            output DCTB"
        end
    end
    else if (numT1 = 0 && numLevel ≠ 0)
        LastLevel ← getLastLevIndex(DCTB)
        ChangeLSB(DCTB, LastLevel, W)
        output DCTB"
    end
End

The second part, for blocks that have nonzero coefficients but no trailing ones, changes the LSB of the last level to hide data. If the block has neither levels nor trailing ones, no embedding is performed. The advantage of the first part is that changing the sign does not affect other symbols in the same block. According to the CAVLC rule, trailing_ones_sign_flag indicates the sign of a trailing one and is encoded as one bit in the NAL (Network Abstraction Layer) unit: bit 1 if the sign is negative and bit 0 if it is positive. Because we change only the sign of the last trailing one, the encoded block has the same length as before embedding.

D. EXTRACTING ALGORITHM
The extracting phase, shown in Table III, is simpler than the embedding phase. The watermark extracting algorithm is performed between the entropy decoding phase and the inverse quantization phase. We first find all trailing ones in the current DCT block and calculate the CED value. When the number of trailing ones is nonzero and the CED is lower than the threshold, we XOR the polarity values of all trailing ones to obtain the watermark bit. If the number of trailing ones is zero and a last level exists, the LSB of the last level is the watermark bit. If the block has neither levels nor trailing ones, nothing is extracted.

Table III. Pseudo code for the extracting algorithm

Extracting Algorithm
Input: DCTB"
Output: W
Initialization:
    T1set ← getT1set(DCTB")
    numT1 ← getT1count(T1set)
    numLevel ← getLevcount(DCTB")
Begin Extracting()
    if (numT1 ≠ 0)
        coeEnergy ← getEnergy(DCTB")
        if (coeEnergy < Threshold)
            W ← XorT1Polarity(T1set)
            output W
        end
    else if (numT1 = 0 && numLevel ≠ 0)
        LastLevel ← getLastLevIndex(DCTB")
        W ← getLSB(DCTB", LastLevel)
        output W
    end
End

IV. EXPERIMENTAL RESULTS

A. THE EXPERIMENT ENVIRONMENT

Table IV. Experimental parameters for the H.264/AVC codec

Parameter                   Information
Profile IDC                 66 (baseline)
Intra period                15 (I-P-P-P)
Slice mode                  0
Frames to be encoded        300
Motion estimation scheme    Fast Full Search
Rate control                Disabled

Table V. Test video format parameters

Parameter       Information
Video format    QCIF
YUV format      4:2:0
Frame size      176×144
Frame rate      30 fps

We use the H.264/AVC JM reference software [10] as the platform to simulate our proposed method. This subsection presents the experiment parameters used for our method in the JM reference software. We use version 12.2 of the JM software; the related environmental parameters are shown in Table IV. In the experiment, four videos, “akiyo,” “foreman,”
“mobile,” and “news,” are used as the test data set. Their format information is shown in Table V. The secret data to be hidden in the test videos is a random bit stream.

B. THE EXPERIMENT RESULTS
In this subsection, we present the experimental results and explain them. Three methods are considered. The original method is the encoder without data hiding; the “within RDO” method operates inside the RDO loop, while the “without RDO” method executes after the RDO stage in the reconstruction loop of the encoder. As shown in Figs. 6 and 7, the “within RDO” method is superior to the “without RDO” method in terms of the output video bit-rate and the reconstructed video PSNR.

Fig. 6. Comparison of the video quality for video foreman encoded at varying QP values

Fig. 7. Comparison of output bit-rate for video foreman encoded at varying QP values

In Fig. 7, the bit-rate of the “within RDO” method is higher than that of the original, which is undesirable for some applications. We use a threshold on the CED to select appropriate DCT blocks for embedding. The number of DCT blocks that can be embedded decreases as the restriction threshold decreases. This mechanism helps us control the degradation of the marked video quality, the bit-rate change, and the data hiding capacity.

In the experiments, we set the threshold value T of the embedding restriction rule to 1, 0.5, 0.1 or 0.05 for the “within RDO” scheme. The results are shown in Tables VI to VIII. With the restriction rule, the degradation of quality is reduced from 3 dB to 1 dB and the bit-rate after embedding does not increase significantly.

Table VI. Efficiency comparison between the original and the proposed method for foreman at QP = 15 (ER: embedding restriction)

                   PSNR (dB)   Bit-rate (kbit)   Capacity (bit)
Original             47.32         969.62              -
Without ER           45.18        1070.09           337752
With ER, T=0.5       46.35        1023.22           165019
With ER, T=0.1       46.35        1024.11           164923
With ER, T=0.05      46.36        1025.11           165190

Table VII. Efficiency comparison between the original and the embedding method for foreman at QP = 27

                   PSNR (dB)   Bit-rate (kbit)   Capacity (bit)
Original             37.5          196.26              -
Without ER           36.62         228.05            80708
With ER, T=0.5       37.33         205.92            22118
With ER, T=0.1       37.33         205.7             22216
With ER, T=0.05      37.32         205.59            22273

Table VIII. Efficiency comparison between the original and the proposed method for foreman at QP = 31

                   PSNR (dB)   Bit-rate (kbit)   Capacity (bit)
Original             34.86          74.92              -
Without ER           34.1          140.93            48152
With ER, T=0.5       34.65         127.25            11449
With ER, T=0.1       34.64         126.81            11289
With ER, T=0.05      34.63         126.85            11409

From the experiments, we observe that the degradation of bit-rate and video quality caused by embedding can be controlled effectively by adding the embedding restriction. However, this raises another issue: when the threshold is small, the improvement saturates. In other words, the embedding restriction rule can control the degradation only up to a limit. For the other test videos, we show the results in terms of video quality and bit-rate in Figs. 8-15.

Fig. 8. Comparison of the video quality between our method and the original for video foreman at varying QP values
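To make the trade-off in Table VI concrete, the following sketch computes the PSNR drop and the relative bit-rate overhead of two marked encodings against the original. The numbers are copied from Table VI (foreman, QP = 15); the dictionary layout and the function name are illustrative choices, not part of the original experiments.

```python
# Values copied from Table VI (foreman, QP = 15).
ORIGINAL = {"psnr_db": 47.32, "bitrate_kbit": 969.62}
MARKED = {
    "without ER": {"psnr_db": 45.18, "bitrate_kbit": 1070.09, "capacity_bit": 337752},
    "with ER, T=0.5": {"psnr_db": 46.35, "bitrate_kbit": 1023.22, "capacity_bit": 165019},
}


def cost_of_hiding(original: dict, marked: dict) -> tuple:
    """Return (PSNR drop in dB, bit-rate overhead in percent)."""
    psnr_drop = original["psnr_db"] - marked["psnr_db"]
    overhead = (marked["bitrate_kbit"] - original["bitrate_kbit"]) / original["bitrate_kbit"]
    return psnr_drop, 100.0 * overhead


for name, row in MARKED.items():
    drop, overhead = cost_of_hiding(ORIGINAL, row)
    print(f"{name}: PSNR drop {drop:.2f} dB, "
          f"bit-rate +{overhead:.2f}%, capacity {row['capacity_bit']} bit")
```

For this table, the restriction with T = 0.5 lowers the PSNR drop from about 2.14 dB to about 0.97 dB and the bit-rate overhead from about 10.4% to about 5.5%, while the capacity falls to roughly half.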
Fig. 9. Comparison of the bit-rate between our method and the original for video foreman at varying QP values

Fig. 10. Comparison of the video quality between our method and the original for video akiyo at varying QP values

Fig. 11. Comparison of the bit-rate between our method and the original for video akiyo at varying QP values

Fig. 12. Comparison of the video quality between our method and the original for video mobile at varying QP values

Fig. 13. Comparison of the video quality between our method and the original for video mobile at varying QP values

Fig. 14. Comparison of the video quality between our method and the original for video news at varying QP values

Fig. 15. Comparison of the video quality between our method and the original for video news at varying QP values

For smaller threshold values, most of the DCT blocks in the video are excluded from T1 symbol modification. However, this does not affect the scheme, because in that case it modifies the LSB of the last coefficient in the block. Therefore, for smaller threshold values, fewer DCT blocks hide data using the T1 symbols than using LSB replacement. This means that the bit-rate and video quality saturate: changing only the LSB of the last coefficient in the block does not affect the bit-rate and PSNR significantly. The capacity for each test video is shown in Figs. 16-19.
Fig. 16. Comparison of the capacity between our method and the original for video foreman at varying QP values

Fig. 17. Comparison of the capacity between our method and the original for video akiyo at varying QP values

Fig. 18. Comparison of the capacity between our method and the original for video mobile at varying QP values

Fig. 19. Comparison of the capacity between our method and the original for video news at varying QP values

According to Fig. 3, our proposed method does not use SKIP mode blocks for data hiding. When the cost of the SKIP mode is lower than that of the other modes, the mode decision phase selects SKIP as the block mode. The number of SKIP mode blocks increases with the QP value, as shown in Fig. 20.

Fig. 20. Comparison of the number of SKIP mode blocks for video foreman encoded at varying QP values

In Figs. 21 to 23, our proposed method and Shahid's [7] are compared in terms of bit-rate, PSNR and capacity. Two variants of our proposed method are shown, one with a CED threshold of T=0.1 and the other with T=0.5. When the QP value is higher than 11, Shahid's capacity declines rapidly because the coefficients are sparse at high QP values. The efficiency of our method with CED is close to Shahid's in terms of bit-rate and video quality.

Fig. 21. Comparison of the video quality of our proposed method and Shahid's for video foreman encoded at varying QP values

Fig. 22. Comparison of the bit-rate of our proposed method and Shahid's for video foreman encoded at varying QP values
Fig. 23. Comparison of the capacity of our proposed method and Shahid's for video foreman encoded at varying QP values

In Table IX, we compare the capacity of Shahid's scheme and our proposed algorithm. At the same QP, our method provides a higher capacity than Shahid's, and the capacity of Shahid's scheme decreases sharply as the QP value increases.

Table IX. Capacity comparison (in bits) between our method and Shahid's [7] for foreman at varying QP

QP    Proposed, T = 0.5    Proposed, T = 0.1    Shahid [7]
11         281591               281497            280578
15         165019               164923            139629
19          82915                83241             67582
23          40620                40652             29851
27          22118                22216             12108
31          11449                11289              4357

V. CONCLUSIONS
In this paper, we propose a data hiding algorithm that takes the rate-distortion performance of the H.264/AVC standard into account. The algorithm can control the increase in bit-rate and the decrease in PSNR after hiding secret data in the videos, at the cost of a reduced capacity for hidden data. The information is hidden in the T1 symbols of the CAVLC domain in the H.264/AVC encoder. To reduce the propagation of hiding modifications to subsequent blocks, the proposed algorithm selects blocks with a minor energy change to hide data. With this selection scheme, the threshold value can be adjusted to adapt the capacity to different application requirements.

ACKNOWLEDGEMENT
This research is supported in part by the National Science Council, Taiwan, under grant NSC 98-2221-E-150-051.

REFERENCES
[1] G. Qiu, P. Marziliano, A. Ho, D. He, Q. Sun, “A Hybrid Watermarking Scheme for H.264 Video”, Proceedings of the 17th International Conference on Pattern Recognition, ICPR, vol. 4, pp. 865-868, Aug. 2004.
[2] S.K. Kapotas, E.E. Varsaki, A.N. Skodras, “Data Hiding in H.264 Encoded Video Sequences”, IEEE 9th Workshop on Multimedia Signal Processing, October 1-3, 2007, Crete, pp. 373-376.
[3] B.G. Mobasseri, Y.N. Raikar, “Authentication of H.264 Streams by Watermarking CAVLC blocks”, SPIE Conference on Security, Steganography and Watermarking of Multimedia Contents IX, San Jose, CA, January 28-February 2, 2007.
[4] G.Z. Wu, Y.J. Wang, W.H. Hsu, “Robust watermark embedding detection algorithm for H.264 video”, Journal of Electronic Imaging 14(1), 013013, 2005.
[5] L. Tian, N. Zheng, J. Xue and T. Xu, “A CAVLC-Based Blind Watermarking Method for H.264/AVC Compressed Video”, in: Asia-Pacific Services Computing Conference, APSCC 2008, pp. 1295-1299, IEEE, Los Alamitos, 2008.
[6] K. Liao, D. Ye, S. Lian, Z. Guo, J. Wang, “Lightweight Information Hiding in H.264/AVC Video Stream”, International Conference on Multimedia Information Networking and Security (MINES), vol. 1, pp. 578-582, 2009.
[7] Z. Shahid, M. Chaumont, W. Puech, “Considering the Reconstruction Loop for Data Hiding of Intra and Inter Frames of H.264/AVC”, European Signal Processing Conference (EUSIPCO), 2009.
[8] X. Huang, Y. Abe, and I. Echizen, “Capacity Adaptive Synchronized Acoustic Steganography Scheme”, Journal of Information Hiding and Multimedia Signal Processing, Vol. 1, No. 2, pp. 72-90, Apr. 2010.
[9] Z.H. Wang, T.D. Kieu, C.C. Chang, M.C. Li, “A Novel Information Concealing Method Based on Exploiting Modification Direction”, Journal of Information Hiding and Multimedia Signal Processing, Vol. 1, No. 1, pp. 1-9, Jan. 2010.
[10] K. Sühring, H.264/AVC Reference Software Group [On-line]. Available: http://iphome.hhi.de/suehring/tml/, Joint Model 12.2 (JM12.2), Jan. 2009.
