IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 9, NO. 1, FEBRUARY 2015

Abstract—We design a video encoding scheme that is suited for applications such as unmanned aerial vehicle (UAV) video surveillance, where the encoder complexity needs to be low. Our low complexity encoder predicts frames using the global motion information available in UAVs and thus achieves lower complexity and more than 40% BD-rate savings for fly-over videos compared to a complexity-constrained H.264 encoder with motion estimation restricted to 8×8 blocks and half pixel accuracy. We also incorporate a spectral entropy based bit allocation scheme into this encoder to achieve near constant quality within groups of pictures (GOPs) at the cost of small increases in delay and complexity, and a small drop in compression efficiency. Both these encoders with their corresponding low complexity "matched" decoders provide significant gains of more than 49% BD-rate savings over the Wyner-Ziv based DISCOVER codec, which has a low complexity encoder and a high complexity decoder. Furthermore, for videos where the global motion is spatially consistent within 2×2 blocks, we show that the computational complexity of these proposed encoders can be significantly reduced with only about 1% BD-rate increase.

Index Terms—Complexity, global motion, low complexity video encoding, spectral entropy based bit allocation.

Manuscript received October 09, 2013; revised April 03, 2014; accepted July 18, 2014. Date of publication August 07, 2014; date of current version January 20, 2015. This research was supported in part by Raytheon Applied Signal Technology, Inc. The guest editor coordinating the review of this manuscript and approving it for publication was Dr. Vladan Velisavljevic. The authors are with the Department of Electrical and Computer Engineering, University of California, Santa Barbara, CA 93106 USA (e-mail: malavika@ece.ucsb.edu; gibson@ece.ucsb.edu). Digital Object Identifier 10.1109/JSTSP.2014.2345563. 1932-4553 © 2014 IEEE.

I. INTRODUCTION

With a growing interest in unmanned aerial vehicles (UAVs) for a wide range of applications, there is an escalating need for incorporating video data compression units into UAVs. In addition to being used in military applications, UAVs are increasingly being used in commercial applications, particularly where the use of manned aircraft poses a severe danger to the pilots [1], [2]. The increasing quantity and quality of camera sensors pose challenges for the video compression systems on board the UAVs. At the same time, available transmission bandwidth has become more limited due to the increase in the number of UAVs sharing the same link [3]. These challenges are compounded by the stringent space, weight, and power constraints and the desire for longer endurance, greater functionality, and lower fuel consumption in UAVs [4]. Furthermore, unlike typical video sequences that have substantial local motion, the motion in UAV fly-over videos is primarily global and due to the movement of the UAV and camera mounts, which is known. Therefore, there is a need for low complexity video encoders that can efficiently compress UAV fly-over videos with primarily global motion.

Traditional video compression standards such as H.264/AVC and the recently finalized High Efficiency Video Coding (HEVC) typically target applications such as entertainment video broadcast, storage and playback, and video streaming, where encoders can be complex but decoders need to be relatively simple [5], [6]. The bulk of the complexity in such encoders is due to the block motion estimation (ME) performed to compensate for the local motion and predict frames from previously encoded ones. Therefore, traditional compression schemes are not designed to address the more stringent encoder complexity constraints and the degrees of freedom present in UAV video applications.

In this paper, we design novel low complexity encoders suited for moderate to high frame-rate UAV video coding that use global motion compensation for frame prediction. The global motion information is assumed to be available from other modules or easily derivable from the known movement of the UAV and camera mounts, and is specified using the homography transformation with eight parameters per frame (unlike the six parameter affine transformation used in our earlier papers [7], [8]). This global motion prediction approach compensates the motion in the entire frame, unlike the block motion based prediction used in mainstream video codecs, where the motion for each block can be specified separately using a motion vector (MV).

Our encoders with global motion compensation achieve lower complexity and more than 40% bitrate savings compared to a complexity-constrained H.264/AVC encoder with block ME restricted to 8×8 blocks and half pixel accuracy, since our encoders replace the highly complex block ME engine of the H.264 encoder with the relatively simpler global motion compensation and do not need to transmit block MVs. We also design a second set of encoders with global motion compensation and a spectral entropy based bit allocation scheme that achieve near constant quality across frames, but at the cost of a small increase in delay, a slight increase in complexity, and a modest loss in compression efficiency. We show that these encoders with global motion in conjunction with their corresponding low complexity "matched" decoders provide substantial gains over the Wyner-Ziv DISCOVER codec [10], which has a low complexity encoder and a high complexity decoder. Additionally, we demonstrate that the complexities of the proposed encoders can be significantly reduced without significant compression performance losses, for videos with global motion that is spatially consistent within 2×2 blocks
(not presented in our earlier work [7]–[9]). Furthermore, we include complexity analysis and performance results for a complexity-constrained H.264/AVC encoder in which motion vectors are initialized using the global motion information derived from the known movement of the UAV and camera mounts, also not presented in our earlier papers [7]–[9]. Finally, the analyses and discussions in this paper are more detailed and comprehensive than in our earlier papers [7]–[9].

This paper is organized as follows. Section II presents a brief overview of related research on low complexity video encoders and the use of global motion models for motion estimation and/or compensation in video codecs. Section III discusses the architecture of the proposed encoders and the global motion compensation and the spectral entropy based bit allocation scheme used in our encoders. Section IV enumerates and introduces the notation for all the different encoders analyzed and compared in this paper. Section V presents a theoretical analysis of the complexities of the proposed and other hybrid encoders in terms of the number of computations, number of memory accesses, and storage buffer sizes required. Section VI evaluates and compares the rate-distortion performances of our proposed encoders to that of the other H.264 based encoders and the Wyner-Ziv DISCOVER codec. Finally, Section VII summarizes the paper and draws conclusions.

II. RELATED PRIOR RESEARCH

A. Low Complexity Encoders

Many researchers have worked on low complexity versions of standard video encoders like H.264/AVC and MPEG-4. The low complexity baseline profile of the H.264/AVC standard discards complex coding toolsets such as B slices and CABAC entropy coding [5], resulting in performance loss. Works such as [11], [12] design encoders that can achieve multiple complexity-rate-distortion points by choosing lower complexity coding modes or by dynamically changing the block ME search range. Fast mode decision algorithms [13], [14] have also been proposed to reduce the complexity of H.264 encoders.

A complexity analysis [15] of the recently finalized HEVC standard states that the "implementation cost of a HEVC decoder is not expected to be higher than that of an H.264/AVC decoder", but the encoder is "expected to be several times more complex than an H.264/AVC encoder." Therefore, one can anticipate that low complexity HEVC encoders would be developed in a manner similar to those of the low complexity H.264/AVC encoders.

Outside the standards community, recent research on low complexity video encoders has primarily been based on the theory of Wyner-Ziv distributed source coding [16]. The first Wyner-Ziv based video coding algorithms were developed simultaneously by a group at Stanford University [17] and a group at the University of California, Berkeley [18]. Although it has been shown under i.i.d. and Gaussian assumptions that such architectures achieve the same performance as conventional motion compensated predictive video coding systems [19], these results have not been achieved by practical Wyner-Ziv based codecs. The DISCOVER codec, a Wyner-Ziv codec based on the Stanford architecture, has been shown to outperform H.264/AVC intra and sometimes H.264/AVC "zero-motion" standard coding [10].

One of the drawbacks in many implementations of the Stanford architecture is their use of a feedback channel for encoder rate control [20]. The feedback channel requires low delay decoding at the receiver because the Wyner-Ziv frame is decoded/reconstructed several times in order to compute the total number of bits required for successful reconstruction of that frame [20]. There have been modifications proposed to avoid the feedback channel in [21]–[23]; however, these modifications increase the complexity of the encoder and result in a loss of quality, especially for high bit-rates. Since in our target applications, imposing instantaneous decoding of the video stream at the receiver can be too restrictive, the DISCOVER codec based on the Stanford architecture is not suitable.

B. Global Motion in Video Codecs

The MPEG4 Part 2 video coding standard supported global motion using a single affine transformation to describe the motion of an entire frame with respect to its reference frame. However, the global motion toolset was not widely adopted since the large complexity cost of global motion estimation did not provide commensurate gains in rate-distortion performance [24]. Therefore, the H.264/AVC standard that succeeded MPEG4 Part 2 did not support global motion compensation. There have been efforts [25], [26] to introduce additional global motion prediction modes in H.264/AVC codecs that provide up to 27% bitrate savings over the standard H.264/AVC codec.

Global motion has also been utilized within a temporal filtering framework to improve the performance of H.264/AVC codecs [27] and HEVC codecs [28]. Stojanovic and Ohm [29] have used global motion compensation between temporally distant frames to improve the compression efficiency of HEVC, particularly for video sequences with camera zoom. However, all these techniques use global motion within a block ME framework for standard video sequences and not for motion compensating entire frames in video sequences with primarily global motion.

Video codecs tailored for UAV applications have also been proposed that use the available global motion information. Gong et al. [30] use a homography model for the global motion, merge the first intraframe and subsequent interframe residues in a group of pictures (GOP) into a single "big image," and code this image using JPEG2000. However, the residue data in the "big image" is not conducive to JPEG2000 compression since JPEG2000 is primarily designed for natural images. The work by Rodriguez et al. [31] uses the available global motion information to initialize MVs and thus simplify the block ME in an MPEG-4 encoder. Recently, Soares and Pinho [32] and Angelino et al. [33] have presented modifications of the H.264/AVC encoder to initialize MVs using the camera motion information from UAV sensors. However, these approaches still perform block ME, albeit at a lower complexity, and transmit the derived block MVs. Transmitting the global motion information instead of the derived MVs is more rate efficient for fly-over videos.
BHASKARANAND AND GIBSON: GLOBAL MOTION ASSISTED LOW COMPLEXITY VIDEO ENCODING FOR UAV APPLICATIONS 141
The eight-parameter homography maps a full-pixel position (x, y) in the current frame to the position

    x' = (h_1 x + h_2 y + h_3) / (h_7 x + h_8 y + 1)
    y' = (h_4 x + h_5 y + h_6) / (h_7 x + h_8 y + 1)        (1)

in the reference frame, and the global motion across them is compensated by computing (x', y') for each full-pixel position (x, y) in the current frame using (1) and reading the data at location (x', y') in the reference frame, approximated to the nearest half-pixel position, with the pixel data being generated using the H.264/AVC interpolation filter. Moreover, in the cases when some portions of the observed scene towards the edges of the current frame are not present in the reference frame, the pixels in those regions of the current frame are predicted from the border pixels of the reference frame extrapolated using the nearest-neighbor algorithm. However, since our encoders use bidirectional prediction, these "new" regions in the current frame are present in either the past or the future reference frame and hence do not greatly affect the encoder rate-distortion performance for the video sequences studied here.

The homography matrix can be viewed as a concise description of the translation of each pixel at location (x, y) in the current frame to the pixel at location (x', y') in the reference frame. Therefore, if the global motion is spatially consistent within a small spatial neighborhood, we can approximate that all the pixels in the spatial neighborhood {(x + i, y + j) : 0 ≤ i < B_x, 0 ≤ j < B_y} in the current frame get translated by the same amount to match the spatial neighborhood {(x' + i, y' + j) : 0 ≤ i < B_x, 0 ≤ j < B_y} in the reference frame. Such an approximation further reduces the computational complexity of the homography transformation since (1) needs to be evaluated only once for each spatial neighborhood. Later in Sections V and VI, we demonstrate that this approximation of the homography transformation for 2×2 blocks (i.e., B_x = 2 and B_y = 2) significantly reduces the motion compensation complexity with a very small loss in compression performance.

B. Spectral Entropy Bit Allocation

In this section, we present a bit allocation scheme based on the principles of spectral entropy developed by Campbell [41] and Yang et al. [42] as a basis for efficiently sampling frequency coefficients. Yang et al. [42] have shown that the Campbell bandwidth is the minimum average bandwidth for encoding the process across all possible distortion levels. In addition, Jung and Gibson [43] have obtained an expression for coefficient rate using the logarithm of the ratio of rate-distortion function slopes for the given source and a uniform source, where the logarithm is averaged over large distortions. These results indicate a relationship between coefficient rate and the rate-distortion function of a source. Therefore, we have developed a spectral entropy based bit allocation scheme [34], [44] starting with the spectral entropy based coefficient selection method developed by Yang et al. [42]. Here we include a brief derivation and discussion of our bit allocation scheme for completeness.

Consider a zero-mean, stationary, continuous-time random process X(t). Using the K-L expansion in the time interval [0, T), the process can be decomposed as

    X(t) = Σ_{i=1}^{∞} c_i φ_i(t)        (4)

where the φ_i's are normalized eigenfunctions and the c_i's are uncorrelated random variables with E[c_i] = 0 and E[c_i²] = σ_i². Without loss of generality, we can assume that the components are ordered based on their energies, i.e., σ_1² ≥ σ_2² ≥ ⋯.

Let there be S sampling functions (blocks/frames), each with N such components. Out of the total NS coefficients, let N_T coefficients be coded. Then the spectral entropy-based coefficient selection [42], [45] dictates that the number of coefficients coded in each component should be proportional to the variance of that component, i.e., N_i = N_T p_i, where p_i = σ_i² / Σ_j σ_j² and Σ_i N_i = N_T.

If b_i is the average number of bits spent to code a coefficient of component i, the total number of bits spent is B = Σ_i N_i b_i + B_map, where B_map is the number of bits required to code the binary significance map that indicates the significant coefficients. The coding distortion is generated by two sources: quantization and discarding coefficients. Hence the expected value of the distortion of the ith component can be written as

    E[D_i] = (N_i/S) ε_i² σ_i² 2^(−2b_i) + ((S − N_i)/S) σ_i²        (5)

In this equation, the quantization error is computed assuming that the overload distortion is negligible and the high-resolution approximation holds, and ε_i² is a constant determined by the distribution of the normalized random variable c_i/σ_i [46].

Hence, the problem of bit allocation is to find b_i for i = 1, …, N so as to minimize D = Σ_i E[D_i] subject to the bit budget constraint that Σ_i N_i b_i ≤ B − B_map. Using Lagrangian optimization methods, the number of bits allocated to each of the coded coefficients of component i can be shown to be

    b_i = (B − B_map)/N_T + (1/2) log₂[ ε_i² σ_i² / Π_j (ε_j² σ_j²)^(N_j/N_T) ]        (6)

This is similar to the result of classical bit allocation [46] except that the geometric mean of the σ_j²'s has been replaced by the weighted geometric mean Π_j (σ_j²)^(N_j/N_T) and the geometric mean of the ε_j²'s has been replaced by Π_j (ε_j²)^(N_j/N_T). The corresponding total distortion is

    D = (N_T/S) 2^(−2(B − B_map)/N_T) Π_j (ε_j² σ_j²)^(N_j/N_T) + Σ_i ((S − N_i)/S) σ_i²        (7)

The proposed bit allocation method examines the input transform coefficients and chooses to code only those that are significant for retaining signal fidelity. Therefore, this method adapts to the actual coefficient values that need to be coded. In contrast, the classical bit allocation method relies entirely on the energies of the transform components and hence is designed for a class of inputs, all having the same component energies but different coefficient values. Ortega and Ramchandran [47] have noted that "input-by-input" approaches that adapt to the source data being compressed are likely to be superior to "one size fits all" approaches that are designed to perform well on average for a class of inputs.
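The homography warp of (1) and its block-based approximation described earlier in this section can be sketched as follows. This is a minimal numpy illustration under our own naming, not the codec's implementation: nearest-neighbor rounding and border clamping stand in for the half-pel H.264/AVC interpolation filter and the border extrapolation described in the text.

```python
import numpy as np

def warp_coords(H, x, y):
    """Map a current-frame position (x, y) to the reference-frame
    position (x', y') via the 8-parameter homography of (1);
    H is 3x3 with H[2, 2] normalized to 1."""
    den = H[2, 0] * x + H[2, 1] * y + 1.0
    xr = (H[0, 0] * x + H[0, 1] * y + H[0, 2]) / den
    yr = (H[1, 0] * x + H[1, 1] * y + H[1, 2]) / den
    return xr, yr

def predict_frame(ref, H, block=2):
    """Globally motion-compensated prediction of the current frame
    from 'ref'. With block=1, (1) is evaluated at every pixel; with
    block=2, once per 2x2 block, and the resulting displacement is
    shared by all pixels of the block (the spatial-consistency
    approximation). Out-of-frame reads are clamped to the border,
    mimicking nearest-neighbor border extrapolation."""
    h, w = ref.shape
    pred = np.empty_like(ref)
    for y0 in range(0, h, block):
        for x0 in range(0, w, block):
            xr, yr = warp_coords(H, x0, y0)   # one evaluation per block
            dx, dy = xr - x0, yr - y0         # displacement shared in block
            for j in range(y0, min(y0 + block, h)):
                for i in range(x0, min(x0 + block, w)):
                    xs = min(max(int(round(i + dx)), 0), w - 1)
                    ys = min(max(int(round(j + dy)), 0), h - 1)
                    pred[j, i] = ref[ys, xs]
    return pred
```

For a W×H frame, block=2 cuts the number of evaluations of (1) from W·H to W·H/4, which is the source of the complexity reduction quantified in Sections V and VI.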
For the case when all the components have the same normalized distribution, ε_i² = ε², (6) can be rewritten as

    b_i = (B − B_map)/N_T + (1/2) H_s + (1/2) log₂ p_i        (8)

where H_s = −Σ_i p_i log₂ p_i is the spectral entropy expressed in bits and H_s/T is the coefficient rate. In (8), the first term is an average bitrate, the second term depends on the source, and the last term depends on the current component being coded.

The corresponding classical bit allocation equation can be expressed as

    b_i = B/N + (1/2) log₂[ σ_i² / (Π_j σ_j²)^(1/N) ]

TABLE I
NOMENCLATURE OF THE HYBRID ENCODERS EVALUATED.
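The allocation of (8) can be illustrated numerically. The sketch below uses our own naming: B plays the role of the coded-coefficient budget B − B_map, the fractional N_i are left unrounded, and the component count and budget are arbitrary example values.

```python
import numpy as np

def spectral_entropy_allocation(variances, B, N_T):
    """Spectral entropy based bit allocation for the case of (8),
    where all components share the same normalized distribution.
    variances: component variances sigma_i^2;
    B: bit budget for the coded coefficients (B - B_map in the text);
    N_T: total number of coefficients to be coded.
    Returns (N_i, b_i, H_s)."""
    var = np.asarray(variances, dtype=float)
    p = var / var.sum()              # p_i = sigma_i^2 / sum_j sigma_j^2
    N = N_T * p                      # coefficients per component
                                     # (fractional; a real coder rounds)
    Hs = -np.sum(p * np.log2(p))     # spectral entropy in bits
    b = B / N_T + 0.5 * Hs + 0.5 * np.log2(p)   # equation (8)
    return N, b, Hs

# Example: five components with variances 8, 4, 2, 1, 1;
# budget of 64 bits spread over 16 coded coefficients.
N, b, Hs = spectral_entropy_allocation([8.0, 4.0, 2.0, 1.0, 1.0],
                                       B=64.0, N_T=16)
```

By construction, Σ_i N_i b_i = B exactly, and higher-variance components receive both more coded coefficients and more bits per coded coefficient. For very small p_i, (8) can go negative; in practice such components are simply dropped by the coefficient selection.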
Fig. 2. Rate-distortion performance for the 176×144 "aerial_beach1_crop" sequence. The encoder nomenclature remains the same as described in Table I. (a) Average PSNR vs. bitrate. (b) Average SSIM vs. bitrate.

Fig. 3. Quality variation across GOPs for the 176×144 "aerial_beach1_crop" sequence. The encoder nomenclature remains the same as described in Table I. (a) Standard deviation of PSNR vs. average PSNR. (b) Standard deviation of SSIM vs. average SSIM.
frames captured at the rate of 24 to 30 frames per second. Although the global motion parameters for the proposed GMh and GMs encoders can be derived from the position and attitude of the UAV, we have found it difficult to obtain uncompressed video sequences with this metadata. Data obtained from the sensor data management system of the Air Force Research Laboratory [60] was found to contain incorrect camera information. Therefore, for the results presented in this section, we estimate the homography global motion parameters from the video sequence using the Scale Invariant Feature Transform (SIFT) [61] to find point correspondences between two video frames and the RANdom Sample And Consensus (RANSAC) algorithm [62] to determine the homography transformation that best describes the global motion between those two frames.

We evaluate the average rate-distortion performances of the encoders by plotting the average frame quality against the average bitrate. We also quantify the consistency of frame quality by computing the standard deviation of the frame quality within each GOP and plotting it against the average frame quality. The frame quality is measured using PSNR and SSIM [63], and encoder performances are compared using the BD-rate and BD-PSNR [64] measures.

In Figs. 2 and 3, we compare the performance of the various hybrid encoders listed in Table I with their respective "matched" decoders for 89 frames of the 176×144 (QCIF) "aerial_beach1_crop" sequence. We also include results for the Wyner-Ziv based DISCOVER codec [10]. Fig. 2 plots the average PSNR and SSIM against the average bit-rate, while Fig. 3 plots the standard deviation of PSNR and SSIM within GOPs against the average PSNR and SSIM.

From Fig. 2, we see that the BGMh_8×8 encoder performs as well as the BMh_8×8 encoder, but at lower complexity. This is because the BGMh_8×8 encoder uses smaller ME search areas centered around the MVs initialized using global motion parameters, but still transmits MVs.

Comparing the curves for the BMh_8×8 and the GMh_1×1 encoders, we note that the use of the known global motion instead of block motion gives a significant BD-rate saving of 40%, or equivalently a BD-PSNR improvement of 2.5 dB. This performance improvement is largely because the GMh_1×1 encoder does not need to transmit motion information since it utilizes the global motion parameters derivable at both the encoder and decoder. In contrast, the BMh_8×8 encoder spends a significant fraction of its bitrate on coding MVs, ranging from 10% at higher bitrates to 80% at very low bitrates. The GMh_2×2 encoder provides compression performance similar (1% BD-rate increase) to that of the GMh_1×1 encoder, despite its much lower computational complexity. This is because for
the "aerial_beach1_crop" sequence, the motion is spatially consistent within 2×2 blocks, i.e., the motions of all the 4 pixels in each 2×2 block are very similar.

TABLE VI
PERFORMANCE GAINS OF PROPOSED GMh_1×1 AND GMs_1×1 ENCODERS OVER THE BMh_8×8 ENCODER.

The use of spectral entropy based bit allocation in the GMs_1×1 encoder makes the quality across frames more constant, as seen in Fig. 3, but slightly degrades the average rate-distortion performance (BD-rate loss of 16% compared to the GMh_1×1 encoder). However, the GMs_1×1 encoder still outperforms the BMh_8×8 encoder, particularly at lower bitrates. The improvement in performance of the GMs_1×1 encoder with respect to that of the BMh_8×8 encoder is 30% in terms of BD-rate savings and 1.6 dB in terms of BD-PSNR increase. The compression performances of the GMs_1×1 and GMs_2×2 encoders are similar, once again because of the spatial consistency of the motion within 2×2 blocks.

The GMh_1×1 and GMs_1×1 encoders with corresponding low complexity "matched" decoders achieve BD-rate savings of 56% and 49% respectively over the DISCOVER codec, which has a low complexity encoder and a high complexity decoder. In addition, for the DISCOVER codec, the PSNR of frames within a GOP fluctuates considerably, as demonstrated in Fig. 3. This is because the quality of the side information derived at the decoder depends on the distance of the frame from the intra frames, and this affects the reconstructed frame quality. Unlike the DISCOVER codec, our proposed encoders do not require feedback channels and provide better compression efficiency than the BMh_8×8 encoder. The zero motion H.264 encoder (ZMh_8×8) has the poorest rate-distortion performance of all the encoders compared here and is included only for reference.

Similar results have been obtained for other test sequences with primarily global motion and are summarized in Table VI in terms of the BD-rate savings and BD-PSNR improvements of the GMh_1×1 and GMs_1×1 encoders with respect to the BMh_8×8 encoder. Since the performances of the BMh_8×8 and BGMh_8×8 encoders are similar, we choose to present the performance gains with respect to only the BMh_8×8 encoder. Along the same lines, we present results only for the GMh_1×1 and GMs_1×1 encoders, whose compression performances are similar to those of the GMh_2×2 and GMs_2×2 encoders respectively. In all cases, the proposed GMh_1×1 and GMs_1×1 encoders achieve significant compression performance improvement over the BMh_8×8 encoder.

In certain application scenarios, all the information about the UAV and camera motion required to derive the global motion parameters might not be available at the decoder, or the global motion parameters are derived from the video data. In such cases, the global motion parameters used in the video encoder need to be embedded in the video bitstream and transmitted. The transmission of an eight parameter homography matrix with single-precision requires 8 × 32 = 256 bits. If two homography matrices are transmitted for every frame, this would translate to an additional 15 kbps for a video sequence at 30 frames per second.

For the "aerial_beach1_crop" sequence for which results were presented in Figs. 2 and 3, the GMh_1×1 and GMs_1×1 encoders still achieve BD-rate savings of 34% and 24% respectively compared to the BMh_8×8 encoder, instead of the 40% and 30% BD-rate savings obtained without transmitting the homography parameters. Similar small drops in the performance gains of the GMh_1×1 and GMs_1×1 encoders over the BMh_8×8 encoder have been observed for the other test sequences when the homography parameters are transmitted.

Therefore, the additional bitrate required to transmit the global motion parameters to the decoder does not greatly affect the performance improvements achieved by the GMh_1×1 and GMs_1×1 encoders over the BMh_8×8 encoder. This additional bitrate can be reduced by predicting the global motion parameters for the current frame from the parameters of the previous frame using a method like that given in [65].

VII. DISCUSSION AND CONCLUSIONS

Motivated by UAV video applications, we have proposed low complexity (GMh_1×1) encoders that utilize the known global motion and are superior to the complexity-constrained H.264 (BMh_8×8) encoder with 8×8 block ME for fly-over videos, mainly due to two reasons: (a) they replace the highly complex block motion estimation engine with the relatively simpler global motion compensation and hence reduce the encoder complexity, and (b) they do not need to transmit MVs, which if transmitted can constitute a significant fraction of the bitstream. Our new (GMh_1×1) encoders achieve more than 40% BD-rate savings, or equivalently more than 1.7 dB BD-PSNR improvement, at lower complexity compared to the complexity-constrained H.264 (BMh_8×8) encoder. We have demonstrated that for videos with global motion that is spatially consistent within 2×2 blocks, the computational complexity of this encoder can be reduced by 75% to obtain the GMh_2×2 encoder without considerably affecting the compression performance.

We have also incorporated into these encoders a spectral entropy based QM design scheme that provides near constant quality within GOPs at the cost of small increases in delay and complexity, and a small drop in compression efficiency. Compared to the complexity-constrained H.264 encoder (BMh_8×8), these (GMs_1×1 and GMs_2×2) encoders with spectral entropy QM design still provide more than 28% BD-rate savings.

All our proposed encoders (GMh_1×1, GMh_2×2, GMs_1×1, GMs_2×2) provide significant compression gains of more than 49% BD-rate savings over the Wyner-Ziv DISCOVER codec. Furthermore, we have considered the case when the global motion parameters cannot be derived at the decoder and have shown that embedding the global motion parameters in the video bitstream only requires an additional 15 kbps and hence does not greatly reduce the compression performance gains of the proposed encoders.
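The side-information overhead quoted above follows directly from the parameter count; as a worked check of the arithmetic (plain Python, names ours):

```python
# Two 8-parameter homographies per frame, single-precision (32-bit)
# parameters, at 30 frames per second.
bits_per_matrix = 8 * 32                 # 256 bits per homography matrix
overhead_bps = 2 * bits_per_matrix * 30  # 15360 bit/s
print(overhead_bps / 1000.0)             # -> 15.36, i.e. about 15 kbps
```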
In our complexity analysis, we evaluate the encoders in terms of the following factors that most significantly contribute to their complexity and power consumption in real-world systems: number of computations, memory accesses, and memory storage. The fewer computations and memory accesses required by our GMh_1×1, GMh_2×2, and GMs_2×2 encoders compared to the BMh_8×8 encoder can translate to lower power consumed for video encoding in UAVs. Moreover, the bitrate savings achieved at fixed video quality by our encoders can help reduce the power required to transmit the compressed video bitstream in UAVs, because a reduced bitrate directly translates into lower transmitted power.

The strength of our encoders is that they utilize the global motion information available in many UAV applications. In cases when the global motion parameters are not readily available or cannot be estimated accurately from UAV sensor data, they could be computed using fast image registration techniques that could possibly employ fast point-feature descriptors such as FAST-ER [66] and real-time RANSAC variations such as ARRSAC [67], or be refined using algorithms such as those in [68] from initial estimates derived from UAV sensor data. These estimated global motion parameters can be used to achieve the compression performance of any of our proposed encoders, albeit at a higher complexity due to the global motion parameter estimation or refinement. However, if the additional complexity of global motion parameter estimation or refinement is not acceptable, an H.264/AVC encoder like the BMh_8×8 encoder could be used at the cost of lower compression efficiency. Therefore, one needs to choose the appropriate encoder based on the complexity-compression trade-off flexibilities provided by the application and scenario.

REFERENCES

[1] R. Schneiderman, "Unmanned drones are flying high in the military/aerospace sector [Special reports]," IEEE Signal Process. Mag., vol. 29, no. 1, pp. 8–11, Jan. 2012.
[2] Z. Sarris, "Survey of UAV applications in civil markets," in Proc. IEEE Mediterranean Conf. Control Autom., 2001, pp. 1–11.
[3] D. L. Hench, P. N. Topiwala, and Z. Xiong, "Channel adaptive video compression for unmanned aerial vehicles (UAVs)," in Proc. SPIE 5558, Applicat. of Digital Image Process. XXVII, 2004, pp. 475–484.
[4] T. Klassen, "The UAV video problem: Using streaming video with unmanned aerial vehicles," Military and Aerosp. Electron., vol. 20, no. 7, Jul. 2009.
[5] T. Wiegand, G. Sullivan, G. Bjontegaard, and A. Luthra, "Overview of the H.264/AVC video coding standard," IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 560–576, Jul. 2003.
[6] G. J. Sullivan, J.-R. Ohm, W.-J. Han, and T. Wiegand, "Overview of the High Efficiency Video Coding (HEVC) standard," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1649–1668, Dec. 2012.
[12] Y. Tan, W. Lee, J. Tham, and S. Rahardja, "Complexity-rate-distortion optimization for real-time H.264/AVC encoding," in Proc. Int. Conf. Comput. Commun. Netw., Aug. 2009, pp. 1–6.
[13] L. Su, Y. Lu, F. Wu, S. Li, and W. Gao, "Real-time video coding under power constraint based on H.264 codec," in Proc. SPIE 6508, Visual Commun. Image Process. (VCIP), Jan. 2007, pp. 650802-1–650802-12.
[14] Y.-C. Lin, T. Fink, and E. Bellers, "Fast mode decision for H.264 based on rate-distortion cost estimation," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), 2007, vol. 1, pp. I-1137–I-1140.
[15] F. Bossen, B. Bross, K. Suhring, and D. Flynn, "HEVC complexity and implementation analysis," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1685–1696, Dec. 2012.
[16] A. Wyner and J. Ziv, "The rate-distortion function for source coding with side information at the decoder," IEEE Trans. Inf. Theory, vol. IT-22, no. 1, pp. 1–10, Jan. 1976.
[17] A. Aaron, R. Zhang, and B. Girod, "Wyner-Ziv coding of motion video," in Proc. Asilomar Conf. Signals, Syst., Comput., Nov. 2002, vol. 1, pp. 240–244.
[18] R. Puri and K. Ramchandran, "PRISM: A new robust video coding architecture based on distributed compression principles," in Proc. 40th Allerton Conf. Commun., Control, Comput., Oct. 2002.
[19] P. Ishwar, V. Prabhakaran, and K. Ramchandran, "Towards a theory for video coding using distributed compression principles," in Proc. Int. Conf. Image Process. (ICIP), 2003, vol. 2, pp. II-687–II-690.
[20] C. Brites, J. Ascenso, and F. Pereira, "Feedback channel in pixel domain Wyner-Ziv video coding: Myths and realities," in Proc. 14th Eur. Signal Process. Conf., Sep. 2006.
[21] C. Brites and F. Pereira, "Encoder rate control for transform domain Wyner-Ziv video coding," in Proc. IEEE Int. Conf. Image Process. (ICIP), 2007, vol. 2, pp. II-5–II-8.
[22] M. Morbee, J. Prades-Nebot, A. Pizurica, and W. Philips, "Rate allocation algorithm for pixel-domain distributed video coding without feedback channel," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Process. (ICASSP), Apr. 2007, vol. 1, pp. 521–524.
[23] C. Yaacoub, J. Farah, and B. Pesquet-Popescu, "Feedback channel suppression in distributed video coding with adaptive rate allocation and quantization for multiuser applications," EURASIP J. Wireless Commun. Netw., vol. 2008, no. 1, pp. 427247:1–13, Oct. 2008.
[24] A. M. Tourapis, F. Wu, and S. Li, "Direct macroblock coding for predictive (P) pictures in the H.264 standard," in Proc. SPIE, 2004, vol. 5308, pp. 364–371.
[25] A. Smolic, Y. Vatis, H. Schwarz, P. Kauff, U. Goelz, and T. Wiegand, "Improved video coding using long-term global motion compensation," in Proc. Visual Commun. Image Process., Jan. 2004, pp. 343–354.
[26] A. Glantz, A. Krutz, and T. Sikora, "Adaptive global motion temporal prediction for video coding," in Proc. Picture Coding Symp., 2010, pp. 202–205.
[27] M. Esche, A. Glantz, A. Krutz, and T. Sikora, "Adaptive temporal trajectory filtering for video compression," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 5, pp. 659–670, May 2012.
[28] A. Krutz, A. Glantz, M. Tok, M. Esche, and T. Sikora, "Adaptive global motion temporal filtering for high efficiency video coding," IEEE Trans. Circuits Syst. Video Technol., vol. 22, no. 12, pp. 1802–1812, Dec. 2012.
[29] A. Stojanovic and J.-R. Ohm, "Exploiting long-term redundancies in reconstructed video," IEEE J. Sel. Topics Signal Process., vol. 7, no.
high efficiency video coding (HEVC) standard,” IEEE Trans. Circuits 6, pp. 1042–1052, Dec. 2013.
Syst. Video Technol., vol. 22, no. 12, pp. 1649–1668, Dec. 2012. [30] J. Gong, C. Zheng, J. Tian, and D. Wu, “An image-sequence com-
[7] M. Bhaskaranand and J. D. Gibson, “Low-complexity video encoding pressing algorithm based on homography transformation for unmanned
for UAV reconnaissance and surveillance,” in Proc. Military Commu- aerial vehicle,” in Proc. Int. Symp. Intell. Inf. Process. Trusted Comput.,
nicat. Conf. (MILCOM), 2011, pp. 1633–1638. Oct. 2010, pp. 37–40.
[8] M. Bhaskaranand and J. D. Gibson, “Global motion compensation and [31] A. F. Rodriguez, B. B. Ready, and C. N. Taylor, “Using telemetry data
spectral entropy bit allocation for low complexity video coding,” in for video compression on unmanned air vehicles,” in Proc. AIAA Guid-
Proc. IEEE Int. Conf. Commun. (ICC), 2012, pp. 2043–2047. ance, Navigation, Control Conf. Exhibit, 2006.
[9] M. Bhaskaranand and J. D. Gibson, “Low complexity video encoding [32] P. H. F. T. Soares and M. d. S. Pinho, “Video compression for UAV
and high complexity decoding for UAV reconnaissance and surveil- applications using a global motion estimation in the H.264 standard,”
lance,” in Proc. IEEE Int. Symp. Multimedia, Anaheim, CA, USA, Dec. in Proc. Int. Workshop Telecomm., Santa Rita do Sapucai, Brazil, May
2013, pp. 163–170. 2013.
[10] X. Artigas, J. Ascenso, M. Dalai, S. Klomp, D. Kubasov, and M. Ouaret, [33] C. Angelino, L. Cicala, M. D. Mizio, P. Leoncini, E. Baccaglini, M.
“The DISCOVER codec: Architecture, techniques and evaluation,” in Gavelli, N. Raimondo, and R. Scopigno, “Sensor aided H.264 video
Proc. Picture Coding Symp. (PCS), 2007, vol. 17, pp. 1103–1120. encoder for UAV applications,” in Picture Coding Symp. (PCS), Dec.
[11] Z. He, Y. Liang, L. Chen, I. Ahmad, and D. Wu, “Power-rate-distortion 2013, pp. 173–176.
analysis for wireless video communication under energy constraints,” [34] M. Bhaskaranand and J. D. Gibson, “Spectral entropy-based bit allo-
IEEE Trans. Circuits Syst. Video Technol., vol. 15, no. 5, pp. 645–658, cation,” in Proc. Int. Symp. Inf. Theory its Applicat. (ISITA), 2010, pp.
May 2005. 243–248.
150 IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 9, NO. 1, FEBRUARY 2015
[35] H. Malvar, A. Hallapuro, M. Karczewicz, and L. Kerofsky, “Low-complexity transform and quantization in H.264/AVC,” IEEE Trans. Circuits Syst. Video Technol., vol. 13, no. 7, pp. 598–603, Jul. 2003.
[36] R. I. Hartley and A. Zisserman, Multiple View Geometry in Computer Vision, 2nd ed. Cambridge, U.K.: Cambridge Univ. Press, 2004.
[37] D. Lee, Y. Kim, and H. Bang, “Vision-based terrain referenced navigation for unmanned aerial vehicles using homography relationship,” J. Intell. Robot. Syst., vol. 69, no. 1–4, pp. 489–497, Jan. 2013.
[38] M. K. Kaiser, N. Gans, and W. Dixon, “Vision-based estimation for guidance, navigation, and control of an aerial vehicle,” IEEE Trans. Aerosp. Electron. Syst., vol. 46, no. 3, pp. 1064–1077, Jul. 2010.
[39] A. E. R. Shabayek, C. Demonceaux, O. Morel, and D. Fofi, “Vision based UAV attitude estimation: Progress and insights,” J. Intell. Robot. Syst., vol. 65, no. 1–4, pp. 295–308, Jan. 2012.
[40] F. Kendoul, “Survey of advances in guidance, navigation, and control of unmanned rotorcraft systems,” J. Field Robot., vol. 29, no. 2, pp. 315–378, Mar. 2012.
[41] L. L. Campbell, “Minimum coefficient rate for stationary random processes,” Inf. Control, vol. 3, no. 4, pp. 360–371, 1960.
[42] W. Yang, J. Gibson, and T. He, “Coefficient rate and lossy source coding,” IEEE Trans. Inf. Theory, vol. 51, no. 1, pp. 381–386, Jan. 2005.
[43] J. Jung and J. Gibson, “The interpretation of spectral entropy based upon rate distortion functions,” in Proc. IEEE Int. Symp. Inf. Theory, Jul. 2006, pp. 277–281.
[44] M. Bhaskaranand and J. D. Gibson, “Spectral entropy-based quantization matrices for H.264/AVC video coding,” in Proc. 44th Asilomar Conf. Signals, Syst. Comput., 2010, pp. 421–425.
[45] W. Yang and J. Gibson, “Coefficient rate in transform coding,” in Proc. 35th Allerton Conf. Communicat., Control, Comput., Sep. 29–Oct. 1, 1997, pp. 128–137.
[46] A. Gersho and R. M. Gray, Vector Quantization and Signal Compression. Norwell, MA, USA: Kluwer, 1991.
[47] A. Ortega and K. Ramchandran, “Rate-distortion methods for image and video compression,” IEEE Signal Process. Mag., vol. 15, no. 6, pp. 23–50, Nov. 1998.
[48] R. Mester and U. Franke, “Spectral entropy-activity classification in adaptive transform coding,” IEEE J. Sel. Areas Commun., vol. 10, no. 5, pp. 913–917, Jun. 1992.
[49] M. Budagavi and M. Zhou, “Next generation video coding for mobile applications: Industry requirements and technologies,” in Proc. SPIE 6508, Visual Communicat. and Image Process. (VCIP), 2007, pp. 650813-1–650813-6.
[50] “DISCOVER Codec Evaluation,” [Online]. Available: http://www.img.lx.it.pt/~discover/home.html
[51] A. M. Tourapis, “Enhanced predictive zonal search for single and multiple frame motion estimation,” in Proc. Visual Commun. Image Process. (VCIP), Jan. 2002, pp. 1069–1079.
[52] K.-H. Chen and Y.-S. Chu, “A low-power multiplier with the spurious power suppression technique,” IEEE Trans. VLSI Syst., vol. 15, no. 7, pp. 846–850, Jul. 2007.
[53] B. Ramkumar and H. Kittur, “Low-power and area-efficient carry select adder,” IEEE Trans. VLSI Syst., vol. 20, no. 2, pp. 371–375, Feb. 2012.
[54] A. Gupte, B. Amrutur, M. Mehendale, A. Rao, and M. Budagavi, “Memory bandwidth and power reduction using lossy reference frame compression in video encoding,” IEEE Trans. Circuits Syst. Video Technol., vol. 21, no. 2, pp. 225–230, Feb. 2011.
[55] J.-C. Tuan, T.-S. Chang, and C.-W. Jen, “On the data reuse and memory bandwidth analysis for full-search block-matching VLSI architecture,” IEEE Trans. Circuits Syst. Video Technol., vol. 12, no. 1, pp. 61–72, Jan. 2002.
[56] H. Shim and C.-M. Kyung, “Selective search area reuse algorithm for low external memory access motion estimation,” IEEE Trans. Circuits Syst. Video Technol., vol. 19, no. 7, pp. 1044–1050, Jul. 2009.
[57] “DISCOVER Codec,” [Online]. Available: http://www.discoverdvc.org/
[58] A. Abou-Elailah, F. Dufaux, J. Farah, M. Cagnazzo, and B. Pesquet-Popescu, “Successive refinement of motion compensated interpolation for transform-domain distributed video coding,” in Proc. Eur. Signal Process. Conf. (EUSIPCO), Aug. 2011, vol. 1, pp. 11–15.
[59] C. Brites, J. Ascenso, and F. Pereira, “Learning based decoding approach for improved Wyner-Ziv video coding,” in Proc. Picture Coding Symp. (PCS), 2012, pp. 165–168.
[60] “UAV video data from the sensor data management system (SDMS) of the Air Force Research Lab (AFRL),” [Online]. Available: https://www.sdms.afrl.af.mil
[61] D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vis., vol. 60, no. 2, pp. 91–110, Nov. 2004.
[62] M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Commun. ACM, vol. 24, no. 6, pp. 381–395, Jun. 1981.
[63] Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, Apr. 2004.
[64] G. Bjontegaard, “Calculation of average PSNR differences between RD-curves,” ITU-T SG16/Q.6, VCEG-M33, 2001.
[65] M. Tok, A. Krutz, A. Glantz, and T. Sikora, “Lossy parametric motion model compression for global motion temporal filtering,” in Proc. Picture Coding Symp. (PCS), 2012, pp. 309–312.
[66] E. Rosten, R. Porter, and T. Drummond, “Faster and better: A machine learning approach to corner detection,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 32, no. 1, pp. 105–119, Jan. 2010.
[67] R. Raguram, J.-M. Frahm, and M. Pollefeys, “A comparative analysis of RANSAC techniques leading to adaptive real-time random sample consensus,” in Computer Vision – ECCV. New York, NY, USA: Springer, 2008, pp. 500–513.
[68] S. Baker and I. Matthews, “Lucas-Kanade 20 years on: A unifying framework,” Int. J. Comput. Vis., vol. 56, no. 3, pp. 221–255, Feb. 2004.

Malavika Bhaskaranand received her Ph.D. degree in electrical and computer engineering from the University of California, Santa Barbara, in 2013, under the supervision of Prof. Jerry D. Gibson. She received the B.E. degree from the National Institute of Technology Karnataka, India, in 2004 and the M.S. degree from the University of California, Santa Barbara, in 2008. She was a Senior Development Engineer with the Media Processing Group, Ittiam Systems, Bangalore, from 2004 to 2007, where she developed video encoders, decoders, and transcoders on embedded platforms. Her current research interests include video compression, video processing, and image processing.

Jerry D. Gibson is Professor of Electrical and Computer Engineering at the University of California, Santa Barbara. He has been an Associate Editor of the IEEE TRANSACTIONS ON COMMUNICATIONS and the IEEE TRANSACTIONS ON INFORMATION THEORY. He was an IEEE Communications Society Distinguished Lecturer for 2007–2008. He is an IEEE Fellow, and he has received The Fredrick Emmons Terman Award (1990), the 1993 IEEE Signal Processing Society Senior Paper Award, the 2009 IEEE Technical Committee on Wireless Communications Recognition Award, and the 2010 Best Paper Award from the IEEE TRANSACTIONS ON MULTIMEDIA. He is the author, coauthor, and editor of several books, the most recent of which are The Mobile Communications Handbook (Editor, 3rd ed., 2012), Rate Distortion Bounds for Voice and Video (Coauthor with Jing Hu, NOW Publishers, 2014), and Information Theory and Rate Distortion Theory for Communications and Compression (Morgan-Claypool, 2014).