Lecture 8
Image data is by its nature multidimensional and tends to take up a lot of space

Storage requirements
2 hours of video in HDTV resolution:
2 × 60 × 60 × 30 × 1920 × 1080 × 3 bytes ≈ 1.22 TB

Computational expenses
Encryption of data is computationally expensive
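The 1.22 TB figure above can be checked directly from the stated parameters (2 hours, 30 frames/s, 1920 × 1080 pixels, 3 bytes per pixel); the slide's "TB" is the binary tebibyte:

```python
# Uncompressed size of 2 hours of 1920x1080 video at 30 fps, 3 bytes/pixel
seconds = 2 * 60 * 60
frames = seconds * 30
bytes_per_frame = 1920 * 1080 * 3
total_bytes = frames * bytes_per_frame
tib = total_bytes / 2**40                    # tebibytes (binary TB)
print(f"{total_bytes} bytes = {tib:.2f} TiB")  # 1343692800000 bytes = 1.22 TiB
```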
Lecture 8 ©2005 P-O Östberg, Umeå University
Coding redundancy

Coding redundancy is due to inefficiencies in data representations.

The probability that a pixel has value rk is

pr(rk) = nk / n,   k = 0, 1, ..., L − 1

and the average number of bits required to store a pixel is

Lavg = ∑k=0..L−1 l(rk) pr(rk)

where
pr(rk) – the probability that a pixel has value rk
nk – the number of pixels with value rk
n – the total number of pixels
l(rk) – the number of bits required to store a pixel with value rk
L – the number of (possible) gray levels (pixel values)
Lavg – the average number of bits required to store a pixel

Interpixel redundancy

Interpixel redundancy is defined as the failure to identify and utilize data relationships.

If a pixel value can be reasonably predicted from its neighboring (or preceding/following) pixels, the image is said to contain interpixel redundancy.

Interpixel redundancy depends on the resolution of the image:
• The higher the (spatial) resolution of an image, the more probable it is that two neighboring pixels will depict the same object
• The higher the frame rate in a video stream, the more probable it is that the corresponding pixel in the following frame will depict the same object

These types of predictions are made more difficult by the presence of noise.
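The Lavg formula can be illustrated with a hypothetical 8-level gray-value distribution (the probabilities and the variable code lengths below are illustrative assumptions, not data from the lecture):

```python
import numpy as np

# Hypothetical gray-level probabilities pr(rk) for an 8-level image
p = np.array([0.19, 0.25, 0.21, 0.16, 0.08, 0.06, 0.03, 0.02])

# Fixed-length code: every level costs 3 bits
l_fixed = np.full(8, 3)
# A hypothetical variable-length prefix code: short codes for probable levels
l_var = np.array([2, 2, 2, 3, 4, 5, 6, 6])

L_avg_fixed = float(np.sum(l_fixed * p))  # Lavg = sum l(rk) * pr(rk) = 3.0
L_avg_var = float(np.sum(l_var * p))      # = 2.7 bits/pixel: less redundancy
print(L_avg_fixed, L_avg_var)
```

The gap between 3.0 and 2.7 bits/pixel is exactly the coding redundancy a variable-length code removes.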
Psychovisual redundancy

The human visual system does not rely on quantitative analysis of individual
pixel values when interpreting an image – an observer searches for distinct
features and mentally combines them into recognizable groupings.

In this process certain information is relatively less important than the rest – this
information is called psychovisually redundant.
The channel encoder inserts a form of "controlled redundancy" (e.g. Hamming
codes or parity bits) in the data representation in order to detect and/or
correct errors introduced by channel noise. This increases the size of the data
representation, and if the channel is noise free (i.e. not prone to errors) the
channel encoder and decoder are omitted.

Mapper – transforms the image to a (non-visual) format designed to reduce
interpixel redundancies
Quantizer – reduces psychovisual redundancies by quantizing data deemed
less important for visual interpretation (omitted for lossless compression)
Symbol encoder – codes the data efficiently (typically using some form of
variable-length coding scheme) and aims to reduce coding redundancies
Lossless compression

Provides limited compression capabilities – information theory provides a
mathematical framework for (and good measures of) the levels of
compression that can be achieved.

Principally motivated by application specifications – data loss may be
unacceptable:
• Medical imaging – diagnostic quality must be maintained
• Satellite imaging – image acquisition costs are astronomical
• Legal and historical archival
• Future application requirements may be unknown

Huffman coding

Provides a data representation with the smallest possible number of code
symbols per source symbol (when coding the symbols of an information
source individually).

Huffman code construction is done in two steps:

1) Source reductions
Create a series of source reductions by ordering the probabilities of the
symbols and then combining the two symbols of lowest probability into a new
symbol, recursively. This step is sometimes referred to as "building a Huffman
tree".

2) Code assignment
When only two symbols remain, retrace the steps in 1) and assign a code
bit to the symbols in each step.

Encoding and decoding are done using these codes in a lookup-table manner.
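The two construction steps can be sketched with a priority queue: each merge is one source reduction, and prepending a bit to each side's codes is the code assignment retraced. The source string is an illustrative assumption:

```python
import heapq
from collections import Counter

def huffman_code(freqs):
    """Build a Huffman code table {symbol: bit string} from {symbol: frequency}."""
    # Heap entries: (frequency, unique tie-breaker, partial code table)
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        # Source reduction: merge the two least probable (groups of) symbols
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Code assignment: prepend one bit to each side's codes
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        count += 1
        heapq.heappush(heap, (f1 + f2, count, merged))
    return heap[0][2]

message = "aaaabbbccd"                # hypothetical source
codes = huffman_code(Counter(message))
encoded = "".join(codes[s] for s in message)
```

With these frequencies the most probable symbol "a" gets a 1-bit code and the 10-symbol message encodes in 19 bits instead of 20 with a fixed 2-bit code.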
Arithmetic coding

Encoding is done by choosing any number in the (final) interval to represent the data.
Decoding is done by retracing the steps of the code construction.
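The interval construction referred to above (arithmetic coding) can be sketched as successive narrowing of [low, high) by each symbol's cumulative probability range; the three-symbol source below is a hypothetical example:

```python
# Arithmetic coding sketch: narrow [low, high) by each symbol's probability range
probs = {"a": 0.6, "b": 0.3, "c": 0.1}   # hypothetical source model

# Cumulative ranges: a -> [0, 0.6), b -> [0.6, 0.9), c -> [0.9, 1.0)
cum, start = {}, 0.0
for s, p in probs.items():
    cum[s] = (start, start + p)
    start += p

def encode_interval(message):
    low, high = 0.0, 1.0
    for s in message:
        span = high - low
        c_lo, c_hi = cum[s]
        low, high = low + span * c_lo, low + span * c_hi
    return low, high   # any number in [low, high) represents the message

low, high = encode_interval("aab")       # -> roughly [0.216, 0.324)
```

Decoding reverses this: see which symbol's range contains the number, emit that symbol, and rescale.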
Run-length coding

Provides a data representation where sequences of symbols of the same
value are coded as "run-lengths".
• 1D – the image is treated as a row-by-row array of pixels
• 2D – blocks of pixels are viewed as source symbols

Works well for simple, artificial images. Does not work in the presence of
noise. Can be combined with bit-plane coding (using Gray codes).

Predictive coding

Provides a data representation where code words express source symbol
deviations from predicted values (usually values of neighboring pixels).
Predictive coding efficiently reduces interpixel redundancies.
• 1D & 2D – pixels are predicted from neighboring pixels
• 3D – pixels are predicted between frames as well

Works well for all images with a high degree of interpixel redundancy.
Works in the presence of noise (just not as efficiently). Predictive coding can
be used in both lossless and lossy compression schemes.
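A minimal 1D run-length coder over one image row (the row values are illustrative) shows why the scheme suits simple, artificial images – long constant runs collapse to single pairs:

```python
from itertools import groupby

def rle_encode(pixels):
    """Code runs of identical values as (value, run-length) pairs."""
    return [(v, len(list(g))) for v, g in groupby(pixels)]

def rle_decode(pairs):
    return [v for v, n in pairs for _ in range(n)]

row = [0, 0, 0, 0, 255, 255, 0, 0, 0]   # one hypothetical binary-image row
runs = rle_encode(row)                   # [(0, 4), (255, 2), (0, 3)]
assert rle_decode(runs) == row           # lossless round trip
```

Noise breaks the runs into length-1 fragments, which is why the representation degrades badly in its presence.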
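A sketch of 1D predictive coding with the simplest predictor – the previous pixel – illustrates the idea; the lecture does not fix a particular predictor, and the row values are assumptions:

```python
import numpy as np

def predictive_encode(row):
    """Code each pixel as its deviation from the previous pixel (1-D prediction)."""
    row = np.asarray(row, dtype=np.int16)
    return np.diff(row, prepend=0)       # first residual is the pixel itself

def predictive_decode(residuals):
    return np.cumsum(residuals)          # undo the differencing

row = [100, 101, 103, 103, 102, 100]     # smooth (interpixel-redundant) row
res = predictive_encode(row)             # [100, 1, 2, 0, -1, -2]
assert list(predictive_decode(res)) == row
```

The residuals cluster near zero, so a variable-length coder stores them in far fewer bits than the raw values – this is how predictive coding converts interpixel redundancy into coding redundancy that a symbol encoder can remove.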
Transform coding

Karhunen-Loève Transform (KLT)

The Karhunen-Loève Transform (KLT, also known as the Hotelling transform)
minimizes the mean-square error between the original (sub)image and the
(decompressed) reconstruction.

This makes the KLT the optimal information compaction transformation for any
image and any number of retained coefficients.

The basis functions of the KLT depend on the image, however, making them
impossible to precompute and therefore not a practical choice for image
compression.
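The image dependence is visible in how the basis is built: the KLT basis is the set of eigenvectors of the covariance matrix of the data itself. A sketch with random vectors standing in for flattened 4×4 subimages (an assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for 4x4 subimages flattened to 16-vectors, given some correlation
blocks = rng.normal(size=(500, 16)) @ rng.normal(size=(16, 16))

# KLT basis = eigenvectors of the covariance matrix of the data itself
mean = blocks.mean(axis=0)
cov = np.cov(blocks - mean, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)
basis = eigvecs[:, np.argsort(eigvals)[::-1]]  # sort by decreasing variance

# Compression: keep only the k highest-variance coefficients
k = 8
coeffs = (blocks - mean) @ basis[:, :k]
recon = coeffs @ basis[:, :k].T + mean
mse = np.mean((blocks - recon) ** 2)  # minimal among all linear transforms
```

Because `basis` is computed from `blocks`, it would have to be recomputed (and transmitted) for every image – the impracticality noted above.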
Subimage size selection

Usually an integer-power-of-two size is chosen that will reduce the correlation
between adjacent subimages in the reconstructed image.
Wavelet coding

Like transform coding, wavelet coding is based on the premise that applying a
linear transform (here a wavelet transform) will result in transform
coefficients that can be stored more efficiently than the pixels themselves.

Because wavelet transforms are computationally efficient and the wavelet
basis functions are limited in duration, subdivision of the original image into
subimages is unnecessary.
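One analysis step of the Haar wavelet, the simplest wavelet, shows the mechanism (the input row is an illustrative assumption): neighboring pixels are split into averages and differences, and the detail coefficients vanish wherever neighbors agree, which is what makes the coefficients easy to compress.

```python
import numpy as np

def haar_step(x):
    """One level of the orthonormal Haar wavelet transform of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    avg = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation coefficients
    diff = (x[0::2] - x[1::2]) / np.sqrt(2)  # detail coefficients
    return avg, diff

signal = np.array([4.0, 4.0, 8.0, 8.0])      # piecewise-constant image row
avg, diff = haar_step(signal)                # all detail coefficients are 0
```

Since each basis function touches only a few samples, the transform is applied to the whole image at once – no subimage blocking is needed.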