Combined Questions PDF
Team 2 questions
Topic: Applications of adaptive quantization and Haar wavelet transforms.
Team members: Sushruth N, Ranjan P, Thejus P, Ajith Kumar S, Amit S K,
Kritika G, Deepak TS, Shubhashree
In mathematics, the Haar wavelet is a sequence of rescaled "square-shaped" functions which together
form a wavelet family or basis. Wavelet analysis is similar to Fourier analysis in that it allows a target
function over an interval to be represented in terms of an orthonormal basis. The Haar sequence is now
recognised as the first known wavelet basis and is extensively used as a teaching example.
The Haar sequence was proposed in 1909 by Alfréd Haar. Haar used these functions to give an example
of an orthonormal system for the space of square-integrable functions on the unit interval [0, 1]. The
study of wavelets, and even the term "wavelet", did not come until much later. As a special case of
the Daubechies wavelet, the Haar wavelet is also known as Db1.
The Haar wavelet is also the simplest possible wavelet. The technical disadvantage of the Haar wavelet is
that it is not continuous, and therefore not differentiable. This property can, however, be an advantage
for the analysis of signals with sudden transitions, such as monitoring of tool failure in machines.
Applications
Modern cameras are capable of producing images with resolutions in the range of tens of megapixels.
These images need to be compressed before storage and transfer. The Haar transform can be used for
image compression. The basic idea is to transfer the image into a matrix in which each element of the
matrix represents a pixel in the image. For example, a 256×256 matrix is saved for a 256×256
image. JPEG image compression involves cutting the original image into 8×8 sub-images. Each sub-image
is an 8×8 matrix.
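The idea above can be sketched in a few lines. This is a minimal illustration, not a full codec: the function names, the 8×8 test block and the threshold value are assumptions made for the example. It applies one level of the orthonormal Haar transform to the rows and then the columns of a block, then discards small detail coefficients.

```python
import numpy as np

def haar_1d(v):
    """One level of the orthonormal Haar transform along the last axis."""
    v = np.asarray(v, dtype=float)
    avg = (v[..., 0::2] + v[..., 1::2]) / np.sqrt(2.0)   # low-pass: scaled pair sums
    diff = (v[..., 0::2] - v[..., 1::2]) / np.sqrt(2.0)  # high-pass: scaled pair differences
    return np.concatenate([avg, diff], axis=-1)

def haar_2d(block):
    """One-level 2D Haar transform: rows first, then columns."""
    rows = haar_1d(block)
    return haar_1d(rows.T).T

# Hypothetical 8x8 block: a smooth horizontal ramp of grey levels.
block = np.tile(np.arange(8, dtype=float) * 10.0, (8, 1))
coeffs = haar_2d(block)

# Compression idea: keep only coefficients above an (assumed) threshold.
threshold = 1.0
kept = np.where(np.abs(coeffs) >= threshold, coeffs, 0.0)
print("non-zero coefficients kept:", np.count_nonzero(kept), "of", kept.size)
```

Because the transform is orthonormal, the energy of the block is preserved, and for smooth blocks much of it ends up in few coefficients.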
Q. What is DCT? Write the equation for 1D and 2D DCT. What are its
applications? (5M)
A discrete cosine transform (DCT) expresses a finite sequence of data points in terms of a
sum of cosine functions oscillating at different frequencies. In particular, a DCT is a Fourier-
related transform similar to the discrete Fourier transform (DFT), but using only real
numbers. The DCTs are generally related to Fourier Series coefficients of a periodically and
symmetrically extended sequence whereas DFTs are related to Fourier Series coefficients of a
periodically extended sequence. DCTs are equivalent to DFTs of roughly twice the length,
operating on real data with even symmetry.
The DCT is often used in signal and image processing, especially for lossy compression,
because it has a strong "energy compaction" property. A related transform,
the modified discrete cosine transform, or MDCT (based on the DCT-IV), is used
in AAC, Vorbis, WMA, and MP3 audio compression. DCTs are also widely employed in
solving partial differential equations by spectral methods, where the different variants of the
DCT correspond to slightly different even/odd boundary conditions at the two ends of the
array. The DCT is used in JPEG image compression and in MJPEG, MPEG, DV, Daala,
and Theora video compression.
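The question also asks for the 1D and 2D equations, which the answer does not state. As a hedged sketch, one common unnormalised form (the DCT-II) is X_k = sum over n = 0..N-1 of x_n cos[(pi/N)(n + 1/2) k] for k = 0..N-1, and the 2D DCT applies the same 1D transform separably along the columns and then the rows. Normalisation factors differ between textbooks, so the scaling below is an assumption; the function names are illustrative.

```python
import numpy as np

def dct_1d(x):
    """Unnormalised DCT-II along axis 0: X[k] = sum_n x[n] cos(pi/N (n + 1/2) k)."""
    x = np.asarray(x, dtype=float)
    N = x.shape[0]
    n = np.arange(N)
    k = n.reshape(-1, 1)
    basis = np.cos(np.pi / N * (n + 0.5) * k)  # basis[k, n]
    return basis @ x

def dct_2d(block):
    """Separable 2D DCT-II: 1D transform along columns, then along rows."""
    block = np.asarray(block, dtype=float)
    cols = dct_1d(block)
    return dct_1d(cols.T).T

# Energy compaction on a constant signal: only the k = 0 term is non-zero.
print(np.round(dct_1d(np.ones(8)), 6))
```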
Define wavelets and discrete wavelet transform. Explain the procedure for wavelet
based image compression with a block diagram. Mention the advantages of wavelet over
other compression techniques. (5M)
Wavelets are mathematical functions that cut up data into different frequency components,
and then study each component with a resolution matched to its scale.
Wavelet transform decomposes a signal into a set of basis functions. These basis functions
are called wavelets.
Discrete Wavelet transform (DWT) is the one which transforms a discrete time signal to a
discrete wavelet representation.
1) What is vector quantization? Mention a few applications of vector quantization. (5 marks)
Ans) Vector quantization (VQ) is a classical quantization technique from signal processing that allows
the modeling of probability density functions by the distribution of prototype vectors. It was originally
used for data compression. It works by dividing a large set of points (vectors) into groups having
approximately the same number of points closest to them. Each group is represented by
its centroid point, as in k‐means and some other clustering algorithms.
The density matching property of vector quantization is powerful, especially for identifying the density
of large and high‐dimensional data. Since data points are represented by the index of their closest
centroid, commonly occurring data have low error, and rare data high error. This is why VQ is suitable
for lossy data compression. It can also be used for lossy data correction and density estimation.
Vector quantization is based on the competitive learning paradigm, so it is closely related to the
self‐organizing map model and to sparse coding models used in deep learning algorithms such
as autoencoders.
Vector quantization is used for lossy data compression, lossy data correction, pattern recognition,
density estimation and clustering.
Lossy data correction, or prediction, is used to recover data missing from some dimensions. It is done by
finding the nearest group with the data dimensions available, then predicting the result based on the
values for the missing dimensions, assuming that they will have the same value as the group's centroid.
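A minimal sketch of this prediction step, assuming the centroids are already known and that missing values are marked as NaN (both are assumptions for the example, as is the function name):

```python
import numpy as np

def fill_missing(sample, centroids):
    """Fill NaN entries of `sample` from the nearest centroid, where the
    distance is measured only on the dimensions that are present."""
    sample = np.asarray(sample, dtype=float)
    known = ~np.isnan(sample)
    dists = np.sum((centroids[:, known] - sample[known]) ** 2, axis=1)
    nearest = centroids[np.argmin(dists)]
    return np.where(known, sample, nearest)

# Hypothetical codebook with two centroids in three dimensions.
centroids = np.array([[0.0, 0.0, 0.0], [10.0, 10.0, 10.0]])
filled = fill_missing([9.0, np.nan, 11.0], centroids)
print(filled)  # the missing middle value is taken from the second centroid
```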
For density estimation, the area/volume that is closer to a particular centroid than to any other is
inversely proportional to the density (due to the density matching property of the algorithm).
Applications
Use in data compression
Vector quantization, also called "block quantization" or "pattern matching quantization" is often used
in lossy data compression. It works by encoding values from a multidimensional vector space into a
finite set of values from a discrete subspace of lower dimension. A lower‐space vector requires less
storage space, so the data is compressed. Due to the density matching property of vector quantization,
the compressed data has errors that are inversely proportional to density.
The transformation is usually done by projection or by using a codebook. In some cases, a codebook can
be also used to entropy code the discrete value in the same step, by generating a prefix coded variable‐
length encoded value as its output.
The set of discrete amplitude levels is quantized jointly rather than each sample being quantized
separately. Consider a k‐dimensional vector of amplitude levels. It is compressed by choosing the
nearest matching vector from a set of n‐dimensional vectors, with n < k.
All possible combinations of the n‐dimensional vector form the vector space to which all the quantized
vectors belong.
Only the index of the codeword in the codebook is sent instead of the quantized values. This conserves
space and achieves more compression.
Twin vector quantization (VQF) is part of the MPEG‐4 standard dealing with time domain weighted
interleaved vector quantization.
Video codecs based on vector quantization
Bink video[2]
Cinepak
Daala is transform‐based but uses vector quantization on transformed coefficients[3]
Digital Video Interactive: Production‐Level Video and Real‐Time Video
Indeo
Microsoft Video 1
QuickTime: Apple Video (RPZA) and Graphics Codec (SMC)
Sorenson SVQ1 and SVQ3
Smacker video
VQA format, used in many games
The usage of video codecs based on vector quantization has declined significantly in favor of those
based on motion compensated prediction combined with transform coding, e.g. those defined
in MPEG standards, as the low decoding complexity of vector quantization has become less relevant.
Audio codecs based on vector quantization
AMR‐WB+
CELP
DTS
G.729
iLBC
Ogg Vorbis [4]
Opus is transform‐based but uses vector quantization on transformed coefficients
TwinVQ
Use in pattern recognition
VQ was also used in the eighties for speech[5] and speaker recognition.[6] Recently it has also been used
for efficient nearest neighbor search[7] and on‐line signature recognition. In pattern
recognition applications, one codebook is constructed for each class (each class being a user in biometric
applications) using acoustic vectors of this user. In the testing phase the quantization distortion of a
testing signal is worked out with the whole set of codebooks obtained in the training phase. The
codebook that provides the smallest vector quantization distortion indicates the identified user.
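The testing phase described above can be sketched as follows. The codebooks, class names and test vectors are made-up values for illustration; in practice the codebooks would come from a training phase.

```python
import numpy as np

def distortion(vectors, codebook):
    """Average squared distance from each vector to its nearest codeword."""
    d = ((vectors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.min(axis=1).mean()

def identify(vectors, codebooks):
    """Return the class whose codebook gives the smallest quantization distortion."""
    return min(codebooks, key=lambda name: distortion(vectors, codebooks[name]))

# Hypothetical per-user codebooks (one codebook per class).
codebooks = {
    "user_a": np.array([[0.0, 0.0], [1.0, 1.0]]),
    "user_b": np.array([[5.0, 5.0], [6.0, 6.0]]),
}
test_vectors = np.array([[5.1, 4.9], [5.8, 6.2]])
print(identify(test_vectors, codebooks))
```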
The main advantage of VQ in pattern recognition is its low computational burden when compared with
other techniques such as dynamic time warping (DTW) and hidden Markov model (HMM). The main
drawback when compared to DTW and HMM is that it does not take into account the temporal
evolution of the signals (speech, signature, etc.) because all the vectors are mixed up. In order to
overcome this problem a multi‐section codebook approach has been proposed.[9] The multi‐section
approach consists of modelling the signal with several sections (for instance, one codebook for the initial
part, another one for the center and a last codebook for the ending part).
Use as clustering algorithm
Because VQ seeks centroids as density points of nearby samples, it can also be used directly as a
prototype‐based clustering method: each centroid is then associated with one prototype. By aiming to
minimize the expected squared quantization error[10] and introducing a decreasing learning gain that fulfils
the Robbins‐Monro conditions, multiple iterations over the whole data set with a fixed
number of prototypes converge to the solution of the k‐means clustering algorithm in an incremental
manner.
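A hedged sketch of such an incremental scheme is given below. The learning gain 1/count for each prototype is one choice that satisfies the Robbins‐Monro conditions; the function name, the initialisation from random data points and the number of epochs are assumptions for the example.

```python
import numpy as np

def online_vq(data, k, epochs=10, seed=0):
    """Incremental VQ: move the winning prototype toward each sample
    with a gain of 1/count, which decreases over time."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    # Initialise prototypes from k distinct data points.
    prototypes = data[rng.choice(len(data), size=k, replace=False)].copy()
    counts = np.zeros(k)
    for _ in range(epochs):
        for x in data[rng.permutation(len(data))]:
            i = np.argmin(((prototypes - x) ** 2).sum(axis=1))  # competitive step
            counts[i] += 1
            prototypes[i] += (x - prototypes[i]) / counts[i]    # decreasing gain
    return prototypes

# Hypothetical 2D data drawn around two centres.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(10.0, 0.5, (50, 2))])
print(online_vq(data, k=2))
```

With a single prototype the update reduces to a running mean, so the result equals the mean of the data.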
Team 5
Topic: KL Transform and Symelet Wavelet
Q1) What is KL transform? Write the major steps involved in finding the
KL transform? (10M)
The KL transform is also known as the Hotelling transform or the eigenvector transform. It is
based on the statistical properties of the image and has several important
properties that make it useful for image processing, particularly for image compression. The
main purpose of image compression is to store the image in fewer bits than the original
image. Data from neighboring pixels in an image are highly correlated, so more
compression can be achieved by de-correlating this data. The KL transform performs this
de-correlation and thus facilitates a higher degree of compression. The KL transform applies
to random signals and images and has wide applications in data reduction, rotation and data
decorrelation. (2)
(I) Find the mean vector and covariance matrix of the given image X. (2)
(II) Find the eigenvalues and then the eigenvectors of the covariance matrix. (2)
(III) Create the transformation matrix T, such that the rows of T are the eigenvectors.
Consider an image X.
(I) Find the mean vector and covariance matrix of the given image X
The image is broken down into column vectors $x_1, x_2, \dots, x_N$.
The mean vector and the covariance matrix for the vector population of size N can be approximated by the
formulas
$m_x = \frac{1}{N} \sum_{k=1}^{N} x_k$
$C_x = \frac{1}{N} \sum_{k=1}^{N} (x_k - m_x)(x_k - m_x)^T$
The covariance matrix is real and symmetric.
(II) Find the eigenvalues and then the eigenvectors of the covariance matrix
Let $v_i$ and $\lambda_i$ be the eigenvectors and eigenvalues of $C_x$, where $i = 1, \dots, n$ and n is the dimension of the vectors $x_k$.
The eigenvalues can be found using the characteristic equation $|C_x - \lambda I| = 0$.
The eigenvector corresponding to each eigenvalue can then be found from $C_x v_i = \lambda_i v_i$;
for example, $C_x v_1 = \lambda_1 v_1$ gives the first eigenvector.
(III) Create the transformation matrix T, such that the rows of T are the eigenvectors
i. The KL transformation matrix is formed using the eigenvectors. Each eigenvector is
arranged as a row of the transformation matrix.
ii. The eigenvector corresponding to the largest eigenvalue is placed in the first row, and so on.
iii. This KL transform matrix, T, is orthogonal.
We obtain the KL transformed image by multiplying the transformation matrix with
the centered image vector $(x - m_x)$.
Therefore, $y = T (x - m_x)$, where y is the transformed vector.
This is the formula for the KL transform.
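The three steps can be sketched with NumPy as below. This is an illustration under stated assumptions: the sample vectors are stored one per row, the covariance uses the 1/N factor, and the function name and test data are invented for the example. `np.linalg.eigh` is used because the covariance matrix is real and symmetric.

```python
import numpy as np

def kl_transform(vectors):
    """KL (Hotelling) transform of sample vectors stored one per row."""
    x = np.asarray(vectors, dtype=float)
    m_x = x.mean(axis=0)                      # step I: mean vector
    centered = x - m_x
    c_x = centered.T @ centered / len(x)      # step I: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(c_x)    # step II: eigenvalues (ascending), eigenvectors in columns
    order = np.argsort(eigvals)[::-1]         # largest eigenvalue first
    T = eigvecs[:, order].T                   # step III: rows of T are eigenvectors
    y = centered @ T.T                        # y_k = T (x_k - m_x) for every sample
    return y, T, m_x, eigvals[order]

# Hypothetical correlated 2D samples.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
samples = np.column_stack([a, 0.9 * a + 0.1 * rng.normal(size=200)])
y, T, m_x, lam = kl_transform(samples)
print(np.round(y.T @ y / len(y), 6))  # covariance of y is (numerically) diagonal
```

The diagonal covariance of the output is what "de-correlating the data" means in practice.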
1. What is JPEG? Explain JPEG Compression Algorithm.
JPEG is a commonly used method of lossy compression for digital images. The degree of
compression can be adjusted, allowing a tradeoff between storage size and image quality. JPEG
typically achieves a compression ratio of about 10:1 with little perceptible loss in image quality.
2. Write a short note on JPEG 2000. Brief out the difference
between JPEG and JPEG2000
JPEG 2000 standard for the compression of still images is based on the Discrete Wavelet
Transform (DWT). This transform decomposes the image using functions called wavelets.
The basic idea is to have a more localized analysis of the information which is not possible
using cosine functions whose temporal or spatial supports are identical to the data.
• Better image quality than JPEG at the same file size, or alternatively 25-35% smaller file
sizes at the same quality.
• Good image quality at low bit rates (even with compression ratios over 80:1).
• Low complexity option for devices with limited resources.
• Scalable image files: no decompression is needed for reformatting. With JPEG 2000, the
image that best matches the target device can be extracted from a single compressed file
on a server. Options include:
o Image sizes from thumbnail to full size.
o Grayscale to full 3 channel color.
o Low quality image to lossless (identical to original image).
• JPEG 2000 is more suitable for web graphics than baseline JPEG because it supports
an alpha channel (transparency component).
• Region of Interest (ROI): one can define the more interesting parts of an image, which are
coded with more bits than the surrounding areas.
Difference between JPEG and JPEG2000
Analysis
The source output is passed through a bank of filters, called the analysis filter bank, which
covers the range of frequencies that make up the source output. The passbands of the filters
can be nonoverlapping or overlapping. Nonoverlapping and overlapping filter banks are
shown in Figure 14.8. The outputs of the filters are then subsampled.
The justification for the subsampling is the Nyquist rule and its generalization, which tells us
that we only need twice as many samples per second as the range of frequencies. This means
that we can reduce the number of samples at the output of the filter because the range of
frequencies at the output of the filter is less than the range of frequencies at the input to the
filter. This process of reducing the number of samples is
called decimation, or downsampling. The amount of decimation depends on the ratio of the
bandwidth of the filter output to the filter input. If the bandwidth at the output of the filter
is 1/M of the bandwidth at the input to the filter, we would decimate the output by a factor of
M by keeping every Mth sample. The symbol M↓ is used to denote this decimation.
Once the output of the filters has been decimated, the output is encoded using one of several
encoding schemes, including ADPCM, PCM, and vector quantization.
Synthesis
The quantized and coded coefficients are used to reconstruct a representation of the original
signal at the decoder. First, the encoded samples from each subband are decoded at the
receiver. These decoded values are then upsampled by inserting an appropriate number of 0s
between samples. Once the number of samples per second has been brought back to the
original rate, the upsampled signals are passed through a bank of reconstruction filters. The
outputs of the reconstruction filters are added to give the final reconstructed outputs.
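The analysis and synthesis steps can be sketched for the simplest case of two subbands (M = 2). The Haar filter pair used here is an assumption chosen because it gives perfect reconstruction with very short filters; the encoding and quantization stage between analysis and synthesis is left out, and the function names are illustrative.

```python
import numpy as np

S = np.sqrt(2.0)
H0, H1 = np.array([1.0, 1.0]) / S, np.array([1.0, -1.0]) / S   # analysis filters (low, high)
G0, G1 = np.array([1.0, 1.0]) / S, np.array([-1.0, 1.0]) / S   # reconstruction filters

def analyze(x):
    """Filter with the analysis bank, then decimate by 2 (keep every 2nd sample)."""
    low = np.convolve(x, H0)[1::2]
    high = np.convolve(x, H1)[1::2]
    return low, high

def synthesize(low, high):
    """Upsample by inserting zeros, filter, and add the two branches."""
    n = 2 * len(low)
    up_low, up_high = np.zeros(n), np.zeros(n)
    up_low[0::2], up_high[0::2] = low, high
    return np.convolve(up_low, G0)[:n] + np.convolve(up_high, G1)[:n]

# Hypothetical even-length source output.
x = np.array([4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0])
low, high = analyze(x)
x_rec = synthesize(low, high)
print(np.round(x_rec, 6))
```

Without quantization the reconstruction matches the input; in a real coder the distortion comes from how `low` and `high` are encoded.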
We can see that the basic subband system is simple. The three major components of this
system are the analysis and synthesis filters, the bit allocation scheme, and
the encoding scheme.
The bit allocation procedures have also been extensively studied in the contexts of subband
coding, wavelet-based coding, and transform coding.
The separation of the source output according to frequency also opens up the possibility for
innovative ways to use compression algorithms. The decomposition of the source output in
this manner provides inputs for the compression algorithms, each of which has more clearly
defined characteristics than the original source output. We can use these characteristics to
select separate compression schemes appropriate to each of the different inputs.
Human perception of audio and video inputs is frequency dependent. We can use this fact to
design our compression schemes so that the frequency bands that are most important to
perception are reconstructed most accurately. Whatever distortion there has to be is
introduced in the frequency bands to which humans are least sensitive.
2. Explain the embedded zerotree coder as an application of wavelets for image compression.
2. Prediction of the absence of significant information across scales by exploiting the self-
similarity inherent in images
4. “Universal” lossless data compression which is achieved via adaptive arithmetic coding
Why Wavelets?
• Traditional DCT & subband coding: trends “obscure” anomalies that carry info – E.g.,
edges get spread, yielding many non-zero coefficients to be coded
• Wavelets are better at localizing edges and other anomalies – Yields a few non-zero
coefficients & many zero coefficients – Difficulty: telling the decoder “where” the few non-
zero’s are!!!
• Natural images in general have a low pass spectrum. – the wavelet coefficients will, on
average, be smaller in the higher subbands than in the lower subbands.
• Large wavelet coefficients are more important than smaller wavelet coefficients.
The embedded zerotree wavelet (EZW) algorithm is a simple yet powerful algorithm with the property that the
bits in the stream are generated in order of their importance. The first step in this
algorithm is setting an initial threshold. Any wavelet coefficient is said to be
significant if its absolute value is greater than the threshold. In a hierarchical sub-band
system, every coefficient is spatially related to a coefficient in the lower band. Such
coefficients in the higher bands are called ‘descendants’. This is shown in the figure.
If a coefficient is significant and positive, then it is coded as ‘positive significant’ (ps). If a
coefficient is significant and negative, then it is coded as ‘negative significant’ (ns). If a
coefficient is insignificant and all its descendants are insignificant as well, then it is coded as
‘zero tree root’ (ztr). If a coefficient is insignificant but at least one of its descendants is
significant, then it is coded as ‘isolated zero’ (iz). The algorithm involves two passes:
the dominant pass and the subordinate pass. In the dominant pass, the initial threshold is set to one
half of the maximum coefficient magnitude. Subsequent passes have threshold values one half of the
previous threshold. The coefficients are then coded as ps, ns, iz or ztr according to their
values. The important part is that if a coefficient is a zerotree root, then its descendants need
not be encoded. Thus only the significant values are encoded.
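The four-symbol decision of the dominant pass can be sketched as a small function. This is a simplified illustration: the caller is assumed to supply the list of descendant coefficients, and the scan order and tree construction of a full coder are omitted. The function name is hypothetical.

```python
def classify(coefficient, descendants, threshold):
    """Return the dominant-pass symbol for one coefficient.

    A value is significant when its absolute value is greater than the threshold.
    """
    if abs(coefficient) > threshold:
        return "ps" if coefficient > 0 else "ns"
    if all(abs(d) <= threshold for d in descendants):
        return "ztr"   # insignificant, and so are all descendants
    return "iz"        # insignificant, but some descendant is significant

# Hypothetical coefficients with threshold 32.
print(classify(40, [3, -2], 32))    # ps
print(classify(-35, [1], 32))       # ns
print(classify(10, [5, -3], 32))    # ztr
print(classify(10, [5, -50], 32))   # iz
```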
In the subordinate pass, those coefficients which were found significant in the dominant pass
are quantized based on the pixel value. In the first pass, the threshold is half of the maximum
magnitude, so the interval is divided into two and the subordinate pass codes a 1 if the
coefficient is in the upper half of the interval and codes a 0 if the coefficient is in the lower
half of the interval. Thus if the number of passes is increased, the precision of the coefficients
is increased.
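One refinement step of the subordinate pass can be sketched as follows, assuming the current uncertainty interval of a significant coefficient's magnitude is tracked as a (low, high) pair. The function name and the interval bookkeeping are assumptions for the example.

```python
def subordinate_bit(magnitude, low, high):
    """Emit 1 if the magnitude lies in the upper half of [low, high), else 0,
    and return the narrowed interval for the next pass."""
    mid = (low + high) / 2.0
    if magnitude >= mid:
        return 1, (mid, high)
    return 0, (low, mid)

# Hypothetical first pass: threshold 32, so magnitudes lie in [32, 64).
print(subordinate_bit(50, 32, 64))  # upper half
print(subordinate_bit(40, 32, 64))  # lower half
```

Each additional pass halves the interval, which is why more passes give more precise coefficients.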
The EZW scheme implemented divides an image into blocks of standard size and performs
coding on each of the blocks. It can be seen that for very small block sizes and very large
block sizes the visual distortions are more pronounced. For small block sizes, the
distortions may be due to the size of the header details that must be added for each block,
which is significant compared to the information in the block itself. For large
block sizes, the initial threshold is high, so many passes are needed to achieve significant
visual quality.
Q1. (a) What is the Haar transform? State some properties of the Haar transform.
(5 marks)
Ans: The Haar wavelet is a sequence of rescaled "square-shaped" functions which together form
a wavelet family or basis. Wavelet analysis is similar to Fourier analysis in that it allows a target
function over an interval to be represented in terms of an orthonormal basis.
The Haar wavelet is also the simplest possible wavelet. The technical disadvantage of the
Haar wavelet is that it is not continuous, and therefore not differentiable. This property can, however,
be an advantage for the analysis of signals with sudden transitions, such as monitoring of tool failure
in machines.
Wavelets can keep track of both time and frequency information. There are two functions that play a primary
role in wavelet analysis, the scaling function (father wavelet) and the wavelet (mother wavelet). The
simplest wavelet analysis is based on the Haar scaling function.
Its scaling function can be described as
$\phi(x) = 1$ for $0 \le x < 1$, and $\phi(x) = 0$ otherwise.
Haar wavelet's properties:
(1) Any function can be written as a linear combination of $\phi(x), \phi(2x), \phi(2^2 x), \dots, \phi(2^k x), \dots$
and their shifted versions.
(2) Any function can be written as a linear combination of the constant function,
$\psi(x), \psi(2x), \psi(2^2 x), \dots, \psi(2^k x), \dots$ and their shifted versions.
Ans: Daubechies wavelets are usually defined by their number of vanishing moments(m), or,
equivalently, the length of the corresponding filter (2m). The Haar wavelet is a special case of
the Daubechies, with m=1.
Several aspects can be considered, all connected with each other. The two most important, I
think, are the following:
- it is somehow related to the regularity (smoothness, if you wish) of the wavelet: the higher m,
the more regular the wavelet. With only one vanishing moment, Haar is particularly irregular, in
fact it is even discontinuous. More regular wavelets will be able to approximate, or maybe
compress, regular signals more efficiently. This is linked to a property called polynomial
suppression: a wavelet with m vanishing moments is orthogonal to all polynomials of degree
up to m − 1.
- it is also related to the support of the wavelet function, or equivalently of its corresponding
filter. The higher the number of vanishing moments, the larger the support, which means more
computation. Daubechies wavelets are optimal for a given number of vanishing moments (i.e.
they have the smallest support); since Haar is the simplest Daubechies wavelet, it is the cheapest
wavelet in terms of computation (and it is relatively simple to implement).
Q2. Explain vector quantization, list the applications and describe how it is used in data
compression. (10 marks)
Vector quantization
The idea of scalar quantization generalizes immediately to vector quantization (VQ). In this case,
we have to perform quantization over blocks of data, instead of a single scalar value. The
quantization output is an index value which indicates another data block (vector) from a finite set
of vectors, called the codebook. The selected vector is usually an approximation of the input data
block. The important point about VQ is that we require reproduction vectors (instead of
reproduction levels) that are known by both the encoder and the decoder. The encoder takes an input
vector, determines the best representing reproduction vector, and transmits the index of that
vector. The decoder takes that index and outputs the corresponding reproduction vector in place
of the original, which it can do because it already knows the reproduction vectors. Consider the following figure:
The three 2x2 data vectors at the left (x1, x2, x3) are quantized to the 2x2 data vector at the right
(Yi). This means that the encoder transmits the symbol which represents vector Yi when it
encounters any of the 2x2 vectors at the left as its input. Obviously, Yi should be a good representation
for the left vectors. The decoder, therefore, reproduces Yi at the places of the original 2x2
vectors at the left. The issue of "how good a representation the right vector is for the left
vectors" is still valid, and the distortion measurements will be similar to the scalar case. The
overall encoder and decoder can be given as:
The encoder, therefore, just finds the vector in the set from Y1 to Yn which is closest to the
input vector Xn. Let's say that the closest vector is Yi. The encoder then transmits the index
corresponding to Yi, which is i. The task of the decoder is even easier. It just gets the index i and
extracts the vector Yi from the codebook, which is the same as the codebook of the encoder. The
quantized version of Xn is, therefore, Yi.
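The encoder and decoder described above can be sketched as a nearest-neighbour search over a shared codebook. The codebook entries and input blocks below are invented values, with each 2x2 block flattened to a 4-element vector; squared Euclidean distance is assumed as the distortion measure.

```python
import numpy as np

def vq_encode(blocks, codebook):
    """For each input vector, return the index of the closest codebook vector."""
    d = ((blocks[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def vq_decode(indices, codebook):
    """Look up the reproduction vector for each received index."""
    return codebook[indices]

# Hypothetical codebook shared by encoder and decoder (2x2 blocks, flattened).
codebook = np.array([
    [0.0, 0.0, 0.0, 0.0],
    [100.0, 100.0, 100.0, 100.0],
    [0.0, 100.0, 0.0, 100.0],
])
blocks = np.array([
    [3.0, 1.0, 0.0, 2.0],
    [98.0, 101.0, 99.0, 97.0],
    [5.0, 95.0, 2.0, 99.0],
])
indices = vq_encode(blocks, codebook)       # only these indices are transmitted
reconstructed = vq_decode(indices, codebook)
print(indices)
```

Only the indices are sent, so the bit rate depends on the codebook size rather than on the block contents.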
APPLICATION
Vector quantization is used for lossy data compression, lossy data correction, pattern recognition,
density estimation and clustering.
Lossy data correction, or prediction, is used to recover data missing from some dimensions. It is
done by finding the nearest group with the data dimensions available, then predicting the result
based on the values for the missing dimensions, assuming that they will have the same value as
the group's centroid.
For density estimation, the area/volume that is closer to a particular centroid than to any other is
inversely proportional to the density (due to the density matching property of the algorithm).
Team 10 – Manu, Skanda, Varun, Rajath Jain, Vishwas, Prince, Shakeel, Vinay
Topic: Quantization and coding of transform coefficients
1. What are transforms? Explain DCT with suitable diagram and mention its
advantages.
A.
2. Describe multistage vector quantization and illustrate a three stage quantizer.
A.