
Image compression

Written by Ahmed Hesham Mostafa, Sec 1, Level 4, CS


Image compression is an application of data compression to digital images. The objective is to reduce redundancy in the image data so that it can be stored or transmitted in an efficient form. Image compression may be lossy or lossless.

Lossless compression: Lossless compression is preferred for archival purposes and often for medical imaging, technical drawings, clip art, or comics. This is because lossy compression methods, especially when used at low bit rates, introduce compression artifacts.

Lossy compression: Lossy methods are especially suitable for natural images such as photographs in applications where minor (sometimes imperceptible) loss of fidelity is acceptable to achieve a substantial reduction in bit rate. Lossy compression that produces imperceptible differences may be called visually lossless.

Methods for lossless image compression include:

1) Entropy encoding


In information theory, an entropy encoding is a lossless data compression scheme that is independent of the specific characteristics of the medium. Two of the most common entropy encoding techniques are Huffman coding and arithmetic coding.

a) Huffman coding

Decompression walks the Huffman coding tree one bit at a time, in pseudocode:

int bits;
while (true) {
    if ((bits = input.readbits(1)) == -1) {
        System.err.println("should not happen! trouble reading bits");
    } else {
        // use the zero/one value of the bit read
        // to traverse the Huffman coding tree;
        // if a leaf is reached, decode the character and print it UNLESS
        // the character is pseudo-EOF, in which case decompression is done
        if ((bits & 1) == 0)
            // read a 0, go left in tree
        else
            // read a 1, go right in tree

        if (at leaf-node in tree) {
            if (leaf-node stores pseudo-eof char)
                break;              // out of loop
            else
                write character stored in leaf-node
        }
    }
}
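
To make the idea concrete, here is a small, self-contained Java sketch of the same traversal. HuffNode, BitInput and PSEUDO_EOF are helper types and constants assumed here for illustration; they are not part of the pseudocode above.

import java.io.IOException;
import java.io.Writer;

class HuffmanDecoder {
    static final int PSEUDO_EOF = 256;               // one value beyond the byte range

    interface BitInput { int readBit() throws IOException; }   // returns 0, 1, or -1 at end of stream

    static class HuffNode {
        final int value;                             // symbol stored at a leaf
        final HuffNode left, right;
        HuffNode(int value, HuffNode left, HuffNode right) {
            this.value = value; this.left = left; this.right = right;
        }
        boolean isLeaf() { return left == null && right == null; }
    }

    static void decode(HuffNode root, BitInput input, Writer out) throws IOException {
        HuffNode node = root;
        while (true) {
            int bit = input.readBit();
            if (bit == -1) {
                throw new IOException("ran out of bits before pseudo-EOF");
            }
            node = (bit == 0) ? node.left : node.right;   // 0 = go left, 1 = go right
            if (node.isLeaf()) {
                if (node.value == PSEUDO_EOF) break;      // decompression done
                out.write(node.value);                    // emit the decoded character
                node = root;                              // restart at the root for the next symbol
            }
        }
    }
}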

Huffman Coding of Images


In order to encode images:

- Divide the image into 8x8 blocks.
- Treat each block as a symbol to be coded.
- Compute Huffman codes for the set of blocks.
- Encode the blocks accordingly.
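
A sketch of the third step, assuming each 8x8 block has already been mapped to an integer symbol and counted into a frequency table; the priority-queue construction below is the standard way to build the Huffman tree and read off the code words.

import java.util.*;

class HuffmanBuilder {
    static class Node implements Comparable<Node> {
        long freq; int symbol = -1; Node left, right;
        public int compareTo(Node o) { return Long.compare(freq, o.freq); }
    }

    // Returns a map from block symbol to its code word, written as a 0/1 string.
    static Map<Integer, String> buildCodes(Map<Integer, Long> freq) {
        PriorityQueue<Node> pq = new PriorityQueue<>();
        for (Map.Entry<Integer, Long> e : freq.entrySet()) {
            Node n = new Node();
            n.symbol = e.getKey();
            n.freq = e.getValue();
            pq.add(n);
        }
        while (pq.size() > 1) {                       // repeatedly merge the two rarest nodes
            Node a = pq.poll(), b = pq.poll();
            Node parent = new Node();
            parent.freq = a.freq + b.freq;
            parent.left = a; parent.right = b;
            pq.add(parent);
        }
        Map<Integer, String> codes = new HashMap<>();
        assign(pq.poll(), "", codes);
        return codes;
    }

    private static void assign(Node n, String prefix, Map<Integer, String> codes) {
        if (n == null) return;
        if (n.left == null && n.right == null) {      // leaf: record its code word
            codes.put(n.symbol, prefix.isEmpty() ? "0" : prefix);
            return;
        }
        assign(n.left, prefix + "0", codes);
        assign(n.right, prefix + "1", codes);
    }
}

With the codes in hand, each block is written out as the bit string assigned to its symbol, and the code table (or the frequencies) must also be stored so the decoder can rebuild the same tree.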

b) Arithmetic coding
Encoding. This is the pseudocode for the initialization:

- Get the probabilities and scale them
- Save the probabilities in the output file
- High = FFFFh (16 bits)
- Low = 0000h (16 bits)
- Underflow_bits = 0 (16 bits should be enough)

Where:

- High and low define the interval in which the output number falls.
- Underflow_bits counts the bits that could have produced underflow and thus were shifted out.

And the routine to encode a symbol:


Range = ( high - low ) + 1
High = low + ( ( range * high_values [ symbol ] ) / scale ) - 1
Low = low + ( range * high_values [ symbol - 1 ] ) / scale
Loop (it will exit when no more bits can be output or shifted):
  Msb of high = msb of low?
    Yes:
      - Output msb of low
      - While underflow_bits > 0: output Not( msb of low ), these are the underflow bits pending for output
      - Go to shift
    No:
      - Second msb of low = 1 and second msb of high = 0? (check for underflow)
        Yes:
          - Underflow_bits += 1
          - Low = low & 3FFFh    (here we shift to avoid underflow)
          - High = high | 4000h
          - Go to shift
        No:
          - The routine for encoding a symbol ends here.

Shift:
  - Shift low to the left one time. (Now we have to put new bits in low and high.)
  - Shift high to the left one time, and OR the lsb with the value 1.
  - Repeat from the first loop.
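
A Java sketch of this encoder update, following the 16-bit scheme above. BitOutput and the cumulative-count table cum are assumptions made for illustration: cum[k] holds the scaled count of all symbols below k, so cum[symbol] plays the role of high_values[symbol - 1] and cum[symbol + 1] the role of high_values[symbol].

class ArithmeticEncoder {
    static final int MASK = 0xFFFF, TOP = 0x8000, SECOND = 0x4000;
    int low = 0x0000, high = 0xFFFF, underflowBits = 0;   // the initialization above

    void encodeSymbol(int symbol, int[] cum, int scale, BitOutput out) {
        long range = (long) (high - low) + 1;
        high = low + (int) (range * cum[symbol + 1] / scale) - 1;
        low  = low + (int) (range * cum[symbol] / scale);
        while (true) {
            if ((high & TOP) == (low & TOP)) {
                int bit = (low >>> 15) & 1;
                out.writeBit(bit);                         // msb is settled: output it
                for (; underflowBits > 0; underflowBits--) {
                    out.writeBit(bit ^ 1);                 // flush pending underflow bits
                }
            } else if ((low & SECOND) != 0 && (high & SECOND) == 0) {
                underflowBits++;                           // low = 01..., high = 10...
                low &= 0x3FFF;                             // drop the second msb of both
                high |= 0x4000;
            } else {
                return;                                    // nothing more can be shifted out
            }
            low = (low << 1) & MASK;                       // a 0 enters low ...
            high = ((high << 1) | 1) & MASK;               // ... and a 1 enters high
        }
    }
}

interface BitOutput { void writeBit(int bit); }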

Decoding

The first thing to do when decoding is to read the probabilities; since the encoder did the scaling, you just have to read them and build the ranges. The process is the following: see in which symbol's range our number falls, then extract the code of this symbol. Before starting we have to initialize "code": this value will hold the bits read from the input, so initialize it with the first 16 bits of the input. And this is how it's done:

Range = ( high - low ) + 1
See where the number lands:
  Temp = ( ( ( code - low ) + 1 ) * scale - 1 ) / range
  See which symbol corresponds to temp.
Extract the symbol code:
  Range = ( high - low ) + 1
  High = low + ( ( range * high_values [ symbol ] ) / scale ) - 1
  Low = low + ( range * high_values [ symbol - 1 ] ) / scale
  (Note that these formulae are the same ones the encoder uses.)
Loop:
  Msb of high = msb of low?
    Yes:
      - Go to shift
    No:
      - Second msb of low = 1 and second msb of high = 0?
        Yes:
          - Code = code ^ 4000h
          - Low = low & 3FFFh
          - High = high | 4000h
          - Go to shift
        No:
          - The routine for decoding a symbol ends here.

Shift:
  - Shift low to the left one time. (Now we have to put new bits in low, high and code.)
  - Shift high to the left one time, and OR the lsb with the value 1.
  - Shift code to the left one time, and put the next input bit in its lsb.
  - Repeat from the loop.
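
A matching sketch of the decoder update, under the same assumptions as the encoder sketch (a cum[] cumulative-count table and a bit-level input, here BitInput); code holds the 16 bits most recently read from the input.

class ArithmeticDecoder {
    static final int MASK = 0xFFFF, TOP = 0x8000, SECOND = 0x4000;
    int low = 0x0000, high = 0xFFFF, code;                 // init code with the first 16 input bits

    int decodeSymbol(int[] cum, int scale, BitInput in) {
        long range = (long) (high - low) + 1;
        int temp = (int) ((((long) (code - low) + 1) * scale - 1) / range);
        int symbol = 0;
        while (cum[symbol + 1] <= temp) symbol++;          // find the symbol whose range holds temp
        high = low + (int) (range * cum[symbol + 1] / scale) - 1;
        low  = low + (int) (range * cum[symbol] / scale);
        while (true) {
            if ((high & TOP) == (low & TOP)) {
                // msbs match: nothing special to do, just fall through to the shift
            } else if ((low & SECOND) != 0 && (high & SECOND) == 0) {
                code ^= SECOND;                            // undo the underflow shift in code too
                low &= 0x3FFF;
                high |= 0x4000;
            } else {
                return symbol;                             // decoding of this symbol is done
            }
            low = (low << 1) & MASK;
            high = ((high << 1) | 1) & MASK;
            code = ((code << 1) | in.readBit()) & MASK;    // pull the next bit from the input
        }
    }
}

interface BitInput { int readBit(); }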

2) Run-length encoding
Encoding strings (traditional RLE)

Encoding using traditional RLE is fairly simple:

Step 1. Set the previous symbol equal to an unmatchable value.
Step 2. Read the next symbol from the input stream.
Step 3. If the symbol is an EOF, exit.
Step 4. Write out the current symbol.
Step 5. If the symbol does not match the previous symbol, set the previous symbol to the current symbol, and go to step 2.
Step 6. Read and count additional symbols until a non-matching symbol is found. This is the run length.
Step 7. Write out the run length.
Step 8. Write out the non-matching symbol.
Step 9. Set the previous symbol to the non-matching symbol, and go to step 2.

When actually implementing traditional RLE, a little attention to detail is required in Step 6. The run length is stored in a finite number of bits (I used an unsigned char). Runs longer than the amount that can be counted need to be broken up into smaller runs. When the maximum count is reached, just write the count value and start the process of looking for a run all over again. You also need to handle the case where a run is ended by an EOF: when that happens, write out the run length and exit. That's all there is to it.
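
A minimal Java sketch of these steps; it stores the run count in one unsigned byte, as discussed above, and uses -1 (which in.read() never returns for a data byte) as the unmatchable value.

import java.io.*;

class TraditionalRle {
    static void encode(InputStream in, OutputStream out) throws IOException {
        final int MAX_RUN = 255;             // the count is stored in one unsigned byte
        int prev = -1;                       // -1 serves as the unmatchable value
        int c = in.read();
        while (c != -1) {
            out.write(c);                    // step 4: always write the current symbol
            if (c != prev) {                 // step 5: not (yet) a run
                prev = c;
                c = in.read();
                continue;
            }
            // Steps 6-9: two identical symbols seen; count further matches.
            int runLength = 0;
            while ((c = in.read()) == prev && runLength < MAX_RUN) {
                runLength++;
            }
            out.write(runLength);            // step 7: write the count
            prev = -1;                       // both coder and decoder forget the run symbol here
            // c now holds the symbol that ended the run (or -1 at EOF); the outer
            // loop writes it out and processes it as the next "current" symbol.
        }
    }
}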

Encoding using the PackBits variant is slightly more complicated than traditional RLE. The block header cannot be written until the type of block and its length have been determined. Until then, data must be held in a buffer of the maximum length that can be copied verbatim. The following steps describe PackBits-style encoding:

Step 1. Read symbols from the input stream into the buffer until one of the following occurs:
  A. The buffer is full (go to step 2)
  B. An EOF is reached (go to step 3)
  C. The last three symbols are identical (go to step 4)
Step 2. If the buffer is full:
  A. Write the buffer size - 1
  B. Write the contents of the buffer
  C. Go to step 1
Step 3. If the symbol is an EOF:
  A. Write the number of symbols in the buffer - 1
  B. Write the contents of the buffer
  C. Exit
Step 4. If the last three symbols match, a run has been found. Determine the number of symbols in the buffer prior to the start of the run (n).
Step 5. Write n - 1 followed by the contents of the buffer up to the start of the run.
Step 6. Set the run length to 3.
Step 7. Read additional symbols until a non-matching symbol is found. Increment the run length for each matching symbol.
Step 8. Write out 2 - the run length, followed by the run symbol.
Step 9. Write the non-matching symbol to the buffer and go to step 1.

That's pretty much all there is. You need to stop counting your run length in Step 7 if it reaches the maximum length you can account for in your header. My actual implementation is also a little less greedy: when I reach the maximum number of symbols that can be copied verbatim, I read an extra symbol or two in case the symbols at the end of a buffer are actually the start of a run.
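
A sketch of the PackBits-style encoder described above, under these assumptions: the block header is one signed byte, at most 128 symbols are copied verbatim per block (header = count - 1, giving 0..127), and a run of length L (3 to 130) is written as header = 2 - L followed by the run symbol.

import java.io.*;

class PackBitsEncoder {
    static void encode(InputStream in, OutputStream out) throws IOException {
        final int MAX_COPY = 128;            // longest verbatim block (header 0..127)
        final int MAX_RUN = 130;             // longest run (header 2 - 130 = -128)
        int[] buf = new int[MAX_COPY];
        int count = 0;                       // symbols currently buffered
        int c = in.read();
        while (c != -1) {
            buf[count++] = c;
            boolean run = count >= 3
                    && buf[count - 1] == buf[count - 2]
                    && buf[count - 2] == buf[count - 3];
            if (run) {
                // Step 5: flush any literals buffered before the run started.
                int literals = count - 3;
                if (literals > 0) {
                    out.write(literals - 1);
                    for (int i = 0; i < literals; i++) out.write(buf[i]);
                }
                // Steps 6-8: extend the run, then write its (negative) header.
                int runSym = buf[count - 1];
                int runLen = 3;
                c = in.read();
                while (c == runSym && runLen < MAX_RUN) {
                    runLen++;
                    c = in.read();
                }
                out.write(2 - runLen);       // stored as a signed byte
                out.write(runSym);
                count = 0;                   // c already holds the next symbol (or -1)
            } else if (count == MAX_COPY) {
                // Step 2: buffer full, copy it verbatim.
                out.write(count - 1);
                for (int i = 0; i < count; i++) out.write(buf[i]);
                count = 0;
                c = in.read();
            } else {
                c = in.read();
            }
        }
        // Step 3: EOF with leftover literals in the buffer.
        if (count > 0) {
            out.write(count - 1);
            for (int i = 0; i < count; i++) out.write(buf[i]);
        }
    }
}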

Decoding traditionally encoded strings is even easier than encoding. Not only are there fewer steps, but there are no caveats. To decode a traditionally encoded stream:

Step 1. Set the previous symbol equal to an unmatchable value.
Step 2. Read the next symbol from the input stream.
Step 3. If the symbol is an EOF, exit.
Step 4. Write out the current symbol.
Step 5. If the symbol does not match the previous symbol, set the previous symbol to the current symbol, and go to step 2.
Step 6. Read the run length.
Step 7. Write out a run of the current symbol as long as indicated by the run length.
Step 8. Go to step 1.
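
The corresponding decoder, as a sketch that matches the traditional encoder sketch above:

import java.io.*;

class TraditionalRleDecoder {
    static void decode(InputStream in, OutputStream out) throws IOException {
        int prev = -1;                           // unmatchable value
        int c;
        while ((c = in.read()) != -1) {
            out.write(c);                        // step 4
            if (c != prev) {
                prev = c;                        // step 5
            } else {
                int runLength = in.read();       // step 6: the count written by the encoder
                for (int i = 0; i < runLength; i++) out.write(c);   // step 7
                prev = -1;                       // step 8: start over
            }
        }
    }
}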

If that wasn't easy enough, it is even easier to decode strings encoded by the variant PackBits algorithm. To decode a variant PackBits encoded stream:

Step 1. Read the block header (n).
Step 2. If the header is an EOF, exit.
Step 3. If n is non-negative, copy the next n + 1 symbols to the output stream and go to step 1.
Step 4. If n is negative, write 2 - n copies of the next symbol to the output stream and go to step 1.
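
And a sketch of the PackBits-style decoder; the only subtlety is re-interpreting the header byte as a signed value.

import java.io.*;

class PackBitsDecoder {
    static void decode(InputStream in, OutputStream out) throws IOException {
        int header;
        while ((header = in.read()) != -1) {
            int n = (byte) header;                   // reinterpret as a signed byte
            if (n >= 0) {                            // literal block: copy n + 1 symbols
                for (int i = 0; i <= n; i++) out.write(in.read());
            } else {                                 // run block: 2 - n copies of one symbol
                int symbol = in.read();
                for (int i = 0; i < 2 - n; i++) out.write(symbol);
            }
        }
    }
}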

3) DEFLATE
do
   read block header from input stream.
   if stored with no compression
      skip any remaining bits in current partially processed byte
      read LEN and NLEN
      copy LEN bytes of data to output
   otherwise
      if compressed with dynamic Huffman codes
         read representation of code trees
      loop (until end of block code recognized)
         decode literal/length value from input stream
         if value < 256
            copy value (literal byte) to output stream
         otherwise
            if value = end of block (256)
               break from loop
            otherwise (value = 257..285)
               decode distance from input stream
               move backwards distance bytes in the output stream, and
               copy length bytes from this position to the output stream.
      end loop
while not last block
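
This is the decoding logic that java.util.zip implements internally; as a usage sketch, the standard Deflater and Inflater classes can compress and decompress a byte array without re-implementing the bit-level details.

import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class DeflateDemo {
    static byte[] deflate(byte[] data) {
        // new Deflater() wraps the raw DEFLATE stream in the zlib format;
        // pass nowrap=true to Deflater(level, nowrap) for raw DEFLATE blocks.
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);           // emits compressed blocks
            out.write(buf, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    static byte[] inflate(byte[] data) throws Exception {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        while (!inflater.finished()) {
            int n = inflater.inflate(buf);           // decodes literal/length/distance codes
            out.write(buf, 0, n);
        }
        inflater.end();
        return out.toByteArray();
    }
}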

Methods for lossy compression:

1) Chroma subsampling


Because the human visual system is less sensitive to the position and motion of color than luminance,[1] bandwidth can be optimized by storing more luminance detail than color detail. At normal viewing distances, there is no perceptible loss incurred by sampling the color detail at a lower rate. In video systems, this is achieved through the use of color difference components. The signal is divided into a luma (Y') component and two color difference components (chroma). Chroma subsampling deviates from color science in that the luma and chroma components are formed as a weighted sum of gamma-corrected (tristimulus) R'G'B' components instead of linear (tristimulus) RGB components. As a result, luminance and color detail are not completely independent of one another; there is some "bleeding" of luminance and color information between the luma and chroma components. The error is greatest for highly saturated colors and can be somewhat noticeable between the magenta and green bars of a color-bars test pattern that has chroma subsampling applied. This engineering approximation (reversing the order of operations between gamma correction and forming the weighted sum) allows color subsampling to be more easily implemented.
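
As a rough sketch of the idea, the snippet below forms luma with the BT.601 weights and stores one pair of color-difference samples per 2x2 block (a 4:2:0 layout). The weights and layout are assumptions chosen for illustration, and the image dimensions are assumed to be even.

class ChromaSubsampling {
    // R, G, B are gamma-corrected components; returns { Y (full res), Cb, Cr (quarter res) }.
    static double[][][] to420(double[][] R, double[][] G, double[][] B) {
        int h = R.length, w = R[0].length;               // h and w assumed even
        double[][] Y  = new double[h][w];
        double[][] Cb = new double[h / 2][w / 2];
        double[][] Cr = new double[h / 2][w / 2];
        for (int y = 0; y < h; y++)
            for (int x = 0; x < w; x++)
                // Luma is a weighted sum of the gamma-corrected components (BT.601).
                Y[y][x] = 0.299 * R[y][x] + 0.587 * G[y][x] + 0.114 * B[y][x];
        for (int y = 0; y < h; y += 2)
            for (int x = 0; x < w; x += 2) {
                double yAvg = 0, bAvg = 0, rAvg = 0;
                for (int dy = 0; dy < 2; dy++)
                    for (int dx = 0; dx < 2; dx++) {
                        yAvg += Y[y + dy][x + dx];
                        bAvg += B[y + dy][x + dx];
                        rAvg += R[y + dy][x + dx];
                    }
                // Color-difference components, averaged over the 2x2 block
                // and stored at quarter resolution.
                Cb[y / 2][x / 2] = 0.564 * (bAvg - yAvg) / 4.0;
                Cr[y / 2][x / 2] = 0.713 * (rAvg - yAvg) / 4.0;
            }
        return new double[][][] { Y, Cb, Cr };
    }
}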

2) Transform coding
The idea of transform coding is to transform the input into a different form which can then either be compressed better, or for which we can more easily drop certain terms without as much qualitative loss in the output. One form of transform is to select a linear set of basis functions phi_i that span the space to be transformed. Some common sets are built from sines, cosines, polynomials, spherical harmonics, Bessel functions, and wavelets.

For a set of n values, a transform can be expressed as an n x n matrix T. Multiplying the input by this matrix T gives the transformed coefficients, and multiplying the coefficients by T^-1 converts the data back to the original form. For example, the coefficients for the discrete cosine transform (DCT) are

  T_ij = sqrt(1/n) * cos( (2j+1) * i * pi / (2n) )   for i = 0, 0 <= j < n
  T_ij = sqrt(2/n) * cos( (2j+1) * i * pi / (2n) )   for 0 < i < n, 0 <= j < n

The DCT is one of the most commonly used transforms in practice for image compression, more so than the discrete Fourier transform (DFT). This is because the DFT assumes periodicity, which is not necessarily true in images. In particular, representing a linear function over a region requires many large-amplitude high-frequency components in a DFT, because the periodicity assumption views the function as a sawtooth, which is highly discontinuous at the teeth and therefore needs those high-frequency components. The DCT does not assume periodicity and requires only much lower-amplitude high-frequency components. The DCT also does not require a phase, which is typically represented using complex numbers in the DFT.

For the purpose of compression, the properties we would like of a transform are (1) it decorrelates the data, (2) many of the transformed coefficients are small, and (3) from the point of view of perception, some of the terms are more important than others.
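
A small Java sketch of the DCT matrix defined above and of applying it to a length-n signal; because T is orthonormal, the inverse transform is multiplication by the transpose.

class Dct {
    // Builds the n x n DCT-II matrix T_ij from the formula above.
    static double[][] dctMatrix(int n) {
        double[][] T = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++) {
                double scale = (i == 0) ? Math.sqrt(1.0 / n) : Math.sqrt(2.0 / n);
                T[i][j] = scale * Math.cos((2 * j + 1) * i * Math.PI / (2.0 * n));
            }
        return T;
    }

    // Forward transform: coefficients = T * x.
    static double[] transform(double[][] T, double[] x) {
        int n = x.length;
        double[] coeff = new double[n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                coeff[i] += T[i][j] * x[j];
        return coeff;
    }
}

For 8x8 image blocks, as in JPEG, this would be used with n = 8 and applied first along the rows and then along the columns of each block.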

3) Fractal compression
Fractal image representation can be described mathematically as an iterated function system (IFS).

For binary images: we begin with the representation of a binary image, where the image may be thought of as a subset of R^2. An IFS is a set of contraction mappings f_1, ..., f_N, each mapping R^2 to R^2.

According to these mapping functions, the IFS describes a two-dimensional set S as the fixed point of the Hutchinson operator

  H(A) = f_1(A) ∪ f_2(A) ∪ ... ∪ f_N(A)

That is, H is an operator mapping sets to sets, and S is the unique set satisfying H(S) = S. The idea is to construct the IFS such that this set S is the input binary image. The set S can be recovered from the IFS by fixed point iteration: for any nonempty compact initial set A_0, the iteration A_(k+1) = H(A_k) converges to S. The set S is self-similar because H(S) = S implies that S is a union of mapped copies of itself:

  S = f_1(S) ∪ f_2(S) ∪ ... ∪ f_N(S)

So we see that the IFS is a fractal representation of S.

Extension to grayscale: the IFS representation can be extended to a grayscale image by considering the image's graph as a subset of R^3. For a grayscale image u(x,y), consider the set S = {(x, y, u(x,y))}. Then, similar to the binary case, S is described by an IFS using a set of contraction mappings f_1, ..., f_N, but in R^3, i.e. each f_i maps R^3 to R^3.

Encoding: a challenging problem of ongoing research in fractal image representation is how to choose the f_1, ..., f_N such that the fixed point of the IFS approximates the input image, and how to do this efficiently. A simple approach[1] for doing so is the following:

1. Partition the image domain into range blocks R_i of size s x s.
2. For each R_i, search the image to find a block D_i of size 2s x 2s that is very similar to R_i.
3. Select the mapping functions such that H(D_i) = R_i for each i.

In the second step, it is important to find a similar block so that the IFS accurately represents the input image, so a sufficient number of candidate blocks for D_i need to be considered. On the other hand, a large search considering many blocks is computationally costly. This bottleneck of searching for similar blocks is why fractal encoding is much slower than, for example, DCT- and wavelet-based image representations.
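
A simplified sketch of the search in step 2, with several assumptions made for illustration: the image is a 2-D array of doubles, domain blocks are taken on a grid with a given stride, each 2s x 2s domain block is shrunk to s x s by averaging 2x2 pixels, and similarity is plain mean squared error (a real encoder would also fit a contrast/brightness transform and try the eight block symmetries).

class FractalSearch {
    // Returns {column, row} of the top-left corner of the most similar domain block
    // for the range block whose top-left corner is at column rx, row ry.
    static int[] findDomainBlock(double[][] img, int rx, int ry, int s, int step) {
        int h = img.length, w = img[0].length;
        double best = Double.MAX_VALUE;
        int[] bestPos = { -1, -1 };
        for (int dy = 0; dy + 2 * s <= h; dy += step) {
            for (int dx = 0; dx + 2 * s <= w; dx += step) {
                double err = 0;
                for (int y = 0; y < s; y++) {
                    for (int x = 0; x < s; x++) {
                        // Shrink the 2s x 2s domain block to s x s by averaging 2x2 pixels.
                        double d = (img[dy + 2 * y][dx + 2 * x]     + img[dy + 2 * y][dx + 2 * x + 1]
                                  + img[dy + 2 * y + 1][dx + 2 * x] + img[dy + 2 * y + 1][dx + 2 * x + 1]) / 4.0;
                        double diff = d - img[ry + y][rx + x];     // compare against the range block
                        err += diff * diff;
                    }
                }
                if (err < best) {
                    best = err;
                    bestPos = new int[] { dx, dy };
                }
            }
        }
        return bestPos;
    }
}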

Resources
- http://en.wikipedia.org/wiki/Image_compression
- http://en.wikipedia.org/wiki/Entropy_encoding
- http://en.wikipedia.org/wiki/Huffman_coding
- http://en.wikipedia.org/wiki/Arithmetic_coding
- http://en.wikipedia.org/wiki/Run-length_encoding
- http://en.wikipedia.org/wiki/DEFLATE
- http://en.wikipedia.org/wiki/Chroma_subsampling
- http://en.wikipedia.org/wiki/Transform_coding
- http://en.wikipedia.org/wiki/Fractal_compression
- http://www.arturocampos.com/ac_arithmetic.html
- http://michael.dipperstein.com/rle/index.html
- http://www.ietf.org/rfc/rfc1951.txt
- Introduction to Data Compression, Guy E. Blelloch, Computer Science Department, Carnegie Mellon University.
- A Rapid Entropy-Coding Algorithm, Wm. Douglas Withers, Department of Mathematics, United States Naval Academy, Annapolis, MD 21402, and Pegasus Imaging Corporation.
